Let's say I have an array of this format:
X Y Z
A 1 0
A 2 1
B 1 1
B 2 1
B 1 0
I want to find the frequency of each X value and the frequency of each Y value within X, added as new columns.
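For anyone following along, here is a minimal sketch that rebuilds the sample data. The object names `dat` and `df` are assumptions taken from the answers, and the column types are a guess (character X, numeric Y and Z).

```r
# Sample data from the question; names and column types are assumed
dat <- data.frame(
  X = c("A", "A", "B", "B", "B"),
  Y = c(1, 2, 1, 2, 1),
  Z = c(0, 1, 1, 1, 0),
  stringsAsFactors = FALSE
)
df <- dat  # some answers call the same data `df`

# Plain counts, for reference: A occurs 2 times, B occurs 3 times
table(dat$X)
# Counts for each (X, Y) combination
table(dat$X, dat$Y)
```

The answers differ mainly in how they attach these counts back to every row of the data.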
Here's a data.table way:
require(data.table)
DT <- data.table(dat)
DT[,nx:=.N,by=X][,nxy:=.N,by=list(X,Y)]
That last step created the two columns:
DT
# X Y Z nx nxy
# 1: A 1 0 2 1
# 2: A 2 1 2 1
# 3: B 1 1 3 2
# 4: B 2 1 3 1
# 5: B 1 0 3 2
And it could have been written in two lines instead of one:
DT[,nx:=.N,by=X]
DT[,nxy:=.N,by=list(X,Y)]
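To make the roles of `.N` and `:=` a bit more concrete, here is a small sketch that rebuilds the sample data (the name `dat` and the column types are assumed) and contrasts a grouped summary with the add-a-column form used above.

```r
library(data.table)

# Sample data from the question (name and column types assumed)
dat <- data.frame(X = c("A", "A", "B", "B", "B"),
                  Y = c(1, 2, 1, 2, 1),
                  Z = c(0, 1, 1, 1, 0))
DT <- as.data.table(dat)

# Without `:=`, `.N` gives one row per group instead of a new column
counts_x  <- DT[, .N, by = X]
counts_xy <- DT[, .N, by = .(X, Y)]

# With `:=`, the group size is repeated on every row of the group,
# and DT is modified in place (no reassignment needed)
DT[, nx := .N, by = X]
DT[, nxy := .N, by = .(X, Y)]
```

Here `.(X, Y)` is data.table's shorthand for `list(X, Y)`.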
# Assuming your data frame is called df:
df$Fx <- ave(as.numeric(as.factor(df$X)), df$X, FUN = length)
df2 <- as.data.frame(with(df, table(X, Y)), responseName = "Fyx")
df3 <- merge(df, df2)
# please see @thelatemail's clean `ave`-only calculation of 'Fyx'
df3
# X Y Z Fx Fyx
# 1 A 1 0 2 1
# 2 A 2 1 2 1
# 3 B 1 1 3 2
# 4 B 1 0 3 2
# 5 B 2 1 3 1
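Note that `merge` sorts on the key columns by default, which is why rows 4 and 5 come back in a different order than in the input. If the original row order matters, one possible alternative is to look the counts up in the tables directly. This is only a sketch, and it rebuilds the sample data under the assumed name `df`.

```r
# Sample data from the question (name and column types assumed)
df <- data.frame(X = c("A", "A", "B", "B", "B"),
                 Y = c(1, 2, 1, 2, 1),
                 Z = c(0, 1, 1, 1, 0))

tab_x  <- table(df$X)         # counts per X
tab_xy <- table(df$X, df$Y)   # counts per (X, Y) combination

# Index the tables by name so each row picks up its own count;
# a two-column character matrix indexes one cell per row
df$Fx  <- as.vector(tab_x[as.character(df$X)])
df$Fyx <- as.vector(tab_xy[cbind(as.character(df$X), as.character(df$Y))])
df
```

Because nothing is merged, the rows stay in their original order.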
# And a ddply alternative
library(plyr)
df2 <- ddply(.data = df, .variables = .(X), mutate,
             Fx = length(X))
ddply(.data = df2, .variables = .(X, Y), mutate,
      Fxy = length(Y))
Using ave, and assuming your data is dat:
dat$Fx <- with(dat,ave(Y,list(X),FUN=length))
dat$Fyx <- with(dat,ave(Y,list(X,Y),FUN=length))
Result:
X Y Z Fx Fyx
1 A 1 0 2 1
2 A 2 1 2 1
3 B 1 1 3 2
4 B 2 1 3 1
5 B 1 0 3 2
If the data doesn't have a numeric column for ave to work on, use the row index instead:
dat$Fx <- with(dat,ave(seq_len(nrow(dat)),list(X),FUN=length))
dat$Fyx <- with(dat,ave(seq_len(nrow(dat)),list(X,Y),FUN=length))
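As a rough illustration of why the row index helps, here is a sketch on made-up data with no numeric column at all. The values in `Y` are hypothetical and not from the question. `ave` writes the result of `FUN` back into a copy of its first argument, so that argument's type determines the type of the result.

```r
# Hypothetical character-only data (not from the question)
dat <- data.frame(X = c("A", "A", "B", "B", "B"),
                  Y = c("p", "q", "p", "q", "p"),
                  stringsAsFactors = FALSE)

# Passing a character column as the first argument gives the counts
# back as character strings, because they are written into a character vector
fx_chr <- ave(dat$X, dat$X, FUN = length)

# An integer row index keeps the counts as integers
idx <- seq_len(nrow(dat))
dat$Fx  <- ave(idx, dat$X, FUN = length)
dat$Fyx <- ave(idx, dat$X, dat$Y, FUN = length)
dat
```

The grouping variables can be passed either as separate arguments, as here, or wrapped in `list()` as in the lines above.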