I have the following data frame:
y <- data.frame(group = letters[1:5], a = rnorm(5) , b = rnorm(5), c = rnorm(5), d = rnorm(5) )
How to
You're almost there: you just need to use apply instead of sapply, and remove the unnecessary columns.
apply(y[-1], 1, function(x) cor(x[1:2], x[3:4]))
Of course, the correlation between two length-2 vectors isn't very informative.
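On why the group column has to go: apply converts a data frame to a matrix first, and a matrix can only hold one type. Here is a minimal sketch of that, using a small made-up data frame (the values are hypothetical, chosen only so the result is easy to check by hand) with the same layout as the one in the question:

```r
# Hypothetical data with the same layout as y in the question
y <- data.frame(group = letters[1:3],
                a = c(1, 2, 3), b = c(2, 1, 5),
                c = c(0.5, 4, 1), d = c(3, 6, 0))

# apply() calls as.matrix() on a data frame before looping over rows
typeof(as.matrix(y))      # "character": the group column forces coercion
typeof(as.matrix(y[-1]))  # "double": only numeric columns remain

# With the group column dropped, each row arrives as a numeric vector
res <- apply(y[-1], 1, function(x) cor(x[1:2], x[3:4]))
```

With the group column left in, every row would reach the function as a character vector, and cor would stop with an error about non-numeric input.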
You can use apply to apply a function to each row (or column) of a matrix, array or data.frame.
apply(
y[,-1], # Remove the first column, to ensure that u remains numeric
1, # Apply the function on each row
function(u) cor( u[1:2], u[3:4] )
)
(With just 2 observations, the correlation can only be +1 or -1.)
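To see why: two points always lie exactly on a line, so the correlation is just the sign of the product of the two differences. A quick sketch that checks this on random draws (the seed and the number of replications are arbitrary choices, not from the question):

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# For each draw of 4 values, compare cor() of the two pairs with
# the sign of (difference in first pair) * (difference in second pair)
matches <- replicate(1000, {
  u <- rnorm(4)
  r <- cor(u[1:2], u[3:4])
  s <- sign((u[2] - u[1]) * (u[4] - u[3]))
  isTRUE(all.equal(r, s))
})

all(matches)
```

So the per-row result only tells you whether a-to-b and c-to-d move in the same direction, nothing about the strength of the relationship.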
You could use apply:
> apply(y[,-1],1,function(x) cor(x[1:2],x[3:4]))
[1] -1 -1 1 -1 1
Or ddply from the plyr package (although this might be overkill, and if two rows have the same group, it will pool them and compute a single correlation between the combined a and b values and the combined c and d values of those rows):
> ddply(y,.(group),function(x) cor(c(x$a,x$b),c(x$c,x$d)))
group V1
1 a -1
2 b -1
3 c 1
4 d -1
5 e 1