I am using the dcast function in R to turn a long-format dataset into a wide-format dataset. I have an ID number, a categorical variable (CAT
I added some extra data lines to clarify some parts of this, but the gist is that you just need to put SEX on the left-hand side of the formula (i.e., to the left of ~):
PC2 <- read.table(text="ID CAT AMT SEX
1 A 46 Female
1 B 22 Female
1 C 31 Female
2 A 17 Male
2 B 25 Male
2 C 44 Male
3 A 47 Female
3 B 27 Female
3 C 37 Female
4 A 17 Male
4 A 17 Male
4 B 22 Male
4 B NA Male
4 C 44 Male", header=TRUE)
library(reshape2)
PC1cast2 <- dcast(PC2, ID+SEX~CAT, value.var='AMT', fun.aggregate=sum,
na.rm=TRUE)
PC1cast2
#   ID    SEX  A  B  C
# 1  1 Female 46 22 31
# 2  2   Male 17 25 44
# 3  3 Female 47 27 37
# 4  4   Male 34 22 44
In your example data, you have only one instance of each combination and no NAs, so fun.aggregate=sum, na.rm=TRUE doesn't change anything. When some combinations are duplicated (e.g., there are two 4 A rows and two 4 B rows here), the values are summed, but the NAs are dropped first. Make sure that is what you want.
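If you are not sure whether your real data contains duplicates or NAs, it may help to check before summing. The following is only a rough sketch on made-up data: the toy data frame and the names has_dups, counts, and sums are hypothetical and not from your question. It counts the rows behind each cell with fun.aggregate=length, then shows one side effect of na.rm=TRUE that is easy to miss.

```r
library(reshape2)

# Hypothetical toy data: ID 2 has a duplicated A row and only an NA for B.
toy <- data.frame(
  ID  = c(1, 1, 2, 2, 2),
  SEX = c("Female", "Female", "Male", "Male", "Male"),
  CAT = c("A", "B", "A", "A", "B"),
  AMT = c(46, 22, 17, 17, NA)
)

# TRUE if any ID/SEX/CAT combination occurs more than once.
has_dups <- any(duplicated(toy[, c("ID", "SEX", "CAT")]))

# Rows per cell: values above 1 mark cells that sum would collapse.
counts <- dcast(toy, ID + SEX ~ CAT, value.var = "AMT",
                fun.aggregate = length)

# Summing with na.rm = TRUE turns a cell holding only NA into 0, not NA.
sums <- dcast(toy, ID + SEX ~ CAT, value.var = "AMT",
              fun.aggregate = sum, na.rm = TRUE)
```

In this toy data, counts has a 2 for ID 2 in column A, and sums has 0 for ID 2 in column B, because the sum of an empty set of values is 0. If a 0 there would be misleading, you may want to handle the NAs yourself before casting.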