Retaining variables in dcast in R

旧时难觅i 2021-01-22 12:41

I am using the dcast function in R to turn a long-format dataset into a wide-format dataset. I have an ID number, a categorical variable (CAT

2 Answers
  •  旧时难觅i
    2021-01-22 13:21

    I added some extra rows to the data to clarify a few points. The gist is that you just need to put SEX on the left-hand side of the formula (i.e., to the left of the ~), alongside ID:

    PC2 <- read.table(text="ID CAT AMT SEX 
    1  A   46  Female 
    1  B   22  Female 
    1  C   31  Female 
    2  A   17  Male 
    2  B   25  Male 
    2  C   44  Male
    3  A   47  Female 
    3  B   27  Female 
    3  C   37  Female 
    4  A   17  Male 
    4  A   17  Male 
    4  B   22  Male 
    4  B   NA  Male 
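    4  C   44  Male", header=TRUE)

    library(reshape2)
    # ID and SEX both sit to the left of ~, so SEX is kept as an identifier column;
    # each level of CAT becomes its own column holding the summed AMT.
    PC1cast2 <- dcast(PC2, ID+SEX~CAT, value.var='AMT', fun.aggregate=sum, 
                      na.rm=TRUE)
    PC1cast2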
    #   ID    SEX  A  B  C
    # 1  1 Female 46 22 31
    # 2  2   Male 17 25 44
    # 3  3 Female 47 27 37
    # 4  4   Male 34 22 44
    

    In your example data, each ID/CAT combination appears exactly once and there are no NAs, so fun.aggregate=sum, na.rm=TRUE has no effect. When combinations are duplicated (here, ID 4 has two A rows and two B rows), the values are summed, with NAs dropped first: the two A values of 17 become 34, and the B values 22 and NA become 22. Make sure that is what you want.
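
    To see what na.rm=TRUE changes, here is a minimal sketch that casts only the ID 4 rows twice, once with and once without it. The data frame `dup` and the result names `with_na_rm` and `without_na_rm` are made up for illustration; the values are copied from the rows for ID 4 above.

    ```r
    library(reshape2)

    # Illustrative subset: ID 4 has a duplicated A row and a B row with a missing AMT.
    dup <- data.frame(
      ID  = c(4, 4, 4, 4, 4),
      CAT = c("A", "A", "B", "B", "C"),
      AMT = c(17, 17, 22, NA, 44),
      SEX = "Male"
    )

    # Duplicates are summed and NAs are dropped before summing.
    with_na_rm <- dcast(dup, ID + SEX ~ CAT, value.var = "AMT",
                        fun.aggregate = sum, na.rm = TRUE)

    # Same call without na.rm: sum() returns NA for any cell containing an NA.
    without_na_rm <- dcast(dup, ID + SEX ~ CAT, value.var = "AMT",
                           fun.aggregate = sum)

    with_na_rm      # B is 22: the NA was dropped
    without_na_rm   # B is NA: the missing value propagates
    ```

    If a missing AMT should flag the whole cell as unknown rather than be silently ignored, the second form is probably the safer choice.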
