Retaining variables in dcast in R

旧时难觅i 2021-01-22 12:41

I am using the dcast function in R to turn a long-format dataset into a wide-format dataset. I have an ID number, a categorical variable (CAT

2 Answers
  • 2021-01-22 13:21

    For that, you need to add SEX to the ID side of your formula:

    dcast(PC1, ID + SEX~CAT, value.var='AMT', fun.aggregate=sum, na.rm=TRUE)
    # results in:
    
      ID    SEX  A  B  C
    1  1 Female 46 22 31
    2  2   Male 17 25 44
    

    Variables on the left-hand side of the formula are kept as identifier columns; the variable on the right-hand side is cast into new columns, one per level.
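
    To make the left-hand/right-hand distinction concrete, here is a minimal sketch comparing the two formulas. PC1 is not shown in this answer, so the data frame below is a hypothetical stand-in rebuilt from the output above; fun.aggregate is omitted because each ID/CAT pair occurs only once in it.

    ```r
    library(reshape2)

    # Hypothetical stand-in for PC1, reconstructed from the output shown above
    PC1 <- data.frame(
      ID  = c(1, 1, 1, 2, 2, 2),
      CAT = c("A", "B", "C", "A", "B", "C"),
      AMT = c(46, 22, 31, 17, 25, 44),
      SEX = c("Female", "Female", "Female", "Male", "Male", "Male")
    )

    # SEX is absent from the formula, so it does not appear in the result
    without_sex <- dcast(PC1, ID ~ CAT, value.var = "AMT")

    # SEX is on the left-hand side, so it is retained as an identifier column
    with_sex <- dcast(PC1, ID + SEX ~ CAT, value.var = "AMT")

    names(without_sex)  # "ID" "A" "B" "C"
    names(with_sex)     # "ID" "SEX" "A" "B" "C"
    ```

    Any other variable that is constant within an ID can be kept the same way, by adding it to the left-hand side with +.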

  • 2021-01-22 13:21

    I added some extra rows of data to clarify a few points, but the gist is that you just need to put SEX on the left-hand side of the ~:

    PC2 <- read.table(text="ID CAT AMT SEX 
    1  A   46  Female 
    1  B   22  Female 
    1  C   31  Female 
    2  A   17  Male 
    2  B   25  Male 
    2  C   44  Male
    3  A   47  Female 
    3  B   27  Female 
    3  C   37  Female 
    4  A   17  Male 
    4  A   17  Male 
    4  B   22  Male 
    4  B   NA  Male 
    4  C   44  Male", header=TRUE)
    
    library(reshape2)
    PC1cast2 <- dcast(PC2, ID+SEX~CAT, value.var='AMT', fun.aggregate=sum, 
                      na.rm=TRUE)
    PC1cast2
    #   ID    SEX  A  B  C
    # 1  1 Female 46 22 31
    # 2  2   Male 17 25 44
    # 3  3 Female 47 27 37
    # 4  4   Male 34 22 44
    

    In your example data, each ID/CAT combination occurs only once and there are no NAs, so fun.aggregate=sum, na.rm=TRUE has no effect. When combinations are duplicated (e.g., ID 4 has two A rows and two B rows), the values are summed, with the NAs dropped first. Make sure that is what you want.
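
    One way to check whether summing is appropriate is to count the rows behind each cell before aggregating. This is only a sketch: the data frame dup is a hypothetical subset containing just the ID 4 rows from PC2, and using fun.aggregate = length as a diagnostic is a suggestion, not part of the original answer.

    ```r
    library(reshape2)

    # Hypothetical subset: only the ID 4 rows, which include duplicates and an NA
    dup <- data.frame(
      ID  = c(4, 4, 4, 4, 4),
      CAT = c("A", "A", "B", "B", "C"),
      AMT = c(17, 17, 22, NA, 44),
      SEX = "Male"
    )

    # Count rows per ID/SEX/CAT cell; NA values are counted as rows too
    counts <- dcast(dup, ID + SEX ~ CAT, value.var = "AMT",
                    fun.aggregate = length)
    counts

    # A cell whose values are all NA sums to 0 once the NAs are dropped
    all_na_sum <- sum(NA, na.rm = TRUE)
    all_na_sum
    ```

    Any count above 1 marks a cell where several rows would be collapsed by sum. Note also that a cell containing only NAs comes out as 0 rather than NA.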
