Name columns within aggregate in R

逝去的感伤 2020-12-13 06:02

I know I can *re*name columns after I aggregate the data:

blubb <- aggregate(dat$two ~ dat$one, ...)
colnames(blubb) <- c("One", "Two")
         


        
4 Answers
  •  时光说笑
    2020-12-13 06:17

    If you prefer writing aggregates as formulas, the documentation shows the use of cbind. cbind lets you name its arguments, and aggregate uses those names as column names.

    blubb <- aggregate(cbind(Two = dat$two) ~ cbind(One = dat$one), ...)
    
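A minimal runnable sketch of the naming idea, assuming a toy data frame (the values are made up; `dat`, `one`, and `two` mirror the names in the question). It names only the aggregated column through cbind, and shows the non-formula interface with named lists as a swapped-in alternative for naming the grouping column:

```r
# Toy data (an assumption); `dat`, `one`, and `two` mirror the question's names.
dat <- data.frame(one = c("a", "a", "b", "b"), two = c(10, 20, 30, 40))

# Formula interface: the name given inside cbind() becomes the result column name.
by_formula <- aggregate(cbind(Two = two) ~ one, data = dat, FUN = sum)

# Non-formula interface (swapped in here): names come from the named lists.
by_list <- aggregate(x = list(Two = dat$two), by = list(One = dat$one), FUN = sum)

print(names(by_formula))
print(names(by_list))
```

Checking `names()` on the result is a quick way to confirm which names were actually stored.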

    Aggregating more than one column by more than one grouping factor can be done like this:

    blubb <- aggregate(cbind(x = varX, y = varY, varZ) ~ cbind(a = facA) + cbind(b = facB) + facC, data=dat, FUN=sum)
    
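The multi-column case can be tried on a small made-up data frame. The column names `facA`, `facB`, `varX`, `varY`, and `varZ` follow the line above, while the values are assumptions for illustration. The grouping variables are left as plain terms here to keep the sketch simple:

```r
# Toy data (an assumption) using the variable names from the example above.
dat <- data.frame(
  facA = c("a1", "a1", "a2", "a2"),
  facB = c("b1", "b1", "b1", "b2"),
  varX = c(1, 2, 3, 4),
  varY = c(10, 20, 30, 40),
  varZ = c(100, 200, 300, 400)
)

# Named cbind() arguments are renamed; the unnamed varZ keeps its own name.
res <- aggregate(cbind(x = varX, y = varY, varZ) ~ facA + facB,
                 data = dat, FUN = sum)

print(res)
```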

    and if you want to use more than one function:

    aggregate(cbind(cases=ncases, ncontrols) ~ cbind(alc=alcgp) + tobgp, data = esoph, FUN = function(x) c("mean" = mean(x), "median" = median(x)))
    
    #   alc    tobgp cases.mean cases.median ncontrols.mean ncontrols.median
    #1    1 0-9g/day  1.5000000    1.0000000      43.500000        47.000000
    #2    2 0-9g/day  5.6666667    4.0000000      29.833333        34.500000
    #...
    

    This appends the name of each aggregate function to the column name.
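One detail worth checking: when FUN returns a vector, aggregate stores each result as a matrix column, so the suffixed names appear when printing but are not separate columns. A sketch, assuming you want ordinary columns, that flattens the result with `do.call(data.frame, ...)` on the built-in `esoph` data (grouping by `tobgp` only, to keep it short):

```r
# Two summary functions per column produce one matrix column per variable.
res <- aggregate(cbind(cases = ncases, ncontrols) ~ tobgp, data = esoph,
                 FUN = function(x) c(mean = mean(x), median = median(x)))

print(ncol(res))            # tobgp plus two matrix columns
print(is.matrix(res$cases)) # the "cases" column holds both mean and median

# Flatten the matrix columns into ordinary columns with suffixed names.
flat <- do.call(data.frame, res)
print(names(flat))
```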

    However, cbind replaces factors with their internal integer codes. To avoid this, you can use:

    with(esoph, aggregate(data.frame(cases=ncases, ncontrols), data.frame(alc=alcgp, tobgp), FUN = function(x) c("mean" = mean(x), "median" = median(x))))
    
    #         alc    tobgp cases.mean cases.median ncontrols.mean ncontrols.median
    #1  0-39g/day 0-9g/day  1.5000000    1.0000000      43.500000        47.000000
    #2      40-79 0-9g/day  5.6666667    4.0000000      29.833333        34.500000
    #...
    
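The difference between the two approaches can be seen on a tiny made-up factor (`dat`, `grp`, and `val` are hypothetical names for illustration). cbind drops the factor attributes and keeps only the integer codes, while the data.frame form keeps the factor and its labels:

```r
# Toy data (an assumption): a factor grouping column and a numeric value.
dat <- data.frame(grp = factor(c("low", "low", "high", "high")),
                  val = c(1, 2, 3, 4))

# cbind() turns the factor into a matrix of its internal integer codes.
codes <- cbind(g = dat$grp)
print(is.factor(codes))

# Passing data frames to aggregate() keeps the factor and names the column.
res <- with(dat, aggregate(data.frame(val), data.frame(g = grp), FUN = sum))
print(is.factor(res$g))
print(res)
```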
