Name columns within aggregate in R

逝去的感伤 2020-12-13 06:02

I know I can *re*name columns after I aggregate the data:

blubb <- aggregate(dat$two ~ dat$one, ...)
colnames(blubb) <- c("One", "Two")

4 Answers

时光说笑 (original poster)

2020-12-13 06:17

If you prefer writing aggregates as a formula, the documentation shows the usage of cbind. cbind lets you name its arguments, and aggregate uses those names for the resulting columns.

blubb <- aggregate(cbind(Two = dat$two) ~ cbind(One = dat$one), ...)

Aggregating more than one column by more than one grouping factor can be done like this:

blubb <- aggregate(cbind(x = varX, y = varY, varZ) ~ cbind(a = facA) + cbind(b = facB) + facC, data=dat, FUN=sum)

and if you want to use more than one function:

aggregate(cbind(cases = ncases, ncontrols) ~ cbind(alc = alcgp) + tobgp,
          data = esoph,
          FUN = function(x) c("mean" = mean(x), "median" = median(x)))

#   alc    tobgp cases.mean cases.median ncontrols.mean ncontrols.median
#1    1 0-9g/day  1.5000000    1.0000000      43.500000        47.000000
#2    2 0-9g/day  5.6666667    4.0000000      29.833333        34.500000
#...

This appends the name of each aggregate function to the column name in the printed output.
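The combined names are only how the result prints: each aggregated variable is stored as a single matrix column whose inner column names are the function names. Below is a minimal sketch, on a made-up toy data frame (`dat`, `res`, and `flat` are illustrative names, not part of the answer), of how such a result could be flattened into ordinary columns with `do.call(data.frame, ...)`. It uses the list interface of aggregate rather than the formula interface.

```r
# Toy data, made up for illustration only
dat <- data.frame(
  one = factor(c("a", "a", "b", "b")),
  two = c(1, 2, 3, 4)
)

# FUN returns a named vector, so 'Two' becomes one matrix column
# with inner column names "mean" and "median"
res <- aggregate(
  x   = list(Two = dat$two),
  by  = list(One = dat$one),
  FUN = function(x) c(mean = mean(x), median = median(x))
)

# Flatten the matrix column into separate, individually named columns
flat <- do.call(data.frame, res)
names(flat)
```

After flattening, columns such as `flat$Two.mean` can be accessed directly instead of through `res$Two[, "mean"]`.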

However, cbind replaces factors with their internal codes. To avoid this, you can use:

with(esoph, aggregate(data.frame(cases = ncases, ncontrols),
                      data.frame(alc = alcgp, tobgp),
                      FUN = function(x) c("mean" = mean(x), "median" = median(x))))

#         alc    tobgp cases.mean cases.median ncontrols.mean ncontrols.median
#1  0-39g/day 0-9g/day  1.5000000    1.0000000      43.500000        47.000000
#2      40-79 0-9g/day  5.6666667    4.0000000      29.833333        34.500000
#...
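The difference between the two approaches can be seen on a small scale. The following sketch uses a made-up toy data frame (the names `dat`, `codes`, and `blubb` and the values are assumptions for illustration) to show that cbind keeps only the integer codes of a factor, while named lists passed to the `x` and `by` arguments of aggregate keep the factor labels and set the column names.

```r
# Toy data, made up for illustration only
dat <- data.frame(
  one = factor(c("low", "low", "high", "high")),
  two = c(1, 2, 3, 4)
)

# cbind() drops the factor class and keeps only the internal codes
# (levels are sorted alphabetically: "high" = 1, "low" = 2)
codes <- cbind(One = dat$one)

# Named lists for x and by keep the labels and name the result columns
blubb <- aggregate(
  x   = list(Two = dat$two),
  by  = list(One = dat$one),
  FUN = sum
)
blubb
```

Here `blubb` has the columns `One` and `Two`, and `One` still holds the labels `high` and `low` rather than numeric codes.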
