I want to use aggregate function on a date frame but sum one column and take average of another column.
Here is an example data frame
Manager Categ
There are several ways to do this. Here are some that I like (all assuming we're starting with a data.frame
named "mydf"):
Using ave
and unique
unique(within(mydf, {
Amount <- ave(Amount, Manager, FUN = sum)
SqFt <- ave(SqFt, Manager, FUN = mean)
rm(Category)
}))
# Manager Amount SqFt
# 1 Joe 200 500
# 2 Alice 325 700
Using data.table
:
library(data.table)
DT <- data.table(mydf)
DT[, list(Amount = sum(Amount), SqFt = mean(SqFt)), by = "Manager"]
# Manager Amount SqFt
# 1: Joe 200 500
# 2: Alice 325 700
Using "sqldf":
library(sqldf)
sqldf("select Manager, sum(Amount) `Amount`,
avg(SqFt) `SqFt` from mydf group by Manager")
Using aggregate
and merge
:
merge(aggregate(Amount ~ Manager, mydf, sum),
aggregate(SqFt ~ Manager, mydf, mean))
You can use the summarise
function with ddply
in the plyr
package:
library(plyr)
ddply(mtcars,.(cyl),summarise,TotalWt=sum(wt),AveHP=mean(hp))
cyl TotalWt AveHP
1 4 25.143 82.63636
2 6 21.820 122.28571
3 8 55.989 209.21429