I want to calculate mean (or any other summary statistics of length one, e.g. min, max, length, sum) of a nu
Have a look at the ave function. Something like:
df$grp.mean.values <- ave(df$value, df$group)
If you want to use ave to calculate something else per group, you need to specify FUN = your-desired-function, e.g. FUN = min:
df$grp.min <- ave(df$value, df$group, FUN = min)
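As a self-contained illustration of the ave approach, here is a minimal sketch. The sample data frame is an assumption reconstructed from the output printed elsewhere in this thread, and grp.n is a hypothetical column name used only for illustration. Because the signature is ave(x, ..., FUN = mean), the function has to be passed by name; an unnamed function would be treated as one more grouping variable.

```r
# Sample data (assumed; reconstructed from the printed outputs in this thread)
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# ave() returns a vector as long as its input, so the result can be
# assigned straight into a new column.
df$grp.mean.values <- ave(df$value, df$group)               # FUN defaults to mean
df$grp.min         <- ave(df$value, df$group, FUN = min)    # FUN must be named
df$grp.n           <- ave(df$value, df$group, FUN = length) # group size (hypothetical column)

print(df)
```

Every row keeps its original position; each new column simply repeats the per-group statistic for all rows of that group.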
One option is to use plyr. ddply expects a data.frame (the first d) and returns a data.frame (the second d). Other XXply functions work in a similar way; e.g. ldply expects a list and returns a data.frame, dlply does the opposite, and so on. The second argument is the grouping variable(s). The third argument is the function we want to apply to each group.
library(plyr)
ddply(dat, "group", transform, grp.mean.values = mean(value))
  id group value grp.mean.values
1  1     a    10              15
2  2     a    20              15
3  3     b   100             150
4  4     b   200             150
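To make the role of the third argument more concrete, here is a small sketch that contrasts transform with summarise inside ddply. The data frame dat is assumed to look like the printed output above, and the names per_row and per_group are hypothetical. With transform every original row is kept and the group statistic is repeated; with summarise the result collapses to one row per group.

```r
library(plyr)

# Sample data (assumed; matches the printed output above)
dat <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# transform: keeps all rows and adds the per-group mean as a new column
per_row <- ddply(dat, "group", transform, grp.mean.values = mean(value))

# summarise: returns one row per group instead
per_group <- ddply(dat, "group", summarise, grp.mean.values = mean(value))

print(per_row)
print(per_group)
```

If dplyr is loaded in the same session, its summarise masks the plyr one, so writing plyr::summarise explicitly may be the safer choice.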
Here is another option using the base functions aggregate and merge:
merge(x, aggregate(value ~ group, data = x, mean),
      by = "group")
  group id value.x value.y
1     a  1      10      15
2     a  2      20      15
3     b  3     100     150
4     b  4     200     150
You can get "better" column names with the suffixes argument:
merge(x, aggregate(value ~ group, data = x, mean),
by = "group", suffixes = c("", ".mean"))
  group id value value.mean
1     a  1    10         15
2     a  2    20         15
3     b  3   100        150
4     b  4   200        150
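The aggregate and merge combination can also be read as separate steps. The sketch below assumes x has the columns shown in the output above; grp_means and result are hypothetical names. Renaming the aggregated column before merging is an alternative to suffixes, since the two data frames then share only the key column.

```r
# Sample data (assumed; matches the printed output above)
x <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# Step 1: one row per group holding the group mean
grp_means <- aggregate(value ~ group, data = x, FUN = mean)

# Step 2: rename the aggregated column so it does not clash with x$value
names(grp_means)[names(grp_means) == "value"] <- "grp.mean.values"

# Step 3: join the per-group values back onto every row of x
result <- merge(x, grp_means, by = "group")

print(result)
```

Note that merge sorts the result by the key column by default, so the row order can differ from the original data when the groups are not already sorted.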
You may do this in dplyr using mutate:
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(grp.mean.values = mean(value))
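A slightly fuller sketch of the dplyr approach, under the assumption that df has the id, group and value columns used throughout this thread; grp.max, grp.n and result are hypothetical names. It computes several per-group statistics in one mutate call and then drops the grouping with ungroup, because the result of a grouped mutate is still grouped and later verbs would otherwise keep operating per group.

```r
library(dplyr)

# Sample data (assumed; same shape as the examples in this thread)
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

result <- df %>%
  group_by(group) %>%
  mutate(
    grp.mean.values = mean(value),  # repeated for every row of the group
    grp.max         = max(value),
    grp.n           = n()           # number of rows in the group
  ) %>%
  ungroup()                         # drop the grouping for later steps

print(result)
```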
...or use data.table to assign the new column by reference (:=):
library(data.table)
setDT(df)[ , grp.mean.values := mean(value), by = group]
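For completeness, a minimal data.table sketch that adds more than one per-group column at once. The table dt and the column grp.min are assumptions made for illustration. Keep in mind that setDT converts the existing data frame in place, and := modifies the table by reference, so no reassignment is needed.

```r
library(data.table)

# Sample data (assumed; same shape as the examples in this thread)
dt <- data.table(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# The functional form of := adds several columns in one call;
# dt is updated by reference and all four rows are kept.
dt[, `:=`(grp.mean.values = mean(value),
          grp.min         = min(value)),
   by = group]

print(dt)
```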