How can I calculate, in R, the overall variance and the variance for each group from a dataset that looks like this (for example):
Group Count Value
A 3
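One option is using data.table. Convert the data.frame to a data.table (setDT), then get the var of "Value" and the sum of "Count" by "Group". Because each row carries a count, rep(Value, Count) repeats each value that many times, so var is computed over the individual observations rather than over the distinct values.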
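library(data.table)
# setDT() converts df1 to a data.table by reference (no copy is made)
# rep(Value, Count) expands each Value by its Count before taking the variance
setDT(df1)[, list(GroupVariance = var(rep(Value, Count)),
                  TotalCount = sum(Count)), by = Group]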
# Group GroupVariance TotalCount
#1: A 2.7 5
#2: B 4.0 4
A similar approach using dplyr is:
library(dplyr)
df1 %>%
  group_by(Group) %>%
  summarise(GroupVariance = var(rep(Value, Count)), TotalCount = sum(Count))
# Group GroupVariance TotalCount
#1 A 2.7 5
#2 B 4.0 4
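For the overall variance asked about in the question, the same rep() idea can be applied without grouping. The following is a minimal sketch in base R; the df1 built here is hypothetical (made-up values in the Group/Count/Value layout, not the data from the question), and the names overall_var, w_mean and overall_var_weighted are only for illustration. The second calculation is an equivalent weighted formula that avoids expanding the data, which may help when the counts are large.

```r
# Hypothetical data in the same shape as the question (values are made up)
df1 <- data.frame(
  Group = c("A", "A", "B", "B"),
  Count = c(3, 2, 1, 3),
  Value = c(5, 8, 11, 15)
)

# Overall sample variance: expand every Value by its Count, ignoring Group
overall_var <- with(df1, var(rep(Value, Count)))

# Same quantity without expanding the data:
# count-weighted mean, then count-weighted squared deviations over (n - 1)
n <- sum(df1$Count)
w_mean <- sum(df1$Count * df1$Value) / n
overall_var_weighted <- sum(df1$Count * (df1$Value - w_mean)^2) / (n - 1)

# The two approaches should agree up to floating-point error
all.equal(overall_var, overall_var_weighted)
```

Both versions use the n - 1 denominator, matching what var() does by default.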