I have a dataset whose headers look like so:
PID Time Site Rep Count
I want sum the Count
by Rep
for each P
You should look at the package data.table
for faster aggregation operations on large data frames. For your problem, the solution would look like:
library(data.table)
data_t = data.table(data_tab)
ans = data_t[,list(A = sum(count), B = mean(count)), by = 'PID,Time,Site']
Let's see how fast data.table
is and compare to using dplyr
. Thishis would be roughly the way to do it in dplyr
.
data %>% group_by(PID, Time, Site, Rep) %>%
summarise(totalCount = sum(Count)) %>%
group_by(PID, Time, Site) %>%
summarise(mean(totalCount))
Or perhaps this, depending on exactly how the question is interpreted:
data %>% group_by(PID, Time, Site) %>%
summarise(totalCount = sum(Count), meanCount = mean(Count)
Here is a full example of these alternatives versus @Ramnath proposed answer and the one @David Arenburg proposed in the comments , which I think is equivalent to the second dplyr
statement.
nrow <- 510000
data <- data.frame(PID = sample(letters, nrow, replace = TRUE),
Time = sample(letters, nrow, replace = TRUE),
Site = sample(letters, nrow, replace = TRUE),
Rep = rnorm(nrow),
Count = rpois(nrow, 100))
library(dplyr)
library(data.table)
Rprof(tf1 <- tempfile())
ans <- data %>% group_by(PID, Time, Site, Rep) %>%
summarise(totalCount = sum(Count)) %>%
group_by(PID, Time, Site) %>%
summarise(mean(totalCount))
Rprof()
summaryRprof(tf1) #reports 1.68 sec sampling time
Rprof(tf2 <- tempfile())
ans <- data %>% group_by(PID, Time, Site, Rep) %>%
summarise(total = sum(Count), meanCount = mean(Count))
Rprof()
summaryRprof(tf2) # reports 1.60 seconds
Rprof(tf3 <- tempfile())
data_t = data.table(data)
ans = data_t[,list(A = sum(Count), B = mean(Count)), by = 'PID,Time,Site']
Rprof()
summaryRprof(tf3) #reports 0.06 seconds
Rprof(tf4 <- tempfile())
ans <- setDT(data)[,.(A = sum(Count), B = mean(Count)), by = 'PID,Time,Site']
Rprof()
summaryRprof(tf4) #reports 0.02 seconds
The data table method is much faster, and the setDT
is even faster!