Question
This is a follow-up question to this one. In the original question, the OP wanted to bootstrap two fixed columns, x1 and x2:
library(data.table)
library(boot)
set.seed(1000)
data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200)>0.5))
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2))]}
data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
However, I think this problem can be nicely extended to handle any number of columns by treating them as groups. For instance, let's use the iris dataset. Say I want to calculate the bootstrap mean of all four dimensions for each species. I can use melt to reshape the data into long format and then group by the Species, variable combination to get the means in one go. I think this approach will scale well.
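data(iris)
iris = data.table(iris)
# per-species mean of a single column, for reference
iris[, mean(Sepal.Length), by = Species]
# row identifier (.I is the row number; .N would give every row the same value, 150)
iris[, ID := .I]
# reshape to long format: one row per (ID, Species, variable)
iris_deep = melt(iris,
                 id.vars = c("ID", "Species"),
                 measure.vars = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))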
#define a mean bootstrap function
stat <- function(x, i) {x[i, m=mean(value),]}
iris_deep[, list(list(boot(.SD, stat, R = 100))), by = list(Species,variable)]$V1
The code above is my attempt at doing this. However, the bootstrapping part does not work; R throws the following error:
Error in mean(value) : object 'value' not found
Can someone please take a crack at this?
Answer 1:
I tried this (with added parentheses enclosing m=mean(value)) and it appears to work:
stat <- function(x, i) {x[i, (m=mean(value))]}
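A possible explanation, offered as a reading of R's call syntax rather than a verified account of data.table internals: inside `x[i, m=mean(value),]`, the `m=` makes R treat `mean(value)` as a named argument to `[` instead of the positional `j` expression, so it is no longer evaluated against the columns of `x`. Wrapping it in parentheses turns it back into an ordinary expression passed as `j`. The sketch below puts the pieces together on the melted iris data; the names `iris_dt`, `res` and `summary_dt` are introduced here for illustration only, and the standard error is taken simply as the standard deviation of the bootstrap replicates.

```r
library(data.table)
library(boot)

set.seed(1000)

# Long format: one row per (ID, Species, variable)
iris_dt <- as.data.table(iris)
iris_dt[, ID := .I]
iris_deep <- melt(iris_dt,
                  id.vars = c("ID", "Species"),
                  measure.vars = c("Sepal.Length", "Sepal.Width",
                                   "Petal.Length", "Petal.Width"))

# Statistic for boot(): mean of the resampled rows.
# The parentheses keep the expression positional, so it is used as j.
stat <- function(x, i) x[i, (m = mean(value))]

# One boot object per Species/variable combination
res <- iris_deep[, list(list(boot(.SD, stat, R = 100))),
                 by = list(Species, variable)]

# Pull the observed mean and a bootstrap standard error out of each object
summary_dt <- res[, list(Species, variable,
                         t0 = sapply(V1, function(b) b$t0),
                         se = sapply(V1, function(b) sd(b$t)))]
print(summary_dt)
```

Since the name `m` is not used afterwards, `stat <- function(x, i) x[i, mean(value)]` should behave the same way.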
Source: https://stackoverflow.com/questions/38989932/bootstrapping-multiple-columns-in-data-table-in-a-scalable-fashion-r