Question
I have a dataset with 70 cases (participants in a study). Is there a function that can calculate the mean of these 70 cases such that each individual case is excluded from its own mean? This would look like:
"mean for case x = (value(1) + ... + value(n) - value(x)) / (n - 1)"
Any information will help.
Answer 1:
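You could do just what you've suggested: subtract each case from the total, then divide by the number of remaining cases (n - 1):
x <- 1:10
# total minus each value, divided by the count of the other values
(sum(x) - x) / (length(x) - 1)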
#[1] 6.000000 5.888889 5.777778 5.666667 5.555556 5.444444 5.333333 5.222222 5.111111 5.000000
# Check the first and last entries: leaving out 1 gives mean(2:10), leaving out 10 gives mean(1:9)
mean(2:10)
#[1] 6
mean(1:9)
#[1] 5
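The one-liner above can be wrapped in a small helper if missing values or single-case inputs are a concern. This is only a sketch: the name loo_mean and the choice to return NA for missing cases (and for inputs with fewer than two non-missing values) are assumptions, not part of the original answer.

```r
# Hypothetical helper: leave-one-out mean of a numeric vector.
# NA entries are ignored in the total and get NA as their own result.
loo_mean <- function(x) {
  ok <- !is.na(x)
  n <- sum(ok)                      # number of non-missing cases
  total <- sum(x[ok])               # sum over non-missing cases
  out <- rep(NA_real_, length(x))
  if (n > 1) {                      # with 0 or 1 cases there is nothing left to average
    out[ok] <- (total - x[ok]) / (n - 1)
  }
  out
}

loo_mean(c(1, 2, 3, 4))   # each entry is the mean of the other three values
loo_mean(c(1, NA, 3))     # the NA case stays NA; the others average the rest
```

Returning NA rather than NaN for a lone case is a design choice here; the plain formula would give 0 / 0.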
EDIT: Updated to address the follow-up question in the comments (a leave-one-out mean within each group):
set.seed(123)
df <- data.frame(group = rep(letters[1:3], each = 3),
                 value = rnorm(9), stringsAsFactors = FALSE)
df
#group value
#1 a -0.56047565
#2 a -0.23017749
#3 a 1.55870831
#4 b 0.07050839
#5 b 0.12928774
#6 b 1.71506499
#7 c 0.46091621
#8 c -1.26506123
#9 c -0.68685285
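# Note: tapply() returns results in group order, so this assignment
# assumes the rows of df are already sorted by group (as they are here).
df$loo_mean <- unlist(tapply(df$value, df$group,
                             function(x) (sum(x) - x) / (length(x) - 1)))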
df
#group value loo_mean
#1 a -0.56047565 0.66426541
#2 a -0.23017749 0.49911633
#3 a 1.55870831 -0.39532657
#4 b 0.07050839 0.92217636
#5 b 0.12928774 0.89278669
#6 b 1.71506499 0.09989806
#7 c 0.46091621 -0.97595704
#8 c -1.26506123 -0.11296832
#9 c -0.68685285 -0.40207251
# Check row 1 (group a, leaving out row 1) and row 8 (group c, leaving out row 8):
mean(df$value[2:3])
#[1] 0.6642654
mean(df$value[c(7,9)])
#[1] -0.1129683
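If the rows are not sorted by group, a variant using ave() from base R may be safer, since it returns results in the original row order. The sketch below is an alternative to the tapply() call, not the answerer's code; the shuffled data frame is a made-up illustration.

```r
set.seed(123)
df <- data.frame(group = rep(letters[1:3], each = 3),
                 value = rnorm(9), stringsAsFactors = FALSE)

# Shuffle the rows so that groups are no longer contiguous (illustration only)
shuffled <- df[sample(nrow(df)), ]

# ave() applies FUN within each group and keeps the original row order
shuffled$loo_mean <- ave(shuffled$value, shuffled$group,
                         FUN = function(x) (sum(x) - x) / (length(x) - 1))

# Brute-force check: for each row, average the other rows of the same group
check <- sapply(seq_len(nrow(shuffled)), function(i) {
  same <- shuffled$group == shuffled$group[i]
  same[i] <- FALSE                  # drop the row itself
  mean(shuffled$value[same])
})
all.equal(shuffled$loo_mean, check)
```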
Answer 2:
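Here's a more literal approach that drops each case in turn and averages the rest. It loops over the cases rather than being vectorised, but for 70 cases that makes no practical difference: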
x <- runif(70)
sapply(seq_along(x), function(i) mean(x[-i]))
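As a quick sanity check, the closed-form expression from the first answer and the per-case loop should agree up to floating-point tolerance. A minimal sketch; the seed and the names fast and slow are arbitrary choices for illustration:

```r
set.seed(1)                         # arbitrary seed, only for reproducibility
x <- runif(70)

fast <- (sum(x) - x) / (length(x) - 1)                  # closed form
slow <- sapply(seq_along(x), function(i) mean(x[-i]))   # drop one case at a time

all.equal(fast, slow)               # compares with a small numeric tolerance
```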
Source: https://stackoverflow.com/questions/22901826/calculating-a-group-mean-while-excluding-each-cases-individual-value