statistics-bootstrap

Bootstrapping confidence intervals in R: BCa method and prescribed resamples

只谈情不闲聊, submitted on 2019-12-24 16:22:18
Question: I would like to estimate confidence intervals in R using the BCa method (correcting for bias and an asymmetric bootstrap distribution). However, my resamples are not "standard resamples" but something more complicated, so I would like to generate them separately and subsequently apply a BCa algorithm. As far as I can see, there is the function "BootBCa" as well as the "boot" package in R. However, in both cases the resamples are generated automatically. Is there any way in R to first prescribe

R - dplyr bootstrap issue

余生长醉, submitted on 2019-12-23 17:50:17
Question: I have an issue understanding how to use the dplyr bootstrap function properly. What I want is to generate a bootstrap distribution from two randomly assigned groups and compute the difference in means, like this for example: library(dplyr) library(broom) data(mtcars) mtcars %>% mutate(treat = sample(c(0, 1), 32, replace = T)) %>% group_by(treat) %>% summarise(m = mean(disp)) %>% summarise(m = m[treat == 1] - m[treat == 0]) The issue is that I need to repeat this operation 100, 1000, or

How to make a sample from the empirical distribution function

懵懂的女人, submitted on 2019-12-23 15:28:52
Question: I'm trying to implement nonparametric bootstrapping in Python. It requires taking a sample, building an empirical distribution function from it, and then generating a bunch of samples from this edf. How can I do it? In scipy I found only how to make your own distribution function if you know the exact formula describing it, but I have only an edf. Answer 1: You get the edf by sorting the samples: N = samples.size ss = np.sort(samples) # these are the x-values of the edf # the y-values are 1/(2N)
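A minimal sketch of the idea in the question above, not the truncated answer's own code: because the edf puts probability 1/N on each observed value, drawing from it amounts to resampling the original data with replacement. The function names `edf` and `sample_from_edf` are hypothetical, and the sketch uses only the Python standard library so that it runs as-is. With NumPy, `rng.choice(samples, size=..., replace=True)` on a `numpy.random.Generator` would give the same kind of resample.

```python
import random
from bisect import bisect_right


def edf(samples):
    """Return F, where F(x) is the fraction of samples that are <= x."""
    ss = sorted(samples)
    n = len(ss)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ss, x) / n

    return F


def sample_from_edf(samples, size, rng):
    """Draw `size` values from the edf by inverse-transform sampling.

    A uniform u in [0, 1) is mapped to one of the N order statistics,
    each picked with probability 1/N. This is equivalent to resampling
    the data with replacement.
    """
    ss = sorted(samples)
    n = len(ss)
    draws = []
    for _ in range(size):
        u = rng.random()
        k = min(int(u * n), n - 1)  # guard against floating-point rounding up to n
        draws.append(ss[k])
    return draws


if __name__ == "__main__":
    rng = random.Random(0)           # seeded only to make the run repeatable
    data = [2.0, 5.0, 3.0, 9.0]      # illustrative data, not from the question
    resample = sample_from_edf(data, 10, rng)
    print(resample)
    print(edf(data)(3.0))
```

Every drawn value is one of the original observations, which is what distinguishes the nonparametric bootstrap from sampling a fitted parametric distribution.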

How to extract coefficients from elrm summary output

你离开我真会死。, submitted on 2019-12-23 12:04:21
Question: I ran exact logistic regression on my data set using the package elrm, and I am comparing it to ordinary logistic regression. I was able to run a bootstrap on the ordinary logistic regression; the statistics of interest I pulled were the estimated coefficient and p-value. However, I cannot run my elrm bootstrap because I can't pull the coefficients I need from the output. With my data the summary gives a printout: Results: estimate p-value p-value_se mc_size M 0.15116 0.06594 0.00443 49000 95%

Bootstrap p value significantly different from t-test p value

醉酒当歌, submitted on 2019-12-23 04:55:26
Question: I am trying to figure out whether there is a significant difference between two sample sets by calculating the p-value through bootstrapping and the t-test. However, I get p = 0.49 when I do bootstrapping and 7.015e-11 when I use the t-test. I'm quite confused as to why there is such a large difference between the two p-values. Below is the code for my bootstrap: diff <- function(data, k) { s = data[, 1:25] n = data[, 26:100] mean <- tapply(s[, k], n[, k], mean) mean[1] - mean[2] } b = boot(d,

Speeding up time series simulation (for bootstrap)

妖精的绣舞, submitted on 2019-12-22 10:28:49
Question: I need to run a bootstrap on a time series with non-standard dependence. To do this I need to create a function that simulates the time series by making time-by-time adjustments. testing<-function(){ sampleData<-as.zoo(data.frame(index=1:1000,vol=(rnorm(1000))^2,x=NA)) sampleData[,"x"]<-sampleData[,"vol"]+rnorm(1000) #treat this as completely exogenous and unknown in connection to vol sampleData<-cbind(sampleData,mean=rollmean(sampleData[,"vol"],k=3,align="right")) sampleData<-cbind

Bootstrapped p-value for a correlation coefficient on R

陌路散爱, submitted on 2019-12-21 22:39:25
Question: In R, I used the bootstrap method to get a correlation coefficient estimate and its confidence intervals. To get the p-value, I thought I could calculate the proportion of the confidence intervals which do not contain zero. But this is not the solution. How can I get the p-value in this case? I am using cor.test to get the coefficient estimate. cor.test can also give me the p-value from every test. But how can I get the bootstrapped p-value? Thank you very much! Below is an example: n=30

function works (boot.stepAIC ) but throws an error inside another function - environment issue?

我只是一个虾纸丫, submitted on 2019-12-21 17:07:49
Question: I noticed some strange behavior today in my R code. I tried a package {boot.StepAIC} which includes a bootstrap function for the results of stepwise regression with the AIC. However, I do not think the statistical background is the problem here (I hope so). I can use the function at the top level of R. This is my example code. require(MASS) require(boot.StepAIC) n<-100 x<-rnorm(n); y<-rnorm(n,sd=2); z<-rnorm(n,sd=3); res<-x+y+z+rnorm(n,sd=0.1) dat.test<-as.data.frame(cbind(x,y,z,res))

R bootstrap statistics by group for big data

烂漫一生, submitted on 2019-12-21 02:36:22
Question: I want to bootstrap a data set that has groups in it. A simple scenario would be bootstrapping simple means: data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200)>0.5)) stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2)), by = "group"]} boot(data, stat, R = 10) This gives me the error "incorrect number of subscripts on matrix", because of the by = "group" part. I managed to solve it using subsetting, but I don't like this solution. Is there a simpler way to make this