Bootstrapped p-value for a correlation coefficient on R

Submitted by 陌路散爱 on 2019-12-21 22:39:25

Question


In R, I used the bootstrap method to estimate a correlation coefficient and its confidence interval. To get the p-value, I thought I could calculate the proportion of the confidence intervals that do not contain zero, but this is not the solution.

How can I get the p-value in this case?

I am using cor.test to estimate the coefficient. cor.test also gives me a p-value for every test. But how can I get the bootstrapped p-value?

Thank you very much!

Below is an example:

n = 30
Brep = 1000  # number of bootstrap replicates; undefined in the original post, 1000 is an assumed value
data = matrix(data = c(rnorm(n), rnorm(n), rnorm(n), rpois(n, 1),
                       rbinom(n, 1, 0.6)), nrow = n, byrow = FALSE)
data = as.data.frame(data)
# Each column of z1 holds the row indices of one bootstrap resample
z1  = replicate(Brep, sample(1:dim(data)[1], dim(data)[1], replace = TRUE))
# One row per resample: column 1 = p-value, column 2 = correlation estimate
res = do.call(rbind, apply(z1, 2, function(x) {
    res = cor.test(data$V1[x], data$V2[x])
    return(list(res$p.value, res$estimate))
}))

coeffcorr  = mean(unlist(res[, 2]), na.rm = TRUE)  # bootstrapped coefficient
confInter1 = quantile(unlist(res[, 2]), c(0.025, 0.975), na.rm = TRUE)[1]  # lower confidence limit
confInter2 = quantile(unlist(res[, 2]), c(0.025, 0.975), na.rm = TRUE)[2]  # upper confidence limit
p.value    = mean(unlist(res[, 1]), na.rm = TRUE)  # mean of the per-resample p-values

Answer 1:


The standard way of bootstrapping in R is to use the boot package, which ships with R as a recommended package. You start by defining the bootstrap statistic function, a function that takes two arguments: the dataset and a vector of indices into the dataset. This is the function bootCorTest below. Inside the function, you subset the dataset, selecting just the rows given by the index.

The rest is straightforward.

library(boot)

bootCorTest <- function(data, i){
    d <- data[i, ]
    cor.test(d$x, d$y)$p.value
}


# First dataset in help("cor.test")
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
dat <- data.frame(x, y)

b <- boot(dat, bootCorTest, R = 1000)

b$t0
#[1] 0.10817

mean(b$t)
#[1] 0.134634

boot.ci(b)

For more information on the results of functions boot and boot.ci see their respective help pages.
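The question also asks for a confidence interval for the coefficient and for a resampling-based p-value. The following is only a sketch of one possible approach, not the method used in this answer. It bootstraps the correlation coefficient itself for a percentile interval, and then uses a permutation scheme (y shuffled relative to x, via the sim = "permutation" option of boot) to approximate the null distribution. Strictly speaking this is a permutation test rather than a bootstrap test. The names bootCor, permCor, bCor, bNull and p.resampled are made up for this illustration.

```r
library(boot)

set.seed(7612)  # arbitrary seed, only to make the run reproducible

# Same data as above (first dataset in help("cor.test"))
x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
dat <- data.frame(x, y)

# Statistic for the ordinary bootstrap: the correlation of the resampled rows
bootCor <- function(data, i){
    d <- data[i, ]
    cor(d$x, d$y)
}

bCor <- boot(dat, bootCor, R = 1000)
ci <- boot.ci(bCor, type = "perc")   # percentile interval for the correlation
ci

# Statistic for the permutation scheme: x stays fixed and y is reordered by i,
# which breaks the pairing and mimics the null hypothesis of no association
permCor <- function(data, i) cor(data$x, data$y[i])

bNull <- boot(dat, permCor, R = 1000, sim = "permutation")

# Two-sided p-value: share of permuted correlations at least as extreme as the
# observed one, with the usual +1 correction so that it is never exactly zero
p.resampled <- (1 + sum(abs(bNull$t) >= abs(bNull$t0))) / (bNull$R + 1)
p.resampled
```

Here bNull$t0 is the correlation of the original, unpermuted data, so it serves as the observed value against which the permuted correlations are compared.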

EDIT.

If you want to return several values from the boot statistic function bootCorTest, you should return a vector. In the following case it returns a named vector with the values required.

Note that I set the RNG seed, to make the results reproducible. I should already have done it above.

set.seed(7612)    # Make the results reproducible

bootCorTest2 <- function(data, i){
    d <- data[i, ]
    res <- cor.test(d$x, d$y)
    c(stat = res$statistic, p.value = res$p.value)
}


b2 <- boot(dat, bootCorTest2, R = 1000)

b2$t0
#  stat.t  p.value 
#1.841083 0.108173


colMeans(b2$t)
#[1] 2.869479 0.133857
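When the statistic function returns several values, b2$t is a matrix with one column per value, in the same order as b2$t0, and its columns are unnamed. As a small sketch, the columns can be labelled from the names of b2$t0, and boot.ci can be pointed at a single column with its index argument. The names tmat and ci.p are made up for this illustration, and whether an interval for a p-value is a useful quantity is a separate question.

```r
library(boot)

set.seed(7612)    # Make the results reproducible

x <- c(44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1)
y <- c( 2.6,  3.1,  2.5,  5.0,  3.6,  4.0,  5.2,  2.8,  3.8)
dat <- data.frame(x, y)

bootCorTest2 <- function(data, i){
    d <- data[i, ]
    res <- cor.test(d$x, d$y)
    c(stat = res$statistic, p.value = res$p.value)
}

b2 <- boot(dat, bootCorTest2, R = 1000)

# Copy the replicate matrix and label its columns after the names of t0
tmat <- b2$t
colnames(tmat) <- names(b2$t0)
colMeans(tmat)

# Percentile interval for the second returned value only (the p-value)
ci.p <- boot.ci(b2, index = 2, type = "perc")
ci.p
```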


Source: https://stackoverflow.com/questions/51761415/bootstrapped-p-value-for-a-correlation-coefficient-on-r
