Bootstrapped correlation in R

猫巷女王i 2021-01-27 08:01

I am trying to do a bootstrapped correlation in R. I have two variables Var1 and Var2 and I want to get the bootstrapped p.value of the Pearson correlation.

my          


        
1 answer
  • 2021-01-27 08:31

    If you want to bootstrap your correlation test, your bootstrap statistic function only needs to return the correlation coefficient. Bootstrapping the p-value of the correlation test is not appropriate here, because the p-value ignores the direction of the correlation.
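    To illustrate the point about direction, here is a minimal sketch on simulated data (the variables `x` and `y` below are hypothetical, not the question's file). Flipping the sign of one variable reverses the correlation but leaves the two-sided p-value of `cor.test()` unchanged, so a distribution of resampled p-values cannot tell a positive association from a negative one.

```r
# Hypothetical data, not the question's file.
set.seed(1)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)

pos <- cor.test(x, y, method = "pearson")
neg <- cor.test(x, -y, method = "pearson")  # same data, association reversed

c(pos$estimate, neg$estimate)  # equal magnitude, opposite sign
c(pos$p.value, neg$p.value)    # identical two-sided p-values
```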

    Check this question on CrossValidated for some nice answers on performing bootstrap hypothesis tests: https://stats.stackexchange.com/questions/20701/computing-p-value-using-bootstrap-with-r

    library("boot")

    # Read the two columns and name them x and y.
    # read.csv() already returns a data frame, so no conversion is needed.
    data <- read.csv("~/Documents/stack/tmp.csv", header = FALSE)
    colnames(data) <- c("x", "y")
    
    set.seed(1)
    
    b3 <- boot(data, 
      statistic = function(data, i) {
        cor(data[i, "x"], data[i, "y"], method='pearson')
      },
      R = 1000
    )
    b3
    #> 
    #> ORDINARY NONPARAMETRIC BOOTSTRAP
    #> 
    #> 
    #> Call:
    #> boot(data = data, statistic = function(data, i) {
    #>     cor(data[i, "x"], data[i, "y"], method = "pearson")
    #> }, R = 1000)
    #> 
    #> 
    #> Bootstrap Statistics :
    #>      original        bias    std. error
    #> t1* 0.1279691 -0.0004316781    0.314056
    boot.ci(b3, type = c("norm", "basic", "perc", "bca"))  # bootstrapped confidence intervals
    #> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    #> Based on 1000 bootstrap replicates
    #> 
    #> CALL : 
    #> boot.ci(boot.out = b3, type = c("norm", "basic", "perc", "bca"))
    #> 
    #> Intervals : 
    #> Level      Normal              Basic         
    #> 95%   (-0.4871,  0.7439 )   (-0.4216,  0.7784 )  
    #> 
    #> Level     Percentile            BCa          
    #> 95%   (-0.5225,  0.6775 )   (-0.5559,  0.6484 )  
    #> Calculations and Intervals on Original Scale
    
    plot(density(b3$t))
    abline(v = 0, lty = "dashed", col = "grey60")
    

    Even without a p-value, it is fairly safe to say that the correlation is indistinguishable from zero here: the estimate is 0.13 with a bootstrap standard error of 0.31, and all four 95% intervals include zero.
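    If a p-value is still required, one common alternative is a permutation test rather than a bootstrap of the test's p-value. The following is only a sketch under stated assumptions: the data frame `sim` is simulated as a stand-in for the question's data, and the `(1 + count) / (R + 1)` formula is one conventional way to form a two-sided Monte Carlo p-value. It uses the `sim = "permutation"` option of `boot()`, shuffling only `y` so that any association with `x` is broken under the null hypothesis.

```r
library(boot)

# Hypothetical stand-in for the question's data: two independent variables.
set.seed(1)
sim <- data.frame(x = rnorm(30), y = rnorm(30))

# With sim = "permutation", i is a random permutation of the row indices.
# Permuting only y simulates the null hypothesis of no association.
perm <- boot(sim,
  statistic = function(d, i) cor(d$x, d[i, "y"], method = "pearson"),
  R = 1000,
  sim = "permutation"
)

# Two-sided p-value: share of permuted correlations at least as extreme
# as the observed one (the +1 terms count the observed sample itself).
p_value <- (1 + sum(abs(perm$t) >= abs(perm$t0))) / (perm$R + 1)
p_value
```

    Here `perm$t0` is the correlation in the unpermuted data and `perm$t` holds the 1000 correlations obtained after shuffling.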
