Bootstrapped correlation in R

前端未结

关注

 1  1657

I am trying to do a bootstrapped correlation in R. I have two variables Var1 and Var2 and I want to get the bootstrapped p.value of the Pearson correlation.

my


                      
              相关标签:


      
      
        
          1条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  遇见更好的自我        
                
              
                            
                2021-01-27 08:31
              
            
            
                                                                       
If you want to bootstrap your correlation test, you only need to return the correlation coefficient from your bootstrap statistic function. Bootstrapping the p-value of the correlation test is not appropriate in this case, as you ignore the directionality of the correlation test.

Check this question on CrossValidated for some nice answers on performing bootstrap hypothesis tests: https://stats.stackexchange.com/questions/20701/computing-p-value-using-bootstrap-with-r

library("boot")
data <- read.csv("~/Documents/stack/tmp.csv", header = FALSE)
colnames(data) <- c("x", "y")

data <- as.data.frame(data)
x <- data$Var1
y <- data$Var2
dat <- data.frame(x,y)

set.seed(1)

b3 <- boot(data, 
  statistic = function(data, i) {
    cor(data[i, "x"], data[i, "y"], method='pearson')
  },
  R = 1000
)
b3
#> 
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#> 
#> 
#> Call:
#> boot(data = data, statistic = function(data, i) {
#>     cor(data[i, "x"], data[i, "y"], method = "pearson")
#> }, R = 1000)
#> 
#> 
#> Bootstrap Statistics :
#>      original        bias    std. error
#> t1* 0.1279691 -0.0004316781    0.314056
boot.ci(b3, type = c("norm", "basic", "perc", "bca")) #bootstrapped CI. 
#> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
#> Based on 1000 bootstrap replicates
#> 
#> CALL : 
#> boot.ci(boot.out = b3, type = c("norm", "basic", "perc", "bca"))
#> 
#> Intervals : 
#> Level      Normal              Basic         
#> 95%   (-0.4871,  0.7439 )   (-0.4216,  0.7784 )  
#> 
#> Level     Percentile            BCa          
#> 95%   (-0.5225,  0.6775 )   (-0.5559,  0.6484 )  
#> Calculations and Intervals on Original Scale

plot(density(b3$t))
abline(v = 0, lty = "dashed", col = "grey60")




In this case without a p-value it's quite safe to say that most of the mass of the sampling distribution is very close to zero.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复