How to compute P-value and standard error from correlation analysis of R's cor()

后端未结

关注

 2  2132

佛祖请我去吃肉 2021-02-13 11:04

I have data that contain 54 samples for each condition (x and y). I have computed the correlation the following way:

> dat <- read.table(\"http://dpaste.co


      
      
        
          2条回答        

        
                    
            
            
                         
                
              
              
                
                   孤独总比滥情好
                                             
                
                
                (楼主)
            
              
              
                2021-02-13 11:42
              

            
            
                        
I think that what you're looking for is simply the cor.test() function, which will return everything you're looking for except for the standard error of correlation. However, as you can see, the formula for that is very straightforward, and if you use cor.test, you have all the inputs required to calculate it. 

Using the data from the example (so you can compare it yourself with the results on page 14.6):

> cor.test(mydf$X, mydf$Y)

    Pearson's product-moment correlation

data:  mydf$X and mydf$Y
t = -5.0867, df = 10, p-value = 0.0004731
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.9568189 -0.5371871
sample estimates:
       cor 
-0.8492663 


If you wanted to, you could also create a function like the following to include the standard error of the correlation coefficient.

For convenience, here's the equation:



r = the correlation estimate and n - 2 = degrees of freedom, both of which are readily available in the output above. Thus, a simple function could be:

cor.test.plus <- function(x) {
  list(x, 
       Standard.Error = unname(sqrt((1 - x$estimate^2)/x$parameter)))
}


And use it as follows:

cor.test.plus(cor.test(mydf$X, mydf$Y))


Here, "mydf" is defined as:

mydf <- structure(list(Neighborhood = c("Fair Oaks", "Strandwood", "Walnut Acres",
  "Discov. Bay", "Belshaw", "Kennedy", "Cassell", "Miner", "Sedgewick", 
  "Sakamoto", "Toyon", "Lietz"), X = c(50L, 11L, 2L, 19L, 26L, 
  73L, 81L, 51L, 11L, 2L, 19L, 25L), Y = c(22.1, 35.9, 57.9, 22.2, 
  42.4, 5.8, 3.6, 21.4, 55.2, 33.3, 32.4, 38.4)), .Names = c("Neighborhood", 
  "X", "Y"), class = "data.frame", row.names = c(NA, -12L))

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它2个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复