R: t-test over all columns

后端未结

I tried to run a t-test on all columns of my data frame (two at a time) and extract only the p-value. Here is what I have come up with:

for (i in c(5:525) ) {


                      
5 answers

愿得一人, 2020-11-28 14:29

Assuming your data frame looks something like this:

df = data.frame(a=runif(100),
                b=runif(100),
                c=runif(100),
                d=runif(100),
                e=runif(100),
                f=runif(100))


then the following

tests = lapply(seq(1,length(df),by=2),function(x){t.test(df[,x],df[,x+1])})


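will give you one test per non-overlapping pair of columns. Note that this only tests a & b, c & d, and e & f, and it assumes an even number of columns (with an odd number, the last call would index a column that does not exist).
If you want a & b, b & c, c & d, d & e, and e & f instead, you would have to do: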

tests = lapply(seq(1,(length(df)-1)),function(x){t.test(df[,x],df[,x+1])})      


Finally, if you only want the p-values from your tests, you can do this:

pvals = sapply(tests, function(x){x$p.value})


If you are not sure how to work with an object, try typing summary(tests) and str(tests[[1]]). In this case tests is a list of htest objects, and what you want to know is the structure of a single htest object, not of the list.
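As a minimal sketch of the adjacent-pairs variant, the p-values can also be labeled with the column pair they came from. The seed, the four-column data frame, and the names `idx` and `pvals` here are illustrative assumptions, and `vapply` is swapped in for `sapply` only to make the return type explicit.

```r
# Illustrative data; set.seed() only makes the sketch reproducible.
set.seed(1)
df <- data.frame(a = runif(100), b = runif(100),
                 c = runif(100), d = runif(100))

# Index of the left-hand column of each adjacent pair: 1, 2, 3.
idx <- seq_len(ncol(df) - 1)

# vapply() checks that every test yields exactly one numeric p-value.
pvals <- vapply(idx,
                function(i) t.test(df[[i]], df[[i + 1]])$p.value,
                numeric(1))

# Label each p-value with its pair, e.g. "a vs b".
names(pvals) <- paste(names(df)[idx], "vs", names(df)[idx + 1])
pvals
```

Naming the vector makes it harder to lose track of which comparison a given p-value belongs to once there are hundreds of columns.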

Hope this helped! 
                                                                        
                                                        
            
            
              
                
傲寒, 2020-11-28 14:33

I would recommend converting your data frame to long format and using pairwise.t.test with an appropriate p.adjust method:

> library(reshape2)
> 
> df <- data.frame(a=runif(100),
+          b=runif(100),
+          c=runif(100)+0.5,
+          d=runif(100)+0.5,
+          e=runif(100)+1,
+          f=runif(100)+1)
> 
> d <- melt(df)
Using  as id variables
> 
> pairwise.t.test(d$value, d$variable, p.adjust = "none")

    Pairwise comparisons using t tests with pooled SD 

data:  d$value and d$variable 

  a      b      c      d      e   
b 0.86   -      -      -      -   
c <2e-16 <2e-16 -      -      -   
d <2e-16 <2e-16 0.73   -      -   
e <2e-16 <2e-16 <2e-16 <2e-16 -   
f <2e-16 <2e-16 <2e-16 <2e-16 0.63

P value adjustment method: none 
> pairwise.t.test(d$value, d$variable, p.adjust = "bon")

    Pairwise comparisons using t tests with pooled SD 

data:  d$value and d$variable 

  a      b      c      d      e
b 1      -      -      -      -
c <2e-16 <2e-16 -      -      -
d <2e-16 <2e-16 1      -      -
e <2e-16 <2e-16 <2e-16 <2e-16 -
f <2e-16 <2e-16 <2e-16 <2e-16 1

P value adjustment method: bonferroni 
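If only the numbers are needed, the object returned by `pairwise.t.test` stores them in its `p.value` element. The sketch below is one possible way to get at them under a few assumptions: it uses base R `stack()` in place of `reshape2::melt`, a smaller three-column data frame, and `pool.sd = FALSE`, which was not used in the output above and runs a separate unpooled test for each pair.

```r
# Illustrative data; set.seed() only makes the sketch reproducible.
set.seed(1)
df <- data.frame(a = runif(100),
                 b = runif(100),
                 c = runif(100) + 0.5)

# stack() gives long format in base R: columns `values` and `ind`.
long <- stack(df)

# pool.sd = FALSE tests each pair separately instead of pooling the SD.
res <- pairwise.t.test(long$values, long$ind,
                       p.adjust.method = "bonferroni",
                       pool.sd = FALSE)

# Lower-triangular matrix of adjusted p-values; unused cells are NA.
res$p.value
```

The rows of the matrix are named after all groups but the first, and the columns after all groups but the last, which matches the printed layout above.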

                                                                        
                                                        
            
            
              
                
天涯浪人, 2020-11-28 14:35

Try this one, which tests every pair of columns using combn and plyr's adply:

X <- rnorm(n=50, mean = 10, sd = 5)
Y <- rnorm(n=50, mean = 15, sd = 6)
Z <- rnorm(n=50, mean = 20, sd = 5)
Data <- data.frame(X, Y, Z)

library(plyr)

combos <- combn(ncol(Data),2)

adply(combos, 2, function(x) {
  test <- t.test(Data[, x[1]], Data[, x[2]])

  out <- data.frame("var1" = colnames(Data)[x[1]]
                    , "var2" = colnames(Data[x[2]])
                    , "t.value" = sprintf("%.3f", test$statistic)
                    ,  "df"= test$parameter
                    ,  "p.value" = sprintf("%.3f", test$p.value)
                    )
  return(out)

})



  X1 var1  var2 t.value       df p.value
1  1   X      Y  -5.598 92.74744   0.000
2  2   X      Z  -9.361 90.12561   0.000
3  3   Y      Z  -3.601 97.62511   0.000
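The same all-pairs idea can be sketched without plyr. This is an illustrative base R variant, not the answer's own code: it keeps the p-values numeric rather than formatting them with `sprintf`, so very small values are not displayed as 0.000. The seed and the names `pairs`, `pvals`, and `result` are assumptions.

```r
# Illustrative data; set.seed() only makes the sketch reproducible.
set.seed(1)
Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                   Y = rnorm(50, mean = 15, sd = 6),
                   Z = rnorm(50, mean = 20, sd = 5))

# 2 x 3 character matrix: one column per unordered pair of column names.
pairs <- combn(names(Data), 2)

# One Welch t-test per pair, keeping only the p-value.
pvals <- apply(pairs, 2,
               function(p) t.test(Data[[p[1]]], Data[[p[2]]])$p.value)

result <- data.frame(var1 = pairs[1, ],
                     var2 = pairs[2, ],
                     p.value = pvals)
result
```

Use `format.pval(result$p.value)` or `signif()` at print time if a compact display is wanted.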

                                                                        
                                                        
            
            
              
                
你的背包, 2020-11-28 14:44

Here is another solution, with outer.

outer( 
  1:ncol(Data), 1:ncol(Data), 
  Vectorize(
    function (i,j) t.test(Data[,i], Data[,j])$p.value
  ) 
)
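A sketch of how the resulting matrix might be tidied up, assuming the same kind of `Data` frame as in the previous answer (recreated here so the block runs on its own). The helper name `p_of_pair` and the variable `pmat` are hypothetical. Note that `outer` also tests each column against itself, so the diagonal carries no information and is blanked here.

```r
# Illustrative data; set.seed() only makes the sketch reproducible.
set.seed(1)
Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                   Y = rnorm(50, mean = 15, sd = 6),
                   Z = rnorm(50, mean = 20, sd = 5))

# p-value of a two-sample t-test between columns i and j.
p_of_pair <- function(i, j) t.test(Data[[i]], Data[[j]])$p.value

# outer() passes whole index vectors, hence the Vectorize() wrapper.
pmat <- outer(seq_along(Data), seq_along(Data), Vectorize(p_of_pair))
dimnames(pmat) <- list(names(Data), names(Data))

# A column compared with itself is not a meaningful test.
diag(pmat) <- NA
pmat
```

The matrix is symmetric, so every test is computed twice; for many columns, a `combn`-based loop over unordered pairs does half the work.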

                                                                        
                                                        
            
            
              
                
Happy的楠姐, 2020-11-28 14:48

I run this:

tres<-apply(x,1,t.test)
pval<-vapply(tres, "[[", 0, i = "p.value")


It took me a while to divine the "vapply" trick to pull the pvals out of the t.test result object list. (I edited this from 'sapply' because of Henrik's comment below) 

If it's a paired t-test, you can just subtract and test whether the mean difference is 0, which gives exactly the same result (that's all a paired t.test is):

tres<-apply(y-x,1,t.test)
pval<-vapply(tres, "[[", 0, i = "p.value")


Again, this is a per-row t-test: each row of x (or of y-x) is tested using the values from all of its columns.
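The equivalence between a paired test and a one-sample test on the differences can be checked with a small sketch. The matrices `x` and `y` below are hypothetical stand-ins (5 rows by 10 columns) for the answer's data, and the names `p_paired` and `p_diff` are illustrative.

```r
# Hypothetical data: 5 rows (one test each) by 10 columns (observations).
set.seed(1)
x <- matrix(rnorm(50, mean = 10), nrow = 5)
y <- matrix(rnorm(50, mean = 11), nrow = 5)

# Paired t-test on each row.
p_paired <- vapply(seq_len(nrow(x)),
                   function(r) t.test(y[r, ], x[r, ], paired = TRUE)$p.value,
                   numeric(1))

# One-sample t-test of each row of differences against a mean of 0.
p_diff <- apply(y - x, 1, function(d) t.test(d)$p.value)

# TRUE if both approaches give the same p-values.
all.equal(p_paired, p_diff)
```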
                                                                        
                                                        
            
            
              
                