R: t-test over all columns

醉话见心 2020-11-28 14:22

I am trying to run a t-test on all columns of my data frame (two at a time) and extract only the p-values. Here is what I have come up with:

for (i in c(5:525) ) {
         


        
5 answers
  • 2020-11-28 14:29

    Assuming your data frame looks something like this:

    df = data.frame(a=runif(100),
                    b=runif(100),
                    c=runif(100),
                    d=runif(100),
                    e=runif(100),
                    f=runif(100))
    

    then the following

    tests = lapply(seq(1,length(df),by=2),function(x){t.test(df[,x],df[,x+1])})
    

    will give you a test for each pair of columns. Note that this only gives you a t-test for a & b, c & d, and e & f. If you want a & b, b & c, c & d, d & e, and e & f, then you would have to do:

    tests = lapply(seq(1,(length(df)-1)),function(x){t.test(df[,x],df[,x+1])})      
    

    Finally, if you only want the p-values from your tests, you can do this:

    pvals = sapply(tests, function(x){x$p.value})
    

    If you are not sure how to work with an object, try typing summary(tests) and str(tests[[1]]). In this case tests is a list of htest objects, and you want to know the structure of an htest object, not necessarily of the list.
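
    As a minimal sketch of the adjacent-pairs variant above, the p-values can also be collected in one step and labelled with the column pair they belong to. The seed, the four-column data frame and the "_vs_" label format are assumptions made for illustration. Note that the every-second-column version (by=2) assumes an even number of columns; with an odd number, the last x+1 index points past the end of the data frame.

    ```r
    set.seed(1)  # assumption: fixed seed so the sketch is reproducible
    df <- data.frame(a = runif(100), b = runif(100),
                     c = runif(100), d = runif(100))

    # Indices of the left-hand column of each adjacent pair: 1, 2, 3
    idx <- seq_len(ncol(df) - 1)

    # vapply() checks that every test yields exactly one numeric p-value
    pvals <- vapply(idx,
                    function(i) t.test(df[[i]], df[[i + 1]])$p.value,
                    numeric(1))

    # Label each p-value with the pair of columns it compares
    names(pvals) <- paste(names(df)[idx], names(df)[idx + 1], sep = "_vs_")
    pvals
    ```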

    Hope this helped!

  • 2020-11-28 14:33

    I would recommend converting your data frame to long format and using pairwise.t.test with an appropriate p.adjust method:

    > library(reshape2)
    > 
    > df <- data.frame(a=runif(100),
    +          b=runif(100),
    +          c=runif(100)+0.5,
    +          d=runif(100)+0.5,
    +          e=runif(100)+1,
    +          f=runif(100)+1)
    > 
    > d <- melt(df)
    Using  as id variables
    > 
    > pairwise.t.test(d$value, d$variable, p.adjust = "none")
    
        Pairwise comparisons using t tests with pooled SD 
    
    data:  d$value and d$variable 
    
      a      b      c      d      e   
    b 0.86   -      -      -      -   
    c <2e-16 <2e-16 -      -      -   
    d <2e-16 <2e-16 0.73   -      -   
    e <2e-16 <2e-16 <2e-16 <2e-16 -   
    f <2e-16 <2e-16 <2e-16 <2e-16 0.63
    
    P value adjustment method: none 
    > pairwise.t.test(d$value, d$variable, p.adjust = "bon")
    
        Pairwise comparisons using t tests with pooled SD 
    
    data:  d$value and d$variable 
    
      a      b      c      d      e
    b 1      -      -      -      -
    c <2e-16 <2e-16 -      -      -
    d <2e-16 <2e-16 1      -      -
    e <2e-16 <2e-16 <2e-16 <2e-16 -
    f <2e-16 <2e-16 <2e-16 <2e-16 1
    
    P value adjustment method: bonferroni 
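
    The printed table is also available programmatically: the object returned by pairwise.t.test has a p.value element, a lower-triangular matrix with NA above the diagonal. A minimal sketch, assuming base R's stack() in place of melt() and a smaller three-column data frame. With pool.sd = FALSE each pair is tested on its own with t.test, so the unadjusted values should match a plain two-sample t.test rather than the pooled-SD results shown above.

    ```r
    set.seed(1)  # assumption: fixed seed so the sketch is reproducible
    df <- data.frame(a = runif(100), b = runif(100), c = runif(100) + 0.5)

    # stack() is base R: it returns columns `values` (numeric) and `ind` (factor)
    long <- stack(df)

    # Unadjusted p-values, one separate t.test per pair (no pooled SD)
    raw <- pairwise.t.test(long$values, long$ind,
                           p.adjust.method = "none", pool.sd = FALSE)
    pmat <- raw$p.value   # rows b, c; columns a, b
    pmat["b", "a"]        # p-value for a vs b

    # Same comparisons with a Bonferroni correction
    adj <- pairwise.t.test(long$values, long$ind,
                           p.adjust.method = "bonferroni", pool.sd = FALSE)$p.value
    adj
    ```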
    
  • 2020-11-28 14:35

    Try this, which uses combn to enumerate every pair of columns:

    X <- rnorm(n=50, mean = 10, sd = 5)
    Y <- rnorm(n=50, mean = 15, sd = 6)
    Z <- rnorm(n=50, mean = 20, sd = 5)
    Data <- data.frame(X, Y, Z)
    
    library(plyr)
    
    combos <- combn(ncol(Data),2)
    
    adply(combos, 2, function(x) {
      test <- t.test(Data[, x[1]], Data[, x[2]])
    
      out <- data.frame("var1" = colnames(Data)[x[1]]
                        , "var2" = colnames(Data)[x[2]]
                        , "t.value" = sprintf("%.3f", test$statistic)
                        ,  "df"= test$parameter
                        ,  "p.value" = sprintf("%.3f", test$p.value)
                        )
      return(out)
    
    })
    
    
    
      X1 var1  var2 t.value       df p.value
    1  1   X      Y  -5.598 92.74744   0.000
    2  2   X      Z  -9.361 90.12561   0.000
    3  3   Y      Z  -3.601 97.62511   0.000
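
    Formatting with sprintf("%.3f", ...) turns the p-values into strings, and very small values print as 0.000. If the numbers are needed for further work, a base-R sketch along the same lines can keep them numeric. The seed and the result column names are assumptions for illustration, and plyr is not required here.

    ```r
    set.seed(1)  # assumption: fixed seed so the sketch is reproducible
    Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                       Y = rnorm(50, mean = 15, sd = 6),
                       Z = rnorm(50, mean = 20, sd = 5))

    # All unordered pairs of column names, one pair per matrix column
    pairs <- combn(names(Data), 2)

    res <- data.frame(
      var1 = pairs[1, ],
      var2 = pairs[2, ],
      # One Welch t-test per pair; the p-value stays numeric
      p.value = apply(pairs, 2, function(p) {
        t.test(Data[[p[1]]], Data[[p[2]]])$p.value
      })
    )
    res
    ```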
    
  • 2020-11-28 14:44

    Here is another solution, with outer.

    outer( 
      1:ncol(Data), 1:ncol(Data), 
      Vectorize(
        function (i,j) t.test(Data[,i], Data[,j])$p.value
      ) 
    )
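
    A minimal sketch of the same outer() call with row and column names attached, so each p-value can be looked up by variable name. The Data frame and the seed are assumptions for illustration. The result is symmetric, and the diagonal compares each column with itself, so only the off-diagonal entries are informative.

    ```r
    set.seed(1)  # assumption: fixed seed so the sketch is reproducible
    Data <- data.frame(X = rnorm(50, mean = 10, sd = 5),
                       Y = rnorm(50, mean = 15, sd = 6),
                       Z = rnorm(50, mean = 20, sd = 5))

    cols <- seq_along(Data)

    # Vectorize() lets outer() call the scalar function once per (i, j) pair
    pmat <- outer(cols, cols,
                  Vectorize(function(i, j) t.test(Data[[i]], Data[[j]])$p.value))
    dimnames(pmat) <- list(names(Data), names(Data))

    pmat["X", "Y"]  # p-value for X vs Y
    ```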
    
  • 2020-11-28 14:48

    I run this:

    tres<-apply(x,1,t.test)
    pval<-vapply(tres, "[[", 0, i = "p.value")
    

    It took me a while to divine the "vapply" trick to pull the pvals out of the t.test result object list. (I edited this from 'sapply' because of Henrik's comment below)

    If it's a paired t-test, you can just subtract and test whether the mean difference is 0, which gives exactly the same result (that's all a paired t-test is):

    tres<-apply(y-x,1,t.test)
    pval<-vapply(tres, "[[", 0, i = "p.value")
    

    Again this is a per-row t-test over all columns.
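
    To make the per-row idea concrete, here is a small sketch in which x and y are hypothetical numeric matrices with matching dimensions (they are not defined in the snippets above). It also checks the claim that testing the row-wise differences agrees with a paired t-test on the same rows.

    ```r
    set.seed(1)  # assumption: fixed seed so the sketch is reproducible
    x <- matrix(rnorm(40, mean = 1.0), nrow = 4)  # hypothetical: 4 rows, 10 columns
    y <- matrix(rnorm(40, mean = 1.5), nrow = 4)  # hypothetical: same shape as x

    # One-sample t-test on each row of differences; apply() returns a list of htest
    tres <- apply(y - x, 1, t.test)
    pval <- vapply(tres, "[[", 0, i = "p.value")

    # Paired t-test row by row, for comparison
    paired <- vapply(seq_len(nrow(x)),
                     function(r) t.test(y[r, ], x[r, ], paired = TRUE)$p.value,
                     numeric(1))

    all.equal(pval, paired)
    ```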
