How to apply a Shapiro-Wilk test by groups in R?

萌比男神i 2021-01-24 10:27

I have a data frame in which all 90 variables hold integer data, like this:

code | variable1 | variable2 | variable3 | ...

AB | 2 | 3 | 10 |

3 Answers
  • 2021-01-24 10:34

    Using the mtcars data set that ships with R:

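    library(plyr)

    mydata <- mtcars
    # Run the test once per column and keep the W statistic and the p-value
    kk <- Map(function(x) {
      res <- shapiro.test(x)
      cbind(res$statistic, res$p.value)
    }, mydata)
    myout <- ldply(kk)  # stack the per-column results; column names land in .id
    names(myout) <- c("var", "W", "p.value")
    myout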
        var         W      p.value
    1   mpg 0.9475648 1.228816e-01
    2   cyl 0.7533102 6.058378e-06
    3  disp 0.9200127 2.080660e-02
    4    hp 0.9334191 4.880736e-02
    5  drat 0.9458838 1.100604e-01
    6    wt 0.9432578 9.265551e-02
    7  qsec 0.9732511 5.935208e-01
    8    vs 0.6322636 9.737384e-08
    9    am 0.6250744 7.836356e-08
    10 gear 0.7727857 1.306847e-05
    11 carb 0.8510972 4.382401e-04
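
    If you would rather not depend on plyr, the same per-column loop can be sketched in base R. This is only one possible variant, and the name `myout_base` is made up for illustration:

    ```r
    # One shapiro.test() call per column; each call yields c(W, p.value)
    res <- t(sapply(mtcars, function(x) {
      s <- shapiro.test(x)
      c(W = unname(s$statistic), p.value = s$p.value)
    }))

    # Move the row names (the column names of mtcars) into a regular column
    myout_base <- data.frame(var = rownames(res), res, row.names = NULL)
    myout_base
    ```

    It should give the same W and p.value columns as the plyr version above.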
    
  • 2021-01-24 10:41

    The answer by @GegznaV is excellent, but the tidyverse has since gained newer constructs: tidyr::pivot_longer replaces tidyr::gather, and the tidyverse authors now recommend the nest-unnest syntax.

    I also replaced broom::tidy with broom::glance, because it gives the statistics for more models (e.g. aov()).

    Here is @GegznaV's example rewritten in the updated tidyverse syntax:

    library(tidyverse)
    library(broom)
    
    mtcars %>% 
      select(-am, -wt) %>%
      pivot_longer(
        cols = everything(),
        names_to = "variable_name",
        values_to = "value"
      ) %>% 
      nest(data = -variable_name) %>% 
      mutate(
        shapiro = map(data, ~shapiro.test(.x$value)),
        glanced = map(shapiro, glance)
      ) %>% 
      unnest(glanced) %>% 
      select(variable_name, W = statistic, p.value) %>% 
      arrange(variable_name)
    
    

    which gives the same result:

    # A tibble: 9 x 3
      variable_name     W      p.value
      <chr>         <dbl>        <dbl>
    1 carb          0.851 0.000438    
    2 cyl           0.753 0.00000606  
    3 disp          0.920 0.0208      
    4 drat          0.946 0.110       
    5 gear          0.773 0.0000131   
    6 hp            0.933 0.0488      
    7 mpg           0.948 0.123       
    8 qsec          0.973 0.594       
    9 vs            0.632 0.0000000974
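
    The question asks about running the test by groups, so here is a rough sketch of how the same nest-unnest pattern might be extended to one test per group and variable. It assumes `cyl` stands in for the grouping column (the `code` column in the question), and the column selection and the name `by_group` are only for illustration:

    ```r
    library(dplyr)
    library(tidyr)
    library(purrr)
    library(broom)

    by_group <- mtcars %>%
      select(cyl, mpg, disp, hp) %>%   # cyl is the assumed grouping column
      pivot_longer(
        cols = -cyl,
        names_to = "variable_name",
        values_to = "value"
      ) %>%
      nest(data = value) %>%           # one row per cyl x variable combination
      mutate(
        shapiro = map(data, ~ shapiro.test(.x$value)),
        glanced = map(shapiro, glance)
      ) %>%
      unnest(glanced) %>%
      select(cyl, variable_name, W = statistic, p.value) %>%
      arrange(cyl, variable_name)

    by_group
    ```

    Keep in mind that shapiro.test() stops with an error if a group has fewer than 3 values or if all values in a group are identical, so very small or constant groups need to be filtered out first.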
    
  • 2021-01-24 10:56

    An example with the mtcars data:

    library(tidyverse)
    library(broom)
    
    mtcars %>% 
        select(-am, -wt) %>% # Remove unnecessary columns
        gather(key = "variable_name", value = "value") %>%
        group_by(variable_name)  %>% 
        do(broom::tidy(shapiro.test(.$value)))  %>% 
        ungroup()  %>% 
        select(variable_name, W = statistic, `p-value` = p.value)
    

    The result:

    # A tibble: 9 x 3
      variable_name     W    `p-value`
      <chr>         <dbl>        <dbl>
    1 carb          0.851 0.000438    
    2 cyl           0.753 0.00000606  
    3 disp          0.920 0.0208      
    4 drat          0.946 0.110       
    5 gear          0.773 0.0000131   
    6 hp            0.933 0.0488      
    7 mpg           0.948 0.123       
    8 qsec          0.973 0.594       
    9 vs            0.632 0.0000000974
    