Use dplyr's group_by to perform split-apply-combine

Posted by 半世苍凉 on 2019-11-29 11:16:53

I can see two ways to do it, depending on how you want to use the output. You could pull out just the p-values from shapiro.test in summarise. Alternatively, you could use do and save the full result of each test in a list.

library(dplyr)

With summarise, pulling out just the p-values:

iris %>%
    group_by(Species) %>%
    summarise(stest = shapiro.test(Petal.Length)$p.value)

     Species      stest
1     setosa 0.05481147
2 versicolor 0.15847784
3  virginica 0.10977537

Using do:

tests = iris %>%
    group_by(Species) %>%
    do(test = shapiro.test(.$Petal.Length))

# Resulting list
tests$test

[[1]]

    Shapiro-Wilk normality test

data:  .$Petal.Length
W = 0.955, p-value = 0.05481


[[2]]

    Shapiro-Wilk normality test

data:  .$Petal.Length
W = 0.966, p-value = 0.1585


[[3]]

    Shapiro-Wilk normality test

data:  .$Petal.Length
W = 0.9622, p-value = 0.1098
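
Once the tests are stored in the list column, individual values can be pulled back out of each htest object. A minimal sketch, assuming the same do() call as above (repeated here so the snippet runs on its own); p_values is just an illustrative name:

```r
library(dplyr)

# Store the full shapiro.test() result for each species in a list column
tests <- iris %>%
    group_by(Species) %>%
    do(test = shapiro.test(.$Petal.Length))

# Extract the p-value from each stored test and label it by species
p_values <- sapply(tests$test, function(x) x$p.value)
names(p_values) <- as.character(tests$Species)
p_values
```

The same pattern works for other components of the test object, such as x$statistic.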
Bastiaan Quast

If you use the tidy() function from the broom package to turn the output of shapiro.test() into a data.frame, then you can use do() to return one row per group.

library(broom)  # provides tidy()

iris %>%
  group_by(Species) %>%
  do(tidy(shapiro.test(.$Petal.Length)))

This gives you:

Source: local data frame [3 x 4]
Groups: Species [3]

     Species statistic    p.value                      method
      <fctr>     <dbl>      <dbl>                      <fctr>
1     setosa 0.9549768 0.05481147 Shapiro-Wilk normality test
2 versicolor 0.9660044 0.15847784 Shapiro-Wilk normality test
3  virginica 0.9621864 0.10977537 Shapiro-Wilk normality test

This is adapted from my answer here.
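
In dplyr 1.0.0 and later, do() is marked as superseded. A possible equivalent, assuming dplyr 0.8.1 or newer (where group_modify() is available) and the same tidy() call from broom, might look like this; res is an illustrative name:

```r
library(dplyr)
library(broom)

# group_modify() applies the function to each group's data (.x)
# and binds the returned data frames, keeping the grouping column
res <- iris %>%
  group_by(Species) %>%
  group_modify(~ tidy(shapiro.test(.x$Petal.Length)))

res
```

The result should again have one row per species, with the statistic, p.value and method columns produced by tidy().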
