I have a dataframe where all my 90 variables have integer data, of the type:
code | variable1 | variable2 | variable3 | ...
AB | 2 | 3 | 10 |
The answer by @GegznaV was excellent, but in the meantime the tidyverse has gained newer constructs such as tidyr::pivot_longer, which replaces tidyr::gather, and the tidyverse authors now recommend the nest-unnest syntax.
I also replaced broom::tidy with broom::glance, as it gives the statistics for more models (e.g. aov()).
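As a minimal sketch of what glance returns for a single test, assuming only the built-in mtcars data and the broom package (the object names fit and g are mine, for illustration only):

```r
library(broom)

# Shapiro-Wilk test on one built-in column
fit <- shapiro.test(mtcars$mpg)

# glance() condenses the test object into a one-row tibble
# that includes the statistic and p.value columns used later
g <- glance(fit)
g
```

Getting one row per test is what makes the result easy to unnest once it has been computed for every variable.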
Here is @GegznaV's example rewritten in the updated tidyverse syntax:
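library(tidyverse)
library(broom)

mtcars %>%
  # drop the am and wt columns
  select(-am, -wt) %>%
  # reshape to long format: one row per (variable, value) pair
  pivot_longer(
    cols = everything(),
    names_to = "variable_name",
    values_to = "value"
  ) %>%
  # collect the values of each variable into its own nested data frame
  nest(data = -variable_name) %>%
  mutate(
    shapiro = map(data, ~ shapiro.test(.x$value)),  # one test per variable
    glanced = map(shapiro, glance)                  # one-row summary per test
  ) %>%
  unnest(glanced) %>%
  select(variable_name, W = statistic, p.value) %>%
  arrange(variable_name)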
which gives the same result:
# A tibble: 9 x 3
  variable_name     W p.value
1 carb          0.851 0.000438
2 cyl           0.753 0.00000606
3 disp          0.920 0.0208
4 drat          0.946 0.110
5 gear          0.773 0.0000131
6 hp            0.933 0.0488
7 mpg           0.948 0.123
8 qsec          0.973 0.594
9 vs            0.632 0.0000000974
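For a data frame shaped like the one in the question (an identifier column named code plus integer columns), the same pipeline could probably be adapted along these lines. This is only a sketch: the toy data frame df, its ID values, and the Poisson-generated integers are made up for illustration, and the only real change is that pivot_longer keeps code out of the reshaping via cols = -code.

```r
library(tidyverse)
library(broom)

# Hypothetical stand-in for the real data: an id column plus integer columns
set.seed(1)
df <- tibble(
  code      = paste0("ID", 1:20),
  variable1 = rpois(20, 3),
  variable2 = rpois(20, 5),
  variable3 = rpois(20, 10)
)

results <- df %>%
  # reshape every column except the identifier
  pivot_longer(
    cols = -code,
    names_to = "variable_name",
    values_to = "value"
  ) %>%
  nest(data = -variable_name) %>%
  mutate(
    shapiro = map(data, ~ shapiro.test(.x$value)),
    glanced = map(shapiro, glance)
  ) %>%
  unnest(glanced) %>%
  select(variable_name, W = statistic, p.value)

results
```

With 90 variables the result would simply have 90 rows, one per column. Note that shapiro.test stops with an error if a column has fewer than 3 non-missing values or if all its values are identical, so such columns would need to be dropped first.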