Using shapiro.test on multiple columns in a data frame


伪装坚强ぢ 2020-12-28 10:08

It seems like a pretty simple question, but I can't find the answer.

I have a data frame (let's call it df) containing n=100 columns (C1, <

3 Answers

囚心锁ツ (original poster)

2020-12-28 10:37

To apply a function over the rows or columns of a data frame, use one of the apply family of functions:

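# Example data: two columns of 100 normally distributed values
df <- data.frame(a=rnorm(100), b=rnorm(100))
# MARGIN = 2 runs shapiro.test on each column
df.shapiro <- apply(df, 2, shapiro.test)
df.shapiro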
$a

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.9895, p-value = 0.6276


$b

    Shapiro-Wilk normality test

data:  newX[, i]
W = 0.9854, p-value = 0.3371

Note that column names are preserved, and df.shapiro is a named list.
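
As a side note, apply first converts the data frame to a matrix, which is why the output above reports the data as newX[, i]. Since a data frame is itself a list of columns, lapply can be used on it directly. The following is only a minimal sketch, assuming every column is numeric and has between 3 and 5000 non-missing values (which shapiro.test requires); the name df.shapiro2 and the seed are chosen here for illustration:

```r
# Illustrative data, mirroring the df used above (the seed is an assumption)
set.seed(1)
df <- data.frame(a = rnorm(100), b = rnorm(100))

# A data frame is a list of columns, so lapply() visits each column in turn
# without converting the whole data frame to a matrix first
df.shapiro2 <- lapply(df, shapiro.test)

names(df.shapiro2)    # column names are kept as the list names
class(df.shapiro2$a)  # each element is an "htest" object
```

The result is again a named list of test objects, so the extraction step works the same way on it.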

Now, if you want, say, a vector of p-values, extract the p.value element from each test result in the list:

unlist(lapply(df.shapiro, function(x) x$p.value))
        a         b 
0.6275521 0.3370931
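
The same extraction can also be written with vapply, which checks that each result is a single number. This is a hedged sketch rather than part of the original answer; the seed and the names p.values and shapiro.summary are assumptions made for illustration:

```r
# Illustrative data and test results, rebuilt so the block runs on its own
set.seed(1)
df <- data.frame(a = rnorm(100), b = rnorm(100))
df.shapiro <- apply(df, 2, shapiro.test)

# vapply() fails with an error if any element is not a length-1 numeric
p.values <- vapply(df.shapiro, function(x) x$p.value, numeric(1))

# Collect the W statistic and the p-value per column in one small table
shapiro.summary <- data.frame(
  W = vapply(df.shapiro, function(x) x$statistic[["W"]], numeric(1)),
  p.value = p.values
)
shapiro.summary
```

The row names of shapiro.summary are the original column names, so results can be looked up by column.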
