Using shapiro.test on multiple columns in a data frame

伪装坚强ぢ 2020-12-28 10:08

It seems like a pretty simple question, but I can't find the answer.

I have a data frame (let's call it df), containing n=100 columns (C1, <

3 Answers
  •  囚心锁ツ
    2020-12-28 10:37

    To apply a function over the rows or columns of a data frame, use the apply family of functions:

    df <- data.frame(a=rnorm(100), b=rnorm(100))    
    df.shapiro <- apply(df, 2, shapiro.test)
    df.shapiro
    $a
    
        Shapiro-Wilk normality test
    
    data:  newX[, i]
    W = 0.9895, p-value = 0.6276
    
    
    $b
    
        Shapiro-Wilk normality test
    
    data:  newX[, i]
    W = 0.9854, p-value = 0.3371
    

    Note that column names are preserved, and df.shapiro is a named list.
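
    Because the result is a named list, a single test can be pulled out by column name. The following is a minimal sketch under the same simulated data as above; the seed and the name `a.result` are illustrative assumptions, not part of the original answer.

```r
# Simulated data as in the answer; the seed is an arbitrary choice for reproducibility.
set.seed(1)
df <- data.frame(a = rnorm(100), b = rnorm(100))
df.shapiro <- apply(df, 2, shapiro.test)

# The list elements carry the column names of df.
names(df.shapiro)

# Each element is an "htest" object; its fields are accessed with $.
a.result <- df.shapiro[["a"]]
a.result$statistic  # the W statistic
a.result$p.value    # the p-value as a plain number
```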

    Now, if you want, say, a vector of p-values, all you have to do is extract them from the corresponding list elements:

    unlist(lapply(df.shapiro, function(x) x$p.value))
            a         b 
    0.6275521 0.3370931 
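
    As a possible variation, `lapply()` can be called on the data frame directly, since a data frame is a list of columns. This avoids the conversion to a matrix that `apply()` performs, which would turn every column into character if `df` contained a non-numeric column. The sketch below is one way to do this, not the original answer's method; the seed and the names `p.values`, `w.stats` and `shapiro.summary` are illustrative assumptions.

```r
# Simulated data as in the answer; the seed is an arbitrary choice for reproducibility.
set.seed(1)
df <- data.frame(a = rnorm(100), b = rnorm(100))

# A data frame is a list of columns, so lapply() visits each column in turn.
df.shapiro <- lapply(df, shapiro.test)

# vapply() returns a named numeric vector and checks that each result is a single number.
p.values <- vapply(df.shapiro, function(x) x$p.value, numeric(1))
w.stats <- vapply(df.shapiro, function(x) unname(x$statistic), numeric(1))

# Collect both into one summary table, one row per column of df.
shapiro.summary <- data.frame(column = names(df), W = w.stats,
                              p.value = p.values, row.names = NULL)
shapiro.summary
```

    With 100 columns, the summary table is usually easier to scan or filter (for example, `shapiro.summary[shapiro.summary$p.value < 0.05, ]`) than the printed list of test objects.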
    
