Error in shapiro.test : sample size must be between

纵然是瞬间 posted on 2020-12-28 07:44:41

Question


I have a vector in R with 1521298 points that has to be tested for normality. I chose the Shapiro-Wilk test, but the R function shapiro.test() says:

Error in shapiro.test(z_scores) : sample size must be between 3 and 5000

Do you know any other function to test it or how to circumvent this issue?


Answer 1:


The Shapiro-Wilk test in R cannot be run on more than 5000 records.

You can run the test on only the first 5000 samples. If that helps, use code like this:

# head() returns at most the first 5000 values (R indexing starts at 1, not 0)
shapiro.test(head(beaver2$temp, 5000))

Keep in mind that the test will then use only the first 5000 samples of your data.
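
If the vector is ordered in any way (for example by time), its first 5000 values may not be representative of the whole. One possible workaround, sketched here and not part of the original answer, is to test a random subsample of 5000 values instead. The simulated `z_scores` vector below is an assumption standing in for the asker's data, and the seed is arbitrary.

```r
# Simulated stand-in for the asker's vector (assumption: the real data is numeric)
set.seed(42)                       # arbitrary seed, only to make the subsample reproducible
z_scores <- rnorm(1521298)

# Draw 5000 positions at random, without replacement,
# instead of taking the first 5000 values
idx <- sample(length(z_scores), 5000)
sub <- z_scores[idx]

# Shapiro-Wilk test on the random subsample
result <- shapiro.test(sub)
result$p.value
```

Because the subsample is random, a different seed gives a different p-value, so a single result is only indicative.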

On the other hand, if you need to use every record in your sample, use a similar test without that limit, such as the Anderson-Darling normality test. You can also run both and compare the results, as in the script below:

# Clean the workspace
rm(list = ls())

# Install the required package (only needed once)
install.packages("nortest")

# Data to test
ModelData <- beaver2$temp

# Shapiro-Wilk test on at most the first 5000 records
shapiro.test(head(ModelData, 5000))$p.value

# Anderson-Darling normality test on all records
library(nortest)
ad.test(ModelData)$p.value



Answer 2:


You can try the Anderson-Darling normality test, which works for larger sample sizes.

library(nortest)
ad.test(data$variable)
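
As a minimal sketch of the same call on a vector longer than 5000 values: this assumes the nortest package is installed, and the simulated `x` is a hypothetical stand-in for `data$variable`.

```r
# Hypothetical large sample, well above the 5000 limit of shapiro.test()
set.seed(1)
x <- rnorm(100000)

# Run the test only if nortest is available (install.packages("nortest") otherwise)
if (requireNamespace("nortest", quietly = TRUE)) {
  res <- nortest::ad.test(x)
  print(res$statistic)   # Anderson-Darling statistic A
  print(res$p.value)     # p-value of the test
}
```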


Source: https://stackoverflow.com/questions/28217306/error-in-shapiro-test-sample-size-must-be-between
