Question
I would like to run a chi-square test on a set of data. How can I do this using a for loop or sapply?
This is a set of sample data:
n <- 40
set.seed(1)
data <- data.frame(
  v1.1 = sample(c('0','1'), n, replace = TRUE),
  v1.2 = sample(c('0','1'), n, replace = TRUE),
  v1.3 = sample(c('0','1'), n, replace = TRUE),
  v1.4 = sample(c('0','1'), n, replace = TRUE),
  v1.5 = sample(c('0','1'), n, replace = TRUE),
  m1   = sample(c('1','2'), n, replace = TRUE)
)
I would like to test every variable named v1.x against the variable m1.
I want to avoid repeating the call by hand like this:
chisq.test(table(data$v1.1,data$m1))
chisq.test(table(data$v1.2,data$m1))
chisq.test(table(data$v1.3,data$m1))
chisq.test(table(data$v1.4,data$m1))
chisq.test(table(data$v1.5,data$m1))
I found this topic, but for now it is too difficult for me.
Answer 1:
You can just use lapply to loop through the variables.
myTests <- lapply(data[-length(data)], function(x) chisq.test(table(x, data$m1)))
This returns a named list, with each list item named after the variable that was tested.
names(myTests)
[1] "v1.1" "v1.2" "v1.3" "v1.4" "v1.5"
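The call above drops the last column by position, which only works while m1 stays last. As a minimal sketch, assuming the test columns can be recognised by the prefix "v1." (the names vars and loopTests are made up for illustration), the columns can be selected by name instead, and the same result can be built with a plain for loop:

```r
# Rebuild the sample data from the question.
n <- 40
set.seed(1)
data <- data.frame(
  v1.1 = sample(c("0", "1"), n, replace = TRUE),
  v1.2 = sample(c("0", "1"), n, replace = TRUE),
  v1.3 = sample(c("0", "1"), n, replace = TRUE),
  v1.4 = sample(c("0", "1"), n, replace = TRUE),
  v1.5 = sample(c("0", "1"), n, replace = TRUE),
  m1   = sample(c("1", "2"), n, replace = TRUE)
)

# Pick the test columns by name pattern rather than by position.
vars <- grep("^v1\\.", names(data), value = TRUE)

# lapply version: one chisq.test result per selected column.
myTests <- lapply(data[vars], function(x) chisq.test(table(x, data$m1)))

# Equivalent for loop: pre-allocate a named list, then fill it.
loopTests <- vector("list", length(vars))
names(loopTests) <- vars
for (v in vars) {
  loopTests[[v]] <- chisq.test(table(data[[v]], data$m1))
}

names(myTests)
```

Both versions run the same tests; the lapply form is shorter, while the for loop may be easier to read if you are new to the apply family.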
Then access each with myTests[[1]] or myTests[["v1.1"]]. These return
Pearson's Chi-squared test with Yates' continuity correction
data: table(x, data$m1)
X-squared = 0, df = 1, p-value = 1
Then, to pull out components from the individual tests, use names(myTests[[1]]) and str(myTests[[1]]) to inspect the contents. myTests[[1]]$p.value, for example, will pull out the p-value from the first test, and unlist(sapply(myTests, "[", "p.value")) will return a named vector with the p-values from all of the tests.
Source: https://stackoverflow.com/questions/46513021/automate-chi-square-across-columns