Question
I ran a survey using Google Forms and downloaded the responses as a spreadsheet. Unfortunately, for multiple-choice questions that allow multiple answers, the data looks something like this:
Q1  Q2         Q3
1   "A, B ,C"  S
2   "C, D"     T
1   "A, C, E"  U
3   "D"        V
2   "B, E"     Z
I would like to have it in a form similar to the below:
Q1  Q2         Q2A  Q2B  Q2C  Q2D  Q2E  Q3
1   "A, B, C"  1    1    1    0    0    S
2   "C, D"     0    0    1    1    0    T
1   "A, C, E"  1    0    1    0    1    U
3   "D"        0    0    0    1    0    V
2   "B, E"     0    1    0    0    1    Z
Is there a clever way to do this? I have several multiple-choice, multiple-answer questions and more than 250 respondents, so I'd like to be able to do it easily.
Thanks in advance.
Answer 1:
Using the dplyr and tidyr packages:
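library(dplyr)
library(tidyr)

dat %>%
  # split Q2 into up to five columns; fill = "right" pads short answers with NA
  separate(Q2, paste0("v", 1:5), remove = FALSE, fill = "right") %>%
  gather(q2, val, v1:v5) %>%                 # long format: one row per slot
  na.exclude %>%                             # drop the empty slots
  mutate(val = paste0("Q2", val), q2 = 1) %>%
  spread(val, q2) %>%                        # back to wide: one column per option
  select(Q1:Q2, Q2A:Q2E, Q3) %>%
  # options that were not selected come back as NA; turn them into 0
  mutate_at(vars(Q2A:Q2E), function(x) replace(x, is.na(x), 0))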
  Q1      Q2 Q2A Q2B Q2C Q2D Q2E Q3
1  1 A, B ,C   1   1   1   0   0  S
2  1 A, C, E   1   0   1   0   1  U
3  2    B, E   0   1   0   0   1  Z
4  2    C, D   0   0   1   1   0  T
5  3       D   0   0   0   1   0  V
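For what it's worth, the same reshaping can probably be written a bit more compactly with newer tidyr verbs. The following is only a sketch, not part of the original answer: it assumes tidyr 1.0.0 or later (for pivot_wider), rebuilds the example data with Q2 as plain character strings, and introduces three illustrative names of my own (id, choice, flag). The id column is there so that two respondents with identical answers are not merged into one row.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, with Q2 as character (assumption)
dat <- data.frame(
  Q1 = c(1L, 2L, 1L, 3L, 2L),
  Q2 = c("A, B ,C", "C, D", "A, C, E", "D", "B, E"),
  Q3 = c("S", "T", "U", "V", "Z"),
  stringsAsFactors = FALSE
)

wide <- dat %>%
  # id keeps respondents distinct; choice is a working copy of Q2
  mutate(id = row_number(), choice = Q2) %>%
  # one row per selected option, tolerating stray spaces around commas
  separate_rows(choice, sep = "\\s*,\\s*") %>%
  mutate(choice = paste0("Q2", choice), flag = 1L) %>%
  # one indicator column per option; unselected options become 0
  pivot_wider(names_from = choice, values_from = flag,
              values_fill = list(flag = 0L)) %>%
  select(Q1, Q2, Q2A:Q2E, Q3)
```

Because nothing here hard-codes the number of options, the same pipeline could be wrapped in a small function and applied to each multiple-answer question in turn, changing only the column name and the "Q2" prefix.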
Input data:
dat <- structure(list(Q1 = c(1L, 2L, 1L, 3L, 2L), Q2 = structure(c(1L,
4L, 2L, 5L, 3L), .Label = c("A, B ,C", "A, C, E", "B, E", "C, D",
"D"), class = "factor"), Q3 = structure(1:5, .Label = c("S",
"T", "U", "V", "Z"), class = "factor")), .Names = c("Q1", "Q2",
"Q3"), class = "data.frame", row.names = c(NA, -5L))
Source: https://stackoverflow.com/questions/44109108/r-how-to-separate-multiple-choice-multiple-answers-questionnaire-data-that-goo