Reshape/gather function to create dataset ready for multilevel analysis

醉酒成梦 2021-01-14 21:25

I have a big dataset, with 240 cases representing 240 patients. They all have undergone neuropsychological tests and filled in questionnaires. Additionally, their significan

2 answers
  • 2021-01-14 22:12

    If I understand what you want correctly, you can gather everything to a very long form and then reshape back to a slightly wider form:

    library(tidyverse)
    set.seed(47)    # for reproducibility
    
    mydf <- data.frame(id = c(1:5),
                       p1 = c(sample(1:10, 5)),
                       p2 = c(sample(10:20, 5)),
                       p3 = c(sample(20:30, 5)),
                       pr1 = c(sample(1:10, 5)),
                       pr2 = c(sample(10:20, 5)),
                       pr3 = c(sample(20:30, 5)))
    
    mydf_long <- mydf %>% 
        gather(var, val, -id) %>% 
        separate(var, c('couple', 'q'), -1) %>%  # split before the last character (tidyr <= 0.8.1 needed -2 here)
        mutate(q = paste0('q', q)) %>% 
        spread(q, val)
    
    mydf_long
    #>    id couple q1 q2 q3
    #> 1   1      p 10 17 21
    #> 2   1     pr 10 11 24
    #> 3   2      p  4 13 27
    #> 4   2     pr  4 15 20
    #> 5   3      p  7 14 30
    #> 6   3     pr  1 14 29
    #> 7   4      p  6 18 24
    #> 8   4     pr  8 20 30
    #> 9   5      p  9 16 23
    #> 10  5     pr  3 18 25
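
In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. The block below is a minimal sketch of the same reshape with those functions. It assumes column names that follow the letters-then-digit pattern of mydf, and it uses a small hand-written data frame (values invented for illustration) so that the result does not depend on the random seed.

```r
library(tidyr)

# Deterministic stand-in for mydf; the values are made up for illustration.
mydf <- data.frame(id  = 1:3,
                   p1  = c(1, 2, 3),
                   p2  = c(11, 12, 13),
                   p3  = c(21, 22, 23),
                   pr1 = c(4, 5, 6),
                   pr2 = c(14, 15, 16),
                   pr3 = c(24, 25, 26))

# Step 1: go fully long, splitting each column name into the
# respondent prefix ("p" or "pr") and the question number.
very_long <- pivot_longer(mydf,
                          cols = -id,
                          names_to = c("couple", "q"),
                          names_pattern = "^([a-z]+)([0-9]+)$",
                          values_to = "val")

# Step 2: widen again so each question becomes its own column (q1, q2, q3).
mydf_long <- pivot_wider(very_long,
                         names_from = q,
                         values_from = val,
                         names_prefix = "q")

mydf_long
```

The result has one row per id and respondent, with columns id, couple, q1, q2 and q3, which is the layout most multilevel modelling functions expect.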
    
  • 2021-01-14 22:13

    One approach is to use unite and separate from tidyr, together with gather.

    I'm using your mydf data frame since it was provided, but it should be pretty straightforward to make any changes:

    mydf %>% 
      unite(p1:p3, col = `1`, sep = ";") %>% # Combine responses of 'p1' through 'p3'
      unite(pr1:pr3, col = `2`, sep = ";") %>% # Combine responses of 'pr1' through 'pr3'
      gather(couple, value, `1`:`2`) %>% # Form into long data
      separate(value, sep = ";", into = c("q1", "q2", "q3"), convert = TRUE) %>% # Separate and retrieve original answers
      arrange(id)
    

    Which gives you:

       id couple q1 q2 q3
    1   1      1  9 18 25
    2   1      2 10 18 30
    3   2      1  1 11 29
    4   2      2  2 15 29
    5   3      1 10 19 26
    6   3      2  3 19 25
    7   4      1  7 10 23
    8   4      2  1 20 28
    9   5      1  6 16 21
    10  5      2  5 12 26
    

    Our numbers are different since they were all randomly generated with sample.


    Edited per @alistaire's comment: added convert = TRUE to the separate call so that the responses keep class integer.
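
To illustrate what convert = TRUE changes, here is a small sketch. The data frame df and its values are hypothetical and are not taken from the question. Without convert, separate returns the split pieces as character strings; with it, they are converted to a suitable type.

```r
library(tidyr)

# Hypothetical input: three answers per row, joined with ";"
df <- data.frame(id = 1:2,
                 value = c("9;18;25", "10;18;30"),
                 stringsAsFactors = FALSE)

# Default behaviour: the pieces stay character
as_chr <- separate(df, value, into = c("q1", "q2", "q3"), sep = ";")

# With convert = TRUE the pieces are run through type conversion
as_int <- separate(df, value, into = c("q1", "q2", "q3"), sep = ";",
                   convert = TRUE)

class(as_chr$q1)  # "character"
class(as_int$q1)  # "integer"
```

Keeping the responses numeric matters here because a model fitted to the reshaped data would otherwise treat them as text.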
