Applying the same factor levels to multiple variables in an R data frame

孤独总比滥情好 2020-12-05 20:49

I am working with a dataset that includes 16 questions where the response set is identical (Yes, No, Unknown or Missing). I am processing the data using R and I want to turn

2 answers
  • 2020-12-05 20:57

    A base R solution using apply:

    data.frame(apply(df, 2, factor,
                     levels = c(-9, 0, 1),
                     labels = c("Unknown or Missing", "No", "Yes")))
    

    Using sapply

    data.frame(sapply(df, factor,
                      levels = c(-9, 0, 1),
                      labels = c("Unknown or Missing", "No", "Yes")))
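
One thing worth checking with this approach: apply simplifies its result to a matrix, and a matrix cannot hold factors, so the factor class and the requested level order may not survive the round trip through data.frame. The sketch below uses made-up data (the column names q1 to q3 are hypothetical) to inspect the result and, if needed, rebuild the factors from the labels.

```r
# Hypothetical data coded -9 / 0 / 1 (not from the original question)
df <- data.frame(q1 = c(1, 0, -9, 1),
                 q2 = c(0, 0, 1, -9),
                 q3 = c(-9, 1, 1, 0))

labs <- c("Unknown or Missing", "No", "Yes")

# apply() collects the per-column results into a matrix
m <- apply(df, 2, factor, levels = c(-9, 0, 1), labels = labs)
class(m)   # inspect what apply() actually returned

out <- data.frame(m)
str(out)   # check whether the columns are factors, and in which level order

# If ordered-as-specified factors are required, rebuild them from the labels
out[] <- lapply(out, factor, levels = labs)
levels(out$q1)
```

The final lapply step works whether data.frame produced character or factor columns, so it does not depend on the R version's stringsAsFactors default.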
    
  • 2020-12-05 21:16
    df[] <- lapply(df, factor, 
                  levels=c(-9, 0, 1), 
                  labels = c("Unknown or Missing", "No", "Yes"))
    str(df)
    

    This is likely to be faster than apply or sapply, which need data.frame to rebuild and reclass their results. The trick is that using [] on the left-hand side of the assignment preserves the structure of the target (R "knows" its class and dimensions), so there is no need to call data.frame on the list returned by lapply. To convert only selected columns, you could do this:

    # colnums: a vector of the positions (or names) of the columns to convert
    df[colnums] <- lapply(df[colnums], factor,
                          levels = c(-9, 0, 1),
                          labels = c("Unknown or Missing", "No", "Yes"))
     str(df)
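
For a self-contained illustration of the selected-columns variant, here is a minimal sketch on made-up data. The data frame, the id column, and the names q1 to q3 and qcols are assumptions for the example, not part of the original question.

```r
# Hypothetical data: an id column plus three questions coded -9 / 0 / 1
df <- data.frame(id = 1:4,
                 q1 = c(1, 0, -9, 1),
                 q2 = c(0, 0, 1, -9),
                 q3 = c(-9, 1, 1, 0))

# Columns to convert, selected by name here; numeric positions work as well
qcols <- c("q1", "q2", "q3")

# Assigning into df[qcols] keeps df a data frame and leaves id untouched
df[qcols] <- lapply(df[qcols], factor,
                    levels = c(-9, 0, 1),
                    labels = c("Unknown or Missing", "No", "Yes"))

str(df)                        # id is still integer; q1..q3 are factors
levels(df$q1)                  # level order follows the levels argument
table(df$q2, useNA = "ifany")  # quick check of the recoded values
```

Any value outside -9, 0, 1 would become NA under this recoding, which is why the table call asks for NA counts.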
    