R - Convert various dummy/logical variables into a single categorical variable/factor from their name

别那么骄傲 2021-02-04 16:28

My question has strong similarities with this one and this other one, but my dataset is a little bit different and I can't seem to make those solutions work. Please excuse me i

3 answers
  • 2021-02-04 16:40

    Try:

    library(dplyr)
    library(tidyr)
    
    df %>% gather(type, value, -id) %>% na.omit() %>% select(-value) %>% arrange(id)
    

    Which gives:

    #  id       type
    #1  1 conditionA
    #2  2 conditionB
    #3  3 conditionC
    #4  4 conditionD
    #5  5 conditionA
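
    For a reproducible check, here is a minimal sketch. The question's actual data is not shown here, so the `df` below is a hypothetical reconstruction whose values are assumed, chosen only to match the output above. It uses `pivot_longer()`, which tidyr recommends over `gather()` for new code; `values_drop_na = TRUE` plays the role of `na.omit()`.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical data: exactly one non-NA condition column per row
    df <- data.frame(
      id = 1:5,
      conditionA = c(1, NA, NA, NA, 1),
      conditionB = c(NA, 1, NA, NA, NA),
      conditionC = c(NA, NA, 1, NA, NA),
      conditionD = c(NA, NA, NA, 1, NA)
    )

    # Reshape to long form, dropping NA cells, then discard the values
    res <- df %>%
      pivot_longer(-id, names_to = "type", values_to = "value",
                   values_drop_na = TRUE) %>%
      select(-value) %>%
      arrange(id)
    ```

    If a factor is needed rather than a character column, `res$type <- factor(res$type)` should be enough.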
    

    Update

    To handle the case you detailed in the comments, you could do the operation on the desired portion of the data frame and then left_join() the other columns:

    df %>% 
      select(starts_with("condition"), id) %>% 
      gather(type, value, -id) %>% 
      na.omit() %>% 
      select(-value) %>% 
      left_join(., df %>% select(-starts_with("condition")), by = "id") %>%
      arrange(id)
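
    As a possible alternative to the join, the reshaping step can be restricted to the condition columns so that the other columns are carried along automatically. This is only a sketch: `df_extra` and its `age` column are hypothetical, invented here for illustration.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical data with an extra non-condition column (age)
    df_extra <- data.frame(
      id = 1:3,
      conditionA = c(1, NA, NA),
      conditionB = c(NA, 1, 1),
      age = c(34, 51, 27)
    )

    # Only the condition columns are reshaped; id and age are kept as they are
    res_extra <- df_extra %>%
      pivot_longer(starts_with("condition"), names_to = "type",
                   values_to = "value", values_drop_na = TRUE) %>%
      select(-value) %>%
      arrange(id)
    ```

    Note that, as with the join version, a row whose condition columns are all NA would be dropped from the result.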
    
  • 2021-02-04 16:56
    library(tidyr)
    library(dplyr)
    
    df <- df %>%
      gather(type, count, -id)
    # Keep rows with a non-NA count, then drop the count column (column 3)
    df <- df[complete.cases(df), ][, -3]
    df[order(df$id), ]
    #    id       type
    # 1   1 conditionA
    # 7   2 conditionB
    # 13  3 conditionC
    # 19  4 conditionD
    # 5   5 conditionA
    
  • 2021-02-04 17:06

    You can also try:

    colnames(df)[2:5][max.col(!is.na(df[,2:5]))]
    #[1] "conditionA" "conditionB" "conditionC" "conditionD" "conditionA"
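
    To store the result as a factor column, something like the following sketch may help. The data frame is hypothetical (the question's data is not shown here), and selecting columns by the `condition` name prefix instead of by position `2:5` is my own assumption about the layout.

    ```r
    # Hypothetical data: exactly one non-NA condition column per row
    df <- data.frame(
      id = 1:5,
      conditionA = c(1, NA, NA, NA, 1),
      conditionB = c(NA, 1, NA, NA, NA),
      conditionC = c(NA, NA, 1, NA, NA),
      conditionD = c(NA, NA, NA, 1, NA)
    )

    # Pick the condition columns by name rather than by position
    cond_cols <- grep("^condition", names(df), value = TRUE)

    # For each row, index of the column holding a non-NA value
    idx <- max.col(!is.na(df[, cond_cols]), ties.method = "first")

    # Map the index back to the column name and store it as a factor
    df$type <- factor(cond_cols[idx], levels = cond_cols)
    ```

    `ties.method = "first"` makes the result deterministic; the default for `max.col()` breaks ties at random.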
    

    The above works if one and only one column has a value other than NA for each row. If the values of a row can be all NAs, then you can try:

    mat <- !is.na(df[, 2:5])
    colnames(df)[2:5][max.col(mat)*(NA^!rowSums(mat))]
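
    The expression relies on `NA^0` being `1` and `NA^1` being `NA` in R, so rows with no non-NA value get an `NA` index. A more explicit version of the same idea might look like the sketch below; the small data frame is hypothetical and `has_value` is a name introduced here for illustration.

    ```r
    # Hypothetical data: the second row has no condition set at all
    df <- data.frame(
      id = 1:3,
      conditionA = c(1, NA, NA),
      conditionB = c(NA, NA, 1)
    )

    mat <- !is.na(df[, c("conditionA", "conditionB")])

    # TRUE for rows that have at least one non-NA condition
    has_value <- rowSums(mat) > 0

    # Use the column name where a value exists, NA otherwise
    type <- ifelse(has_value,
                   colnames(mat)[max.col(mat, ties.method = "first")],
                   NA_character_)
    ```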
    