Cleaning up factor levels (collapsing multiple levels/labels)

礼貌的吻别 2020-11-22 14:27

What is the most effective (i.e., efficient or appropriate) way to clean up a factor containing multiple levels that need to be collapsed? That is, how to combine two or more fa

10 Answers
  •  感情败类
    2020-11-22 14:54

    You can use the function below to combine (collapse) multiple factor levels:

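    # Rename each level listed in pattern_vector to the entry at the same
    # position in replacement_vector; levels that end up sharing a name merge.
    combofactor <- function(pattern_vector, replacement_vector, data) {
      stopifnot(length(pattern_vector) == length(replacement_vector))
      lv <- levels(data)  # avoid shadowing the levels() function
      for (i in seq_along(pattern_vector)) {
        lv[lv == pattern_vector[i]] <- replacement_vector[i]
      }
      levels(data) <- lv  # duplicated names collapse into a single level
      data
    }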
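
    The function relies on a base R behavior: assigning a levels vector that contains repeated names merges the corresponding levels. A minimal sketch of that mechanism on its own (the factor `f` is a made-up illustration, not part of the answer's data):

    ```r
    # Hypothetical three-level factor used only to show the merge.
    f <- factor(c("a", "b", "c", "a"))

    # Give levels "a" and "b" the same new name; "c" keeps its name.
    levels(f) <- c("x", "x", "c")

    levels(f)        # "x" "c"
    as.character(f)  # "x" "x" "c" "x"
    ```

    Because the merge happens at the level table, the underlying values are relabeled without converting the factor to character and back.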
    

    Example:

    Initialize x

    x <- factor(c(rep("Y",20),rep("N",20),rep("y",20),
    rep("yes",20),rep("Yes",20),rep("No",20)))
    

    Check the structure

    str(x)
    # Factor w/ 6 levels "N","No","y","Y",..: 4 4 4 4 4 4 4 4 4 4 ...
    

    Use the function:

    x_new <- combofactor(c("Y","N","y","yes"),c("Yes","No","Yes","Yes"),x)
    

    Recheck the structure:

    str(x_new)
    # Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
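
    As an alternative to a helper function, base R also accepts a named list on the right-hand side of `levels<-`, where each name is a new level and each element lists the old levels it absorbs. A sketch on the same data (the name `x_alt` is introduced here for illustration; note that any old level left out of the list would become `NA`):

    ```r
    x <- factor(c(rep("Y", 20), rep("N", 20), rep("y", 20),
                  rep("yes", 20), rep("Yes", 20), rep("No", 20)))

    x_alt <- x
    # New level name = old levels to merge into it.
    levels(x_alt) <- list(No  = c("N", "No"),
                          Yes = c("y", "Y", "yes", "Yes"))

    levels(x_alt)  # "No" "Yes"
    table(x_alt)   # No: 40, Yes: 80
    ```

    This form keeps the whole mapping in one place and sets the level order from the order of the list names.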
    
