Question
I would like to combine factor levels that occur fewer than 5 times, for each factor in a dataset containing many different factors. I understand that the fct_lump() function in the forcats package can do this for a single factor, but is there a way to apply fct_lump() to all the factors in my dataset?
Answer 1:
We can use mutate_if to check whether each column is a factor and apply fct_lump to the columns that are:
library(dplyr)
library(forcats)
df1 %>%
  mutate_if(is.factor, fct_lump)
Or, selecting and updating the factor columns with base R (fct_lump itself still comes from forcats):
i1 <- sapply(df1, is.factor)
df1[i1] <- lapply(df1[i1], fct_lump)
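Note that fct_lump called with no extra arguments relies on its default heuristic rather than a fixed count, so it does not by itself enforce the "fewer than 5 times" rule from the question. A minimal sketch of one way to apply that threshold, assuming forcats 0.5.0 or later (which provides fct_lump_min) and dplyr 1.0.0 or later (where across() supersedes mutate_if). The data frame below and its column names are made up purely for illustration:

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: two factor columns and one numeric column.
df1 <- data.frame(
  colour = factor(c(rep("red", 6), rep("blue", 5), rep("green", 2), "pink")),
  shape  = factor(c(rep("circle", 10), rep("square", 3), "star")),
  value  = 1:14
)

# dplyr: lump every level seen fewer than 5 times into "Other",
# separately for each factor column; non-factor columns are untouched.
df2 <- df1 %>%
  mutate(across(where(is.factor), ~ fct_lump_min(.x, min = 5)))

# Base R column selection with the same threshold.
i1 <- sapply(df1, is.factor)
df3 <- df1
df3[i1] <- lapply(df1[i1], fct_lump_min, min = 5)

levels(df2$colour)
levels(df2$shape)
```

Here "green" and "pink" in colour, and "square" and "star" in shape, fall below the threshold and are merged into "Other". The name of the catch-all level can be changed through the other_level argument of fct_lump_min.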
Source: https://stackoverflow.com/questions/60956344/combine-rarely-used-factor-levels-across-all-factors-in-data-frame