Combine Rarely Used Factor Levels Across all Factors in Data Frame

Posted by 筅森魡賤 on 2021-02-17 02:37:31

Question


I would like to combine factor levels that are used fewer than 5 times for each factor in a dataset containing many different factors. While I understand that the fct_lump() function in the forcats package can help me achieve this for a single factor, is there a function where I can apply the fct_lump() function over all the factors in my dataset?


Answer 1:


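mutate_if applies a function to every column that satisfies a predicate, so we can pick out the factor columns with is.factor and lump each one. Called with no arguments, fct_lump falls back on its default heuristic (it lumps the least frequent levels while keeping "Other" the smallest level). To lump levels used fewer than 5 times, use fct_lump_min with min = 5:

library(dplyr)
library(forcats)
# Lump levels with fewer than 5 observations into "Other" in every factor column
df1 %>%
     mutate_if(is.factor, fct_lump_min, min = 5)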

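Or select the factor columns in base R (the lumping itself still comes from forcats):

# Logical vector marking which columns are factors
i1 <- sapply(df1, is.factor)
# Lump levels with fewer than 5 observations in those columns only
df1[i1] <- lapply(df1[i1], fct_lump_min, min = 5)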
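
For illustration, here is a minimal sketch of the same idea with no package dependencies. It is not the forcats implementation: lump_rare is a hypothetical helper and the small df1 below is made-up example data, both assumed only to show what lumping levels with fewer than 5 observations does in every factor column.

```r
# Hypothetical helper (not part of forcats): merge every level that occurs
# fewer than `min` times into a single level named `other`.
lump_rare <- function(f, min = 5, other = "Other") {
  counts <- table(f)                        # observations per level
  rare <- names(counts)[counts < min]       # levels below the threshold
  levels(f)[levels(f) %in% rare] <- other   # duplicate level names are merged
  f
}

# Made-up example data: two factor columns and one numeric column
df1 <- data.frame(
  g1 = factor(c(rep("a", 6), rep("b", 5), "c", "d", "d")),
  g2 = factor(c(rep("x", 10), rep("y", 3), "z")),
  n  = 1:14
)

# Apply the helper to the factor columns only; other columns are untouched
is_fct <- sapply(df1, is.factor)
df1[is_fct] <- lapply(df1[is_fct], lump_rare, min = 5)

table(df1$g1)
table(df1$g2)
```

In this example, levels c and d of g1 and levels y and z of g2 each occur fewer than 5 times, so they are merged into "Other", while the numeric column n is left unchanged.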


Source: https://stackoverflow.com/questions/60956344/combine-rarely-used-factor-levels-across-all-factors-in-data-frame
