Drop columns that take fewer than n values?

忘了有多久 2021-01-27 01:08

Suppose I have a data frame like the following:

df <- data.frame(v1 = sample(1:10, 100, replace = T), v2 = sample(LETTERS, 100, replace = T),
                         


        
2 Answers
  • 2021-01-27 01:26

    Clunky but it works...

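    # one logical per column: TRUE if it has more than 10 distinct values
    x <- as.data.frame(t(apply(df, 2, function(x) length(unique(x))) > 10))

    # keep only those columns; drop = FALSE keeps a data frame even for one column
    df[, names(x)[unlist(x)], drop = FALSE]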
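
A less clunky base R variant, as a minimal sketch. The question's data frame is cut off after `v2`, so the column `v3` and the name `threshold` below are assumptions made only for illustration.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sample data: v1 and v2 follow the question, v3 is assumed
df <- data.frame(
  v1 = sample(1:10, 100, replace = TRUE),     # at most 10 distinct values
  v2 = sample(LETTERS, 100, replace = TRUE),  # up to 26 distinct values
  v3 = seq_len(100)                           # exactly 100 distinct values
)

threshold <- 10  # columns need more than this many distinct values to be kept

# One logical per column: TRUE when the column has more than `threshold` distinct values
keep <- vapply(df, function(col) length(unique(col)) > threshold, logical(1))

# drop = FALSE keeps the result a data frame even if only one column survives
result <- df[, keep, drop = FALSE]
names(result)
```

Working column by column with `vapply()` avoids the character-matrix conversion that `apply()` performs on a data frame. `Filter(function(col) length(unique(col)) > threshold, df)` should give the same columns in one call.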
    
  • 2021-01-27 01:42

    Alternatively, you can use select_if() from dplyr, which takes a predicate function and keeps the columns for which it returns TRUE:

    library(dplyr)
    df %>% select_if(function(col) n_distinct(col) > 10)
    
    #    v2 V3 v4
    #1    T  a 12
    #2    R  k  7
    #3    L  l  1
    # ...
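
In dplyr 1.0.0 and later, the scoped verbs such as `select_if()` are marked as superseded, and the same selection can be written with `select()` plus `where()`. A minimal sketch, assuming dplyr >= 1.0.0 is installed. The question's data frame is cut off after `v2`, so the column `v3` and the name `threshold` are assumptions made only for illustration.

```r
library(dplyr)

set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sample data: v1 and v2 follow the question, v3 is assumed
df <- data.frame(
  v1 = sample(1:10, 100, replace = TRUE),     # at most 10 distinct values
  v2 = sample(LETTERS, 100, replace = TRUE),  # up to 26 distinct values
  v3 = seq_len(100)                           # exactly 100 distinct values
)

threshold <- 10  # columns need more than this many distinct values to be kept

# where() applies the predicate to each column and selects those returning TRUE
result <- df %>%
  select(where(function(col) n_distinct(col) > threshold))

names(result)
```

`select_if()` still works, so the original call does not need to change. The `where()` form simply matches the style the current dplyr documentation recommends.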
    