Get most frequent string from a data frame column

长发绾君心 2021-01-20 02:58

I need to return the n most frequent strings from a data frame with multiple rows. All the values are in a single column called "MissingDates".

2 Answers
  • 2021-01-20 03:20

    Here's another possible solution.

    Set up some data:

    set.seed(5)
    s <- Sys.Date()                  # the sampled dates depend on the day the code is run
    days <- seq(s, s + 10, "day")    # 11 consecutive days starting today
    ss1 <- sample(days, 20, TRUE)
    ss2 <- sample(days, 20, TRUE)
    ls1 <- list(ss1 = ss1, ss2 = ss2)
    

    Define the function:

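    # head() returns at most n entries; [1:n] would pad with NA when n exceeds the number of distinct values
    f <- function(x, n) head(sort(table(x), decreasing = TRUE), n)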
    

    Apply the function over the data:

    lapply(ls1, f, n = 3)
    # $ss1
    # x
    # 2014-09-08 2014-09-09 2014-09-07 
    #          3          3          2 
    # 
    # $ss2
    # x
    # 2014-09-10 2014-09-06 2014-09-07 
    #          4          3          2 
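
A minimal sketch of the same counting idea applied directly to one data frame column, as the question describes. The data frame `df` and its values are made up for illustration, and `top_n` is a hypothetical helper name; only the column name `MissingDates` comes from the question.

```r
# Hypothetical data; only the column name is taken from the question.
df <- data.frame(
  MissingDates = c("2001-09-12", "1999-12-31", "2001-09-12",
                   "1996-12-26", "1999-12-31", "2001-09-12"),
  stringsAsFactors = FALSE
)

# Count each distinct string, sort by count (largest first),
# and keep at most n entries.
top_n <- function(x, n) head(sort(table(x), decreasing = TRUE), n)

res <- top_n(df$MissingDates, 2)
res
```

If only the strings are needed, `names(res)` extracts them, and `as.vector(res)` gives the matching counts.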
    
  • 2021-01-20 03:29

    It seems like you need something like:

    Function:

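    freqfunc <- function(x, n){
      # split each entry on ", ", pool the pieces and count them;
      # sort() is ascending, so tail() keeps the n largest counts (most frequent last)
      tail(sort(table(unlist(strsplit(as.character(x), ", ")))), n)
    }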
    

    Testing on your data set:

    freqfunc(gaps$MissingDates, 5) # Five most frequent dates
    
    ## 1996-12-26 1997-12-26 1998-01-02 1999-12-31 2001-09-12 
    ##          4          4          4          4          4 
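
The `gaps` data frame is not shown in the thread, so here is a small sketch with a made-up stand-in, assuming (as the `strsplit` call suggests) that each row holds several dates in one comma-separated string. It walks through the same steps one at a time; the intermediate variable names are illustrative only.

```r
# Hypothetical stand-in for `gaps`; the values are invented.
gaps <- data.frame(
  MissingDates = c("1996-12-26, 2001-09-12",
                   "2001-09-12, 1999-12-31",
                   "2001-09-12, 1999-12-31"),
  stringsAsFactors = FALSE
)

# Split every row on ", " (gives a list of character vectors).
parts <- strsplit(as.character(gaps$MissingDates), ", ")

# Flatten the list into one character vector of dates.
dates <- unlist(parts)

# Count each date and sort the counts in ascending order.
counts <- sort(table(dates))

# The last n entries are the n most frequent dates.
top2 <- tail(counts, 2)
top2
```

Because the counts are sorted in ascending order, the most frequent date comes last; `rev(top2)` would put it first.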
    