I need to return the n most frequent strings from a data frame with multiple rows. All the values are in a single column called "MissingDates".
Here's another possible solution.
First, set up some data:
set.seed(5)
s <- Sys.Date()
days <- seq(s, s + 10, by = "day")
ss1 <- sample(days, 20, TRUE)
ss2 <- sample(days, 20, TRUE)
ls1 <- list(ss1 = ss1, ss2 = ss2)
Define the function:
f <- function(x, n) sort(table(x), decreasing = TRUE)[1:n]
Apply the function over the data:
lapply(ls1, f, n = 3)
# $ss1
# x
# 2014-09-08 2014-09-09 2014-09-07
#          3          3          2
#
# $ss2
# x
# 2014-09-10 2014-09-06 2014-09-07
#          4          3          2
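One thing to watch with the `[1:n]` indexing in `f`: if `n` is larger than the number of distinct values, the result is padded with `NA` entries. A minimal sketch of a variant that avoids this, assuming you would rather get fewer than `n` results than padding (`f_safe` and the toy vector `x` are names made up for this illustration):

```r
# Hypothetical variant of f(): head() returns at most n entries,
# so asking for more values than exist does not add NA padding.
f_safe <- function(x, n) head(sort(table(x), decreasing = TRUE), n)

# Toy input: "a" occurs 3 times, "b" twice, "c" once.
x <- c("a", "b", "a", "c", "a", "b")

f_safe(x, 2)   # the two most frequent values, "a" then "b"
f_safe(x, 10)  # only the three distinct values, no NA entries
```

It drops into the same call as before, e.g. `lapply(ls1, f_safe, n = 3)`.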
It seems like you need something like:
Function
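freqfunc <- function(x, n){
  # Split each entry on ", ", pool the pieces, count each date,
  # and keep the n largest counts (returned in ascending order)
  tail(sort(table(unlist(strsplit(as.character(x), ", ")))), n)
}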
Testing on your data set
freqfunc(gaps$MissingDates, 5) # Five most frequent dates
## 1996-12-26 1997-12-26 1998-01-02 1999-12-31 2001-09-12
##          4          4          4          4          4
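Since the `gaps` data isn't shown here, below is a small self-contained sketch of the same split-and-count idea on made-up data. `gaps_demo` and its contents are purely hypothetical; the only assumption carried over from the question is that each row of `MissingDates` holds one or more dates separated by `", "`.

```r
# Hypothetical stand-in for the question's `gaps` data frame.
gaps_demo <- data.frame(
  MissingDates = c("1996-12-26, 1997-12-26",
                   "1996-12-26",
                   "1996-12-26, 1998-01-02, 1997-12-26"),
  stringsAsFactors = FALSE
)

# Split every row on ", ", pool the pieces, and count each date.
dates  <- unlist(strsplit(as.character(gaps_demo$MissingDates), ", "))
counts <- sort(table(dates), decreasing = TRUE)

# Two most frequent dates, most frequent first.
top2 <- head(counts, 2)
top2
```

Sorting with `decreasing = TRUE` and taking `head()` puts the most frequent date first, whereas `tail()` on an ascending sort puts it last. Either way you get the same set of dates, as long as there are no ties at the cut-off.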