I have a well-balanced panel data set which contains NA observations. I will be using LOCF, and would like to know how many consecutive NA's are in each panel, before carry
This will do it:
data[, max(0L, with(rle(is.na(x)), lengths[values])), by = id]  # 0L covers panels with no NA's
I just ran rle to find all consecutive NA's and picked the max length.
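To see what rle returns here, a minimal sketch on a single made-up vector may help (the values of x below are invented for illustration and are not from the question's data):

```r
# Toy vector, invented for illustration
x <- c(1, NA, NA, 3, NA, NA, NA, 4)

r <- rle(is.na(x))
r$lengths  # length of each run of identical is.na() values
r$values   # TRUE where the run consists of NA's

# Keep only the NA runs and take the longest one;
# the leading 0L covers a vector that has no NA's at all
longest_na <- max(0L, r$lengths[r$values])
longest_na
```

The by = id in the data.table call simply repeats this computation once per panel.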
Here's a rather convoluted answer to the comment question of recovering the date ranges for the above max:
data[, {
  r = rle(is.na(x))
  na.len = r$lengths * r$values      # run lengths, with non-NA runs set to 0
  n = which.max(na.len)              # index in rle of the longest NA sequence
  end = cumsum(r$lengths)[n]         # last row of that sequence
  start = end - r$lengths[n] + 1L    # first row of that sequence
  # if max.na is 0 the panel has no NA's and start/end are not meaningful
  list(start = date[start], end = date[end], max.na = na.len[n])
}, by = id]
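As a quick sanity check, the same idea can be tried on a small made-up panel. This is only a sketch: the toy data, the helper name longest_na_run, and the output column names are assumptions for illustration, while id, date and x follow the column names used in the answer.

```r
library(data.table)

# Hypothetical helper: position and length of the longest NA run in a vector
longest_na_run <- function(x) {
  r   <- rle(is.na(x))
  len <- r$lengths * r$values          # NA-run lengths; non-NA runs become 0
  n   <- which.max(len)                # first longest NA run
  end <- cumsum(r$lengths)[n]          # last position of that run
  # if n is 0 there is no NA run, and start/end are not meaningful
  list(start = end - r$lengths[n] + 1L, end = end, n = len[n])
}

# Made-up panel: two ids observed on the same five dates
data <- data.table(
  id   = rep(c("a", "b"), each = 5),
  date = rep(as.Date("2020-01-01") + 0:4, times = 2),
  x    = c(1, NA, NA, 4, NA,
           NA, 2, NA, NA, NA)
)

res <- data[, {
  run <- longest_na_run(x)
  list(from = date[run$start], to = date[run$end], max_na = run$n)
}, by = id]
print(res)
```

For panel "a" the longest gap spans rows 2 to 3, and for panel "b" rows 3 to 5, so the reported date ranges should line up with those rows.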