How to remove more than 2 consecutive NA's in a column?

Posted by 梦想的初衷 on 2019-12-07 08:12:34

Question


I am new to R. My data frame has two columns, col1 ("Timestamp") and col2 ("values"). I need to remove the rows that belong to a run of more than 2 consecutive NA values in col2. My data frame looks like this:

Timestamp  | values  
-----------|--------
2011-01-02 |  2  
2011-01-03 |  3  
2011-01-04 |  NA  
2011-01-05 |  1  
2011-01-06 |  NA  
2011-01-07 |  NA    
2011-01-08 |  8  
2011-01-09 |  6  
2011-01-10 |  NA  
2011-01-11 |  NA  
2011-01-12 |  NA  
2011-01-13 |  2  

I would like to remove the rows that form a run of more than 2 consecutive NAs in the second column. Expected output:

Timestamp  | values  
-----------|--------
2011-01-02 |  2  
2011-01-03 |  3  
2011-01-04 |  NA  
2011-01-05 |  1  
2011-01-06 |  NA  
2011-01-07 |  NA    
2011-01-08 |  8  
2011-01-09 |  6 
2011-01-13 |  2  

I'm looking for a solution. Thanks in advance.


Answer 1:


You can use the run length encoding function rle. I assume that the data is already sorted by date.

r <- rle(is.na(df$values))                      # check runs of NA in value column
df[!rep(r$values & r$lengths > 2, r$lengths),]  # remove runs of >2 length
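To make the two lines above easier to follow, here is a minimal reproducible sketch on the question's data. The construction of `df` and the intermediate name `drop` are assumptions added for illustration; only the `rle` logic comes from the answer.

```r
# Rebuild the question's data (assumed construction; the answer only refers to `df`)
df <- data.frame(
  Timestamp = as.Date("2011-01-02") + 0:11,
  values    = c(2, 3, NA, 1, NA, NA, 8, 6, NA, NA, NA, 2)
)

# Run-length encode the NA indicator of the value column
r <- rle(is.na(df$values))
# r$values : FALSE TRUE FALSE TRUE FALSE TRUE FALSE
# r$lengths:     2    1     1    2     2    3     1

# Expand the per-run decision back to one flag per row:
# TRUE only for rows inside an NA run longer than 2
drop <- rep(r$values & r$lengths > 2, r$lengths)

result <- df[!drop, ]
```

Only the run of three NAs (2011-01-10 to 2011-01-12) is flagged, so `result` keeps the single NA and the pair of NAs, as in the expected output.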



Answer 2:


Here is another option using rleid from data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)). Grouped by the run-length id of is.na(values), we get the row index (.I) of every group except those where the number of rows is greater than 2 (.N > 2) and (&) all the 'values' are NA. Extract the index ($V1) and use it to subset the rows of the original dataset.

library(data.table)
setDT(df1)[df1[, .I[!(.N >2 & all(is.na(values)))], rleid(is.na(values))]$V1]
#    Timestamp values
#1: 2011-01-02      2
#2: 2011-01-03      3
#3: 2011-01-04     NA
#4: 2011-01-05      1
#5: 2011-01-06     NA
#6: 2011-01-07     NA
#7: 2011-01-08      8
#8: 2011-01-09      6
#9: 2011-01-13      2
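The one-liner above can be unpacked into separate steps, which may help when adapting it. This is only a sketch: the construction of `df1` and the helper names `run` and `keep` are assumptions for illustration, and it requires the data.table package to be installed.

```r
library(data.table)

# Rebuild the question's data as a data.table (assumed construction)
df1 <- data.table(
  Timestamp = as.Date("2011-01-02") + 0:11,
  values    = c(2, 3, NA, 1, NA, NA, 8, 6, NA, NA, NA, 2)
)

# Step 1: label each run of consecutive NA / non-NA rows
# (adds the helper column `run` to df1 by reference)
df1[, run := rleid(is.na(values))]
# run: 1 1 2 3 4 4 5 5 6 6 6 7

# Step 2: per run, return its row indices unless the run is all NA and longer than 2
keep <- df1[, .I[!(.N > 2 & all(is.na(values)))], by = run]$V1

# Step 3: subset the original rows, leaving out the helper column
result <- df1[keep, .(Timestamp, values)]
```

Here run 6 is the only group that is all NA with more than 2 rows, so its indices (9, 10, 11) are the only ones missing from `keep`.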



Answer 3:


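You can use this one-liner. Note that duplicated drops every repeated value in the column, not only runs of more than 2 consecutive NAs, so on the question's data it does not reproduce the expected output: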

Df[!duplicated(Df$column),]


Source: https://stackoverflow.com/questions/42668059/how-to-remove-more-than-2-consecutive-nas-in-a-column
