Removing time series with only zero values from a data frame

感情败类 2021-01-13 20:35

I have a data frame with multiple time series identified by unique ids. I would like to remove any time series that has only 0 values.

The data frame looks as fo

3 Answers
  • 2021-01-13 21:06

    If dat is a data.table, this is easy to write and read:

    dat[, .SD[any(value != 0)], by = id]
    

    .SD stands for Subset of Data. This answer explains .SD very well.
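To make the grouping step concrete, here is a minimal sketch on made-up data. The table `dat` and its contents are hypothetical (the question's sample data is cut off), and it assumes the data.table package is installed.

```r
# Assumes the data.table package is installed; the data below are hypothetical.
library(data.table)

dat <- data.table(
  id    = rep(c("a", "b", "c"), each = 3),
  value = c(0, 2, 0,  0, 0, 0,  5, 0, 1)
)

# Within each id group, .SD holds that group's rows (without the grouping column).
# any(value != 0) is a single TRUE/FALSE per group, so .SD[TRUE] keeps the
# whole group and .SD[FALSE] returns zero rows, which drops the group.
kept <- dat[, .SD[any(value != 0)], by = id]
print(kept)
```

In this sketch, the all-zero series "b" should be absent from `kept`, while "a" and "c" keep all of their rows, zeros included.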

    Picking up on Gabor's nice use of ave, but without repeating the same variable name (DF) three times, which can be a source of typo bugs when you have many long or similar variable names, try:

    dat[ave(value != 0, id, FUN = any)]
    

    The difference in speed between those two may depend on several factors, including (i) the number of groups, (ii) the size of each group, and (iii) the number of columns in the real dat.

  • 2021-01-13 21:06

    An easy plyr solution would be:

    ddply(mydat, "id", function(x) if (all(x$value == 0)) NULL else x)
    

    This seems to work OK, but there may be a faster solution with data.table.

  • 2021-01-13 21:24

    Try this. No packages are used.

    DF[ ave(DF$value != 0, DF$id, FUN = any), ]
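As a rough illustration of how the `ave` call works, the sketch below runs it on a small made-up data frame. The column names `id`, `time` and `value` and the sample values are assumptions, since the question's example data is cut off.

```r
# Hypothetical data: three series, of which id "b" contains only zeros.
DF <- data.frame(
  id    = rep(c("a", "b", "c"), each = 3),
  time  = rep(1:3, times = 3),
  value = c(0, 2, 0,  0, 0, 0,  5, 0, 1)
)

# ave() applies FUN within each id group and returns a vector as long as the
# input, so every row gets TRUE if any value in its group is non-zero.
keep <- ave(DF$value != 0, DF$id, FUN = any)

# Logical row indexing keeps whole series, including their zero rows.
result <- DF[keep, ]
print(result)
```

Because `keep` is computed per group but indexed per row, series "a" and "c" are retained in full and only the all-zero series "b" is removed.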
    