Subset time series so that selected rows differ by a certain minimum time

余生分开走 2021-01-13 17:47

I'm using a data.table in R to store a time series. I want to return a subset such that successive rows for the selected times are at least N seconds apart from the last ro

1 answer
  • 2021-01-13 18:45

    Here are a couple ways to use rolling joins to find the set of rows, w, in your subset:

    t_plus = 5
    
    # one join per row visited
    w   <- c()
    nxt <- 1L
    while(!is.na(nxt)){ 
      w   <- c(w, nxt) 
      nxt <- x[.(t[nxt]+t_plus), on=.(t), roll=-Inf, which=TRUE]
    }
    
    # join once on all rows: w0[i] is the first row at least t_plus after row i
    w0  <- x[.(t+t_plus), on=.(t), roll=-Inf, which=TRUE]
    
    w   <- c()
    nxt <- 1L
    while (!is.na(nxt)){ 
      w   <- c(w, nxt)
      nxt <- w0[nxt] 
    }
    

    Then you can subset like x[w].
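
To make the steps concrete, here is a minimal sketch of the "join once" variant on made-up data. The table `x`, its times, and the gap of 5 seconds are assumptions for illustration only; the sketch also assumes `t` is sorted and has no duplicate values.

```r
library(data.table)

# Hypothetical toy data: sorted, unique times in seconds
x <- data.table(t = c(0, 1, 3, 5, 6, 10, 11, 17))
t_plus <- 5

# For each row, the index of the first row with t >= t[row] + t_plus
# (NA when no such row exists)
w0 <- x[.(t + t_plus), on = .(t), roll = -Inf, which = TRUE]

# Follow the chain of "next row" indices, starting from the first row
w   <- c()
nxt <- 1L
while (!is.na(nxt)) {
  w   <- c(w, nxt)
  nxt <- w0[nxt]
}

x[w]  # the selected rows
```

On this data the chain should visit rows 1, 4, 6 and 8, that is, times 0, 5, 10 and 17.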


    Comments

    In principle, there could be other subsets that satisfy the OP's condition "at least 5 seconds apart"; this is just the one found by iterating from the first row forward.
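
As a rough illustration of that point, the sketch below checks two different row selections against the same made-up times. The helper `is_valid_subset` and the data are hypothetical and not part of the original answer.

```r
# Hypothetical helper: TRUE if consecutive selected times are >= t_plus apart
is_valid_subset <- function(times, w, t_plus) {
  all(diff(times[w]) >= t_plus)
}

times <- c(0, 1, 3, 5, 6, 10, 11, 17)  # made-up, sorted times in seconds

w_first  <- c(1L, 4L, 6L, 8L)  # chain started at row 1
w_second <- c(2L, 5L, 7L, 8L)  # chain started at row 2

ok_first  <- is_valid_subset(times, w_first, 5)
ok_second <- is_valid_subset(times, w_second, 5)
```

Both selections satisfy the 5-second condition here, so the result depends on which row the iteration starts from.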

    The second way is based on @DavidArenburg's answer to the Q&A Henrik linked above. Although the question seems the same, I couldn't get that approach to work fully here.

    Generally, it's a bad idea to grow things in a loop in R (like I'm doing with w here). If you're running into performance problems, that might be a good area to improve in this code.
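
One possible way to avoid growing `w`, sketched under the assumption that `w0` comes from the "join once" step and that `t_plus` is positive (so every entry of `w0` points to a later row or is `NA`). The function name `walk_chain` and the sample `w0` are made up for illustration.

```r
# Hypothetical helper: follow next-row indices in w0 without growing a vector
walk_chain <- function(w0, start = 1L) {
  w   <- integer(length(w0))  # a chain can never be longer than w0
  n   <- 0L
  nxt <- start
  while (!is.na(nxt)) {
    n    <- n + 1L
    w[n] <- nxt
    nxt  <- w0[nxt]
  }
  w[seq_len(n)]  # drop the unused tail
}

# Made-up next-row indices for an 8-row table
w0 <- c(4L, 5L, 6L, 6L, 7L, 8L, 8L, NA)
w  <- walk_chain(w0)
```

The result vector is allocated once at its maximum possible length and trimmed at the end, instead of being copied on every iteration.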
