Conditionally calculate time differences between rows in R

Submitted by 人走茶凉 on 2019-12-11 13:09:20

Question


I'm trying to calculate the time difference between each row and the most recent row whose value in another column meets some criterion.

Reading in some data:

my_data <- data.frame(criteria = c("some text", "some more text", " ", " ", "more text", " "),
                  timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47", "2015-07-30 15:54:48", "2015-07-30 15:55:48", "2015-07-30 15:56:48", "2015-07-30 15:57:49")))

        criteria           timestamp
1      some text 2015-07-30 15:53:15
2 some more text 2015-07-30 15:53:47
3                2015-07-30 15:54:48
4                2015-07-30 15:55:48
5      more text 2015-07-30 15:56:48
6                2015-07-30 15:57:49

I want to get the time difference (in minutes) between every row and the last row that wasn't blank in the criteria column. Therefore, I want:

        criteria           timestamp time_diff
1      some text 2015-07-30 15:53:15         0
2 some more text 2015-07-30 15:53:47         0
3                2015-07-30 15:54:48         1
4                2015-07-30 15:55:48         2
5      more text 2015-07-30 15:56:48         0
6                2015-07-30 15:57:49         1

So far, I've built the code to recognize where the "0's" should be - I just need the code to fill in the time differences. Here's my code:

my_data$time_diff <- ifelse(my_data$criteria != " ", # Here's our statement (blank rows hold a single space)
  0,  # Here's what happens if statement is TRUE
  NA  # NEED CODE HERE: what should happen if statement is FALSE
  )

I have a feeling that this job may be better performed by something that isn't an ifelse statement, but I'm relatively new to R.

I've found questions on here where individuals tried to get time differences between neighboring rows (e.g. here and here), but have yet to find someone trying to deal with this kind of situation.

The closest question I've found to mine is this one, but those data differ from mine in how the individual wants to process them (at least from my vantage point).

edit: capitalized title.


Answer 1:


Completing the answer with alexis_laz's masterful expression:

my_data <- data.frame(criteria = c("some text", "some more text", " ", " ", "more text", " "),
                      timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47", "2015-07-30 15:54:48", "2015-07-30 15:55:48", "2015-07-30 15:56:48", "2015-07-30 15:57:49")))

my_data$time_diff <- 
  my_data$timestamp - 
  my_data[cummax((my_data$criteria != " ") * seq_len(nrow(my_data))), 'timestamp']

my_data

        criteria           timestamp time_diff
1      some text 2015-07-30 15:53:15    0 secs
2 some more text 2015-07-30 15:53:47    0 secs
3                2015-07-30 15:54:48   61 secs
4                2015-07-30 15:55:48  121 secs
5      more text 2015-07-30 15:56:48    0 secs
6                2015-07-30 15:57:49   61 secs
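
The one-liner above is compact, so here is a step-by-step sketch of what it appears to do, plus a conversion to the whole minutes the question asked for. The intermediate names (has_text, row_or_zero, anchor_row) are made up for illustration, and the tz = "UTC" argument, the floor() rounding, and the NA handling for leading blank rows are assumptions of this sketch rather than part of the original answer.

```r
# Rebuild the example data; a blank criteria value is a single space, as in the question.
# tz = "UTC" is an assumption added here so the result does not depend on the local time zone.
my_data <- data.frame(
  criteria = c("some text", "some more text", " ", " ", "more text", " "),
  timestamp = as.POSIXct(c("2015-07-30 15:53:15", "2015-07-30 15:53:47",
                           "2015-07-30 15:54:48", "2015-07-30 15:55:48",
                           "2015-07-30 15:56:48", "2015-07-30 15:57:49"),
                         tz = "UTC"),
  stringsAsFactors = FALSE
)

# Step 1: TRUE where the criteria column is filled in
has_text <- my_data$criteria != " "

# Step 2: the row number where filled in, 0 where blank
row_or_zero <- has_text * seq_len(nrow(my_data))   # 1 2 0 0 5 0

# Step 3: the running maximum carries the last filled-in row number forward
anchor_row <- cummax(row_or_zero)                  # 1 2 2 2 5 5

# Assumption: if the data started with blank rows, anchor_row would hold 0 there,
# and indexing by 0 silently drops elements. Mapping 0 to NA yields NA instead.
anchor_row[anchor_row == 0] <- NA

# Step 4: difference to the anchor row, expressed in whole minutes
my_data$time_diff <- floor(as.numeric(
  difftime(my_data$timestamp, my_data$timestamp[anchor_row], units = "mins")
))

my_data$time_diff   # 0 0 1 2 0 1
```

Passing units = "mins" to difftime avoids relying on the automatically chosen unit, which is seconds in the output above because the smallest difference is 0. Drop the floor() call if fractional minutes are preferred.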


Source: https://stackoverflow.com/questions/36989976/conditionally-calculate-time-differences-between-rows-in-r
