问题
I have data on 'Start' and 'End' time for different jobs, grouped by 'owner':
Data <- data.frame(
job = c(1, 2, 3, 4, 5),
owner = c("name1", "name2", "name1", "name1", "name2"),
Start = as.POSIXct(c("2015-01-01 15:00:00", "2015-01-01 15:01:00", "2015-01-01 15:13:00", "2015-01-01 15:20:00", "2015-01-01 15:39:02"), format="%Y-%m-%d %H:%M:%S"),
End = as.POSIXct(c("2015-01-01 15:11:11", "2015-01-01 15:17:21", "2015-01-01 15:17:00", "2015-01-01 15:31:21", "2015-01-01 15:40:11"), format="%Y-%m-%d %H:%M:%S")
)
For each owner, I want to calculate the idle time between the jobs for each owner, i.e. the difference between the 'End' time of one job and the 'Start' time of the next job.
How do I use difftime()
to calculate this time difference between specific rows and times in different columns?
The result should look something like this:
job, owner, idletime
1, name1, NA
2, name2, NA
3, name1, 1.816667 # End of row 1 minus Start of row 3
4, name1, 3.0 # End of row 3 minus Start of row 4
...
回答1:
Here's a possible solution using data.table
library(data.table) # v 1.9.5+
setDT(Data)[, idletime := difftime(Start, shift(End), units = "mins"), by = owner]
# job owner Start End idletime
# 1: 1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11 NA mins
# 2: 2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21 NA mins
# 3: 3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00 1.816667 mins
# 4: 4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21 3.000000 mins
# 5: 5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins
Or using dplyr
library(dplyr)
Data %>%
group_by(owner) %>%
mutate(idletime = difftime(Start, lag(End), units = "mins"))
# Source: local data frame [5 x 5]
# Groups: owner
#
# job owner Start End idletime
# 1 1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11 NA mins
# 2 2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21 NA mins
# 3 3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00 1.816667 mins
# 4 4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21 3.000000 mins
# 5 5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins
回答2:
If we are using base R
, ave
would be one option. We get the lag
of the 'End' grouped by 'owner' using ave
, use that as the second argument in difftime
to create the 'idtime'.
Data$idtime <- with(Data, difftime(Start, ave(End, owner,FUN=lag), units='mins'))
Data
# job owner Start End idtime
#1 1 name1 2015-01-01 15:00:00 2015-01-01 15:11:11 NA mins
#2 2 name2 2015-01-01 15:01:00 2015-01-01 15:17:21 NA mins
#3 3 name1 2015-01-01 15:13:00 2015-01-01 15:17:00 1.816667 mins
#4 4 name1 2015-01-01 15:20:00 2015-01-01 15:31:21 3.000000 mins
#5 5 name2 2015-01-01 15:39:02 2015-01-01 15:40:11 21.683333 mins
NOTE: I named the column name as 'idtime' to keep the code in a single line :-)
回答3:
library(dplyr)
Data <- data.frame(
job = c(1, 2, 3, 4, 5),
owner = c("name1", "name2", "name1", "name1", "name2"),
Start = as.POSIXct(c("2015-01-01 15:00:00", "2015-01-01 15:01:00", "2015-01-01 15:13:00", "2015-01-01 15:20:00", "2015-01-01 15:39:02"), format="%Y-%m-%d %H:%M:%S"),
End = as.POSIXct(c("2015-01-01 15:11:11", "2015-01-01 15:17:21", "2015-01-01 15:17:00", "2015-01-01 15:31:21", "2015-01-01 15:40:11"), format="%Y-%m-%d %H:%M:%S")
)
Data %>%
group_by(owner) %>%
arrange(Start) %>%
mutate(lagEnd = lag(End),
idletime = difftime(Start,lagEnd, units="mins")) %>%
ungroup %>%
arrange(job) %>%
select(job,owner,idletime)
# job owner idletime
# 1 1 name1 NA mins
# 2 2 name2 NA mins
# 3 3 name1 1.816667 mins
# 4 4 name1 3.000000 mins
# 5 5 name2 21.683333 mins
来源:https://stackoverflow.com/questions/32199444/calculate-time-difference-difftime-between-columns-of-different-rows