Grouped mean of difftime fails in data.table

爱⌒轻易说出口 submitted on 2020-08-02 08:39:15

Question


Preface:

I have a column in a data.table of difftime values with units set to days. I am trying to create another data.table summarizing the values with

dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]

When printing the new data.table, I see values such as

1.925988e+00 days
1.143287e+00 days
1.453975e+01 days

I would like to limit the number of decimal places for this column only (i.e. without changing options(), unless that can be done specifically for difftime values). When I try to do this by modifying the call above, e.g.

dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = Group]

I am left with NA values, with both the base round() and format() functions returning the warning:

In mean(DiffTime) : argument is not numeric or logical.

Oddly enough, if I perform the same operation on a numeric field, this runs with no problems. Also, if I run the two separate lines of code, I can accomplish what I am looking to do:

dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]
dt2[, AvgTime := round(AvgTime, 2)]

Reproducible Example:

library(data.table)
set.seed(1)
dt <- data.table(
  Date1 = 
    sample(seq(as.Date('2017/10/01'), 
               as.Date('2017/10/31'), 
               by="days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Date2 = 
    sample(seq(as.Date('2017/10/01'), 
               as.Date('2017/10/31'), 
               by="days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Num1 =
    abs(rnorm(24)) * 10,
  Group = 
    rep(LETTERS[1:4], each=6)
)
dt[, DiffTime := abs(difftime(Date1, Date2, units = 'days'))]

# Warnings/NA:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = .(Group)]

# Works when numeric/not difftime:
class(dt$Num1) # "numeric"
dt2 <- dt[, .(AvgNum = round(mean(Num1), 2)), by = .(Group)]

# Works, but takes an additional step:
dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = .(Group)]
dt2[, AvgTime := round(AvgTime, 2)]

# Works with base::mean:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(base::mean(DiffTime), 2)), by = .(Group)]

Question:

Why am I not able to complete this conversion (rounding of the mean) in one step when the class is difftime? Am I missing something in my execution? Is this some sort of bug in data.table where it can't properly handle the difftime?

Issue added on GitHub.

Update: Issue appears to be cleared after updating from data.table version 1.10.4 to 1.12.8.


Answer 1:


This was fixed by update #3567 on 2019/05/15, which shipped in data.table version 1.12.4, released 2019/10/03.
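To see whether a given installation already includes that fix, the version can be compared against 1.12.4. The following is a minimal sketch in base R; the helper name has_fix is made up for illustration and is not part of data.table.

```r
# Hypothetical helper (not part of data.table): TRUE when a version
# is at least 1.12.4, the release named in the answer above.
fixed_in <- package_version("1.12.4")

has_fix <- function(installed) {
  as.package_version(installed) >= fixed_in
}

has_fix("1.10.4")  # FALSE: the version the question started on
has_fix("1.12.8")  # TRUE: the version the question was updated to

# Check the locally installed copy, if there is one.
if (requireNamespace("data.table", quietly = TRUE)) {
  message("data.table includes the fix: ",
          has_fix(utils::packageVersion("data.table")))
}
```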




Answer 2:


This might be a little late but if you really want it to work you can do:

as.numeric(round(as.difftime(difftime(DATE1, DATE2)), 0))
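As a small illustration of the idea in base R only: round() keeps the difftime class and its units, and as.numeric() then drops them. The timestamps below are invented for the example and are not from the question's data.

```r
# Illustrative timestamps (assumed values, not from the question).
t1 <- as.POSIXct("2017-10-15 18:00:00", tz = "UTC")
t2 <- as.POSIXct("2017-10-01 00:00:00", tz = "UTC")

gap <- difftime(t1, t2, units = "days")  # 14.75 days
rounded <- round(gap, 0)                 # still a difftime, units kept

class(rounded)       # "difftime"
units(rounded)       # "days"
as.numeric(rounded)  # 15
```

Converting to numeric discards the unit label, so this fits best when the unit is recorded elsewhere, for example in the column name.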



Answer 3:


I recently ran into the same problem using data.table_1.11.8. One quick workaround is to use base::mean instead of mean.
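A related option, offered here as an assumption rather than something taken from the thread, is to convert the difftime to a plain number of days before averaging, so that the grouped expression only ever sees numeric input. The helper name avg_days is hypothetical, and the result is a bare number with no units attached.

```r
# Hypothetical helper: mean of a difftime vector as a plain number of days.
avg_days <- function(x, digits = 2) {
  stopifnot(inherits(x, "difftime"))
  # as.numeric(..., units = "days") converts from whatever unit x uses.
  round(base::mean(as.numeric(x, units = "days")), digits)
}

x <- as.difftime(c(1.5, 2.25, 3), units = "days")
avg_days(x)  # 2.25

# Grouped use, only run if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- data.table(
    Group    = c("A", "A", "B"),
    DiffTime = as.difftime(c(1.5, 3, 2), units = "days")
  )
  dt[, .(AvgDays = avg_days(DiffTime)), by = Group]
}
```

Unlike base::mean on the difftime itself, this returns a numeric column, so the unit has to be carried in the column name.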



Source: https://stackoverflow.com/questions/47270913/grouped-mean-of-difftime-fails-in-data-table
