R data.table NA type consistency

忘掉有多难 2021-01-27 01:15
library(data.table)
dt = data.table(x = c(1,1,2,2,2,2,3,3,3,3))
dt[, y := if(.N > 2) .N else NA, by = x] # fails: plain NA is logical, but .N is integer
dt[, y := if(.N > 2) .N else NA_integer_, by = x] # works: both branches are integer
1 Answer
  • 2021-01-27 01:58

    OP's first question: Is there a way to tell data.table to ignore that and coerce all NAs to whatever type keeps the column consistent?

    No. You'll see a similar error without the assignment:

    dt[, if(.N > 2) .N else NA, by = x]
    #  Error in `[.data.table`(dt, , if (.N > 2) .N else NA, by = x) : 
    # Column 1 of result for group 2 is type 'integer' but expecting type 'logical'. Column types must be consistent for each group.
    

    In my opinion, the same "Column types must be consistent for each group." message should be shown in your case as well.
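
To make the type mismatch concrete, here is a minimal sketch (assuming the same `dt` as in the question) that checks the types involved and uses the typed constant `NA_integer_`, so that every group returns an integer. Base R offers `NA_real_` and `NA_character_` for the other atomic types.

```r
library(data.table)

dt <- data.table(x = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3))

# A bare NA is logical; the typed constant matches the integer .N.
na_plain_type <- typeof(NA)           # "logical"
na_typed_type <- typeof(NA_integer_)  # "integer"

# Both branches are now integer, so the per-group results are consistent.
dt[, y := if (.N > 2) .N else NA_integer_, by = x]

# The new column is integer, with NA for the group that has only two rows.
y_type <- typeof(dt$y)
```

The first group (`x == 1`) has only two rows, so its `y` values are `NA`; the other two groups have four rows each and get `4L`.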


    OP's second question: BTW, what NA type should I use for Date/IDate/ITime?

    For IDate et al, I always subset by NA_integer_, which seems to give a length-one NA slice, e.g., as.IDate(Sys.Date())[NA_integer_]. I don't know if that's what one should do, but I don't know of a better idea. An illustration:

    z = IDateTime(factor(Sys.time()))
    #         idate    itime
    # 1: 2016-08-01 16:05:25
    
    str( lapply(z, function(x) x[NA_integer_]) )
    # List of 2
    #  $ idate: IDate[1:1], format: NA
    #  $ itime:Class 'ITime'  int NA
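
The same subsetting trick can be used inside a grouped expression. The following is only a sketch: the table `events` and its columns `g` and `when` are made up for illustration. It returns the first date for groups with more than one row and a typed `IDate` NA otherwise.

```r
library(data.table)

# Subsetting by NA_integer_ keeps the class and yields a length-one NA.
na_idate <- as.IDate("2016-08-01")[NA_integer_]
na_itime <- as.ITime("16:05:25")[NA_integer_]

# Hypothetical data: group 1 has two rows, group 2 has one.
events <- data.table(
  g    = c(1, 1, 2),
  when = as.IDate(c("2016-08-01", "2016-08-02", "2016-08-03"))
)

# Both branches return an IDate, so the result column keeps that class.
res <- events[, if (.N > 1) when[1L] else when[NA_integer_], by = g]
```

Because the NA is taken from the column itself, there is no need to know or hard-code the column's class.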
    