fread() fails with missing values in integer64 columns

感情败类 2021-02-14 07:53

When reading the text below, fread() fails to detect the missing values in columns 8 and 9. This happens only with the default option integer64="integer64".

2 answers
  •  予麋鹿
    予麋鹿 (original poster)
    2021-02-14 08:20

    This is apparently an issue with the bit64 package, not with fread() or data.table. From the bit64 documentation (http://cran.r-project.org/web/packages/bit64/bit64.pdf):

    "Subscripting non-existing elements and subscripting with NAs is currently not supported. Such subscripting currently returns 9218868437227407266 instead of NA (the NA value of the underlying double code). Following the full R behaviour here would either destroy performance or require extensive C-coding."

    I tried reassigning the 9218868437227407266 value to NA, thinking it would work.

    For example:

    # Actually returns nothing:
    DT[V8 == 9218868437227407266, ]
    # Returns the rows with 9218868437227407266 in V8:
    DT[V8 == max(V8), ]
    # But this does not reassign the value:
    DT[V8 == max(V8), V8 := NA]
    # Not that this makes sense, but I tried it just in case:
    DT[V8 == max(V8), V8 := NA_character_]
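
    One possible reason the comparison against the typed-out number returns nothing (an assumption on my part, not something the bit64 documentation states) is that R parses the literal 9218868437227407266 as a double. A double carries only 53 bits of precision, so it cannot hold that integer exactly, and the value that ends up being compared may not be the one that was typed. A minimal base R sketch of the precision limit:

    ```r
    # R parses this literal as a double, not as an integer64.
    x <- 9218868437227407266

    # A double has a 53-bit significand.
    .Machine$double.digits

    # In the range [2^62, 2^63) adjacent doubles are 1024 apart,
    # so adding 1 does not produce a different value.
    x == x + 1

    # Print the value the double actually holds; the trailing digits
    # may differ from the literal typed above.
    sprintf("%.0f", x)
    ```

    This would also be consistent with max(V8) matching the rows, since that value never passes through a double literal.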
    

    So, as the documentation pretty clearly states, a vector of class integer64 won't recognize NA or missing values here. I'm going to avoid bit64 just so I don't have to deal with this...
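
    For anyone who wants to avoid bit64 in the same way, here is a minimal sketch using fread()'s integer64 argument, which also accepts "double" and "character". The sample text and the column names (id, big, label) are made up for illustration and are not the data from the question. Note that "double" loses exactness for integers above 2^53.

    ```r
    library(data.table)

    # Hypothetical sample data: 'big' holds values outside the 32-bit
    # integer range and has one empty field in the second row.
    txt <- "id,big,label\n1,1234567890123,a\n2,,b\n3,9876543210987,c\n"

    # Read large integers as ordinary doubles instead of integer64.
    DT <- fread(text = txt, integer64 = "double")

    class(DT$big)   # column type after reading
    is.na(DT$big)   # the empty field should show up as a regular NA
    ```

    Passing integer64 = "character" instead keeps every digit as text, at the cost of converting the column yourself later.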
