When reading the text below, fread() fails to detect the missing values in columns 8 and 9. This happens only with the default option integer64="integer64". It is apparently an issue with the bit64 package, not with fread() or data.table. From the bit64 documentation (http://cran.r-project.org/web/packages/bit64/bit64.pdf):
"Subscripting non-existing elements and subscripting with NAs is currently not supported. Such subscripting currently returns 9218868437227407266 instead of NA (the NA value of the underlying double code). Following the full R behaviour here would either destroy performance or require extensive C-coding."
I tried reassigning the 9218868437227407266 value to NA, thinking it would work. For example:
DT[V8 == 9218868437227407266, ]
# actually returns nothing, but
DT[V8 == max(V8), ]
# returns the rows with 9218868437227407266 in V8
# but this does not reassign the value
DT[V8 == max(V8), V8 := NA]
# not that this makes sense, but I tried just in case...
DT[V8 == max(V8), V8 := NA_character_]
So, as the documentation states pretty clearly, a vector of class integer64 won't recognize NA or missing values. I'm going to avoid bit64 just so I don't have to deal with this...
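For anyone who does need to keep integer64, here is a minimal sketch of one possible workaround; I have not run it against the original file. It assumes the first comparison above returns nothing because the bare literal 9218868437227407266 is parsed as a double, which cannot hold that many digits exactly. The small DT, the name `sentinel`, and the sample values are made up for illustration. `as.integer64()` and `NA_integer64_` come from bit64.

```r
library(data.table)
library(bit64)

# Hypothetical stand-in for the column fread() produced:
# the middle value is the sentinel that appeared instead of NA.
DT <- data.table(V8 = as.integer64(c("1", "9218868437227407266", "3")))

# Build the sentinel from a string so that no digits are lost.
# A bare numeric literal would be rounded to the nearest double first.
sentinel <- as.integer64("9218868437227407266")

# Replace the sentinel with bit64's own missing value.
DT[V8 == sentinel, V8 := NA_integer64_]

# Check which rows are now flagged as missing.
DT[, is.na(V8)]
```

The design choice here is to compare integer64 against integer64 rather than against a double, and to assign an NA of the same class as the column instead of a plain NA or NA_character_.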