I am trying to read a CSV file that has barcodes in the first column, but when R gets it into a data.frame, it converts 1665535004661 to 1.67E+12.
From the ?is.integer page:
"Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9."
> 1665535004661L > 2*10^9
[1] TRUE
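To make the limits concrete, here is a minimal sketch using only base R. It assumes nothing beyond the barcode from the question; the variable names are just for illustration. It shows that the value is too large for an integer vector but still well inside the range where a double stores whole numbers exactly.

```r
# Largest value an R integer vector can hold (32-bit signed)
int_max <- .Machine$integer.max

barcode <- 1665535004661          # a plain numeric literal is stored as a double

# The barcode does not fit in an integer ...
too_big_for_int <- barcode > int_max

# ... but doubles are exact for whole numbers up to 2^53
fits_in_double <- barcode < 2^53

# Forcing it into an integer gives NA (with a warning, suppressed here)
as_int <- suppressWarnings(as.integer(barcode))
```

So the integer limit quoted above only matters if the column is forced to integer; left as a double, the barcode is stored without loss.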
You want package Rmpfr.
library(Rmpfr)
x <- mpfr(15, precBits= 1024)
It's not in a "1.67E+12 format", it just won't print entirely using the defaults. R is reading it in just fine and the whole number is there.
x <- 1665535004661
> x
[1] 1.665535e+12
> print(x, digits = 16)
[1] 1665535004661
See, the numbers were there all along. They only get lost once a value has more digits than a double can hold exactly (whole numbers above 2^53, roughly 9*10^15). Sorting on what you read in will work fine, and you can call print() explicitly with the digits argument to see the full values in your data.frame, rather than printing it implicitly by typing its name.
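A small sketch of the same idea on a data.frame, in case it helps. The column name and the two extra barcodes are made up for illustration; only base R is used.

```r
# Hypothetical data: three barcodes that differ only in the last digits
df <- data.frame(barcode = c(1665535004661, 1665535004659, 1665535004660))

# format() with enough significant digits gives the full values as text
full <- format(df$barcode, digits = 16)

# Sorting works on the stored values, not on the abbreviated display
sorted <- df[order(df$barcode), , drop = FALSE]

# Explicit print() with digits shows every digit of the column
print(sorted, digits = 16)
```

The default display would show all three rows in scientific notation, even though the stored values are distinct and sort correctly.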
Try working with colClasses="character":
read.csv("file.csv", colClasses = "character")
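For a version you can run without a file on disk, here is a rough sketch that feeds read.csv an inline string through its text argument. The column names and the second barcode are invented for the example. It compares the default parsing with colClasses = "character".

```r
# Inline CSV standing in for a real file (hypothetical data)
csv_text <- "barcode,qty
1665535004661,2
0012345678905,7"

# Default behaviour: the barcode column is parsed as a number
as_number <- read.csv(text = csv_text)

# With colClasses = "character" every column is kept as a string
as_text <- read.csv(text = csv_text, colClasses = "character")

class(as_number$barcode)
class(as_text$barcode)
as_text$barcode
```

A side effect worth knowing: reading as character also keeps leading zeros, which the numeric version drops.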
Have a look at the read.table documentation: http://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html
Take a look at the int64 package: Bringing 64-bit data to R.
Since you are not performing arithmetic on this value, character is appropriate. You can use the colClasses argument to set various classes for each column, which is probably better than using all character.
test.csv:
a,b,c
1001002003003004,2,3
Read character, then integers:
x <- read.csv('test.csv',colClasses=c('character','integer','integer'))
x
                 a b c
1 1001002003003004 2 3
mode(x$a)
[1] "character"
mode(x$b)
[1] "numeric"
I tend to use options(scipen = 9999999999) at the start of every script. This makes R print numbers in fixed notation instead of scientific format. Note that scipen is a penalty applied when R chooses between the two notations, not a number of decimal places, so any sufficiently large positive value has the same effect. There's a way to set this in global options, but I'm not 100% sure how.
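Here is a minimal sketch of what the option does, using the barcode from the question. The value 999 is my own choice for the example, on the assumption that it is large enough in practice. The previous setting is saved and restored so the example does not leak into the rest of a session.

```r
x <- 1665535004661

old <- options(scipen = 0)      # R's default penalty; keep the old value
default_fmt <- format(x)        # "1.665535e+12"

options(scipen = 999)           # heavily penalise scientific notation
fixed_fmt <- format(x)          # "1665535004661"

options(old)                    # restore whatever was set before
```

This changes only how the number is displayed; the stored value is the same in both cases.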