R: How to convert long number to string to save precision

别来无恙 submitted on 2021-01-27 11:59:45


I have a problem converting a long number to a string in R. How can I easily convert a number to a string and preserve precision? I have a simple example below.

a = -8664354335142704128
toString(a)

[1] "-8664354335142704128"

b = -8664354335142703762
toString(b)

[1] "-8664354335142704128"

a == b

[1] TRUE

I expected toString(a) and toString(b) to differ, but I got the same value. I suppose toString() converts the number to float or something like that before converting to string.

Thank you for your help.


> -8664354335142704128 == -8664354335142703762

[1] TRUE

> along = bit64::as.integer64(-8664354335142704128)
> blong = bit64::as.integer64(-8664354335142703762)
> along == blong

[1] TRUE

> blong

[1] -8664354335142704128

I also tried:

> as.character(blong)

[1] "-8664354335142704128"

> sprintf("%f", -8664354335142703762)

[1] "-8664354335142704128.000000"

> sprintf("%f", blong)

[1] "-0.000000"

Edit 2:

My first question was whether I can convert a long number to a string without loss. Then I realized that in R it is impossible to get the real value of a long number passed into a function, because R automatically reads the value with the loss.

For example, I have the function:

> my_function <- function(long_number){
+ string_number <- toString(long_number)
+ print(string_number)
+ }

If someone used it and passed a long number, I would not be able to tell exactly which number was passed.

> my_function(-8664354335142703762)
[1] "-8664354335142704128"

For example, if I read the numbers from a file, it is easy. But that is not my case. I just need to use whatever the user passed.

I am not an R expert, so I was just curious why it works in another language and not in R. For example, in Python:

>>> def my_function(long_number):
...     string_number = str(long_number)
...     print(string_number)
...
>>> my_function(-8664354335142703762)
-8664354335142703762

Now I know the problem is how R reads and stores numbers. Every language can do it differently. I have to change the way numbers are passed to the R function, and that solves my problem.

So the correct answer to my question is:

"'I suppose toString() converts the number to float', nope, you did it yourself (even if unintentionally)." Nope, R did it itself; that is how R reads numbers.

So I marked r2evans's answer as the best answer because this user helped me find the right solution. Thank you!
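
The difference from the Python session in the question comes down to how each language parses the literal, not to the string conversion. As a rough illustration in Python: an integer literal is read as an arbitrary-precision int, whereas R reads the same literal as a double. Python's float is the same IEEE 754 double, so forcing the value through float should reproduce R's result. The variable names are only for illustration.

```python
# Python parses an integer literal as an arbitrary-precision int,
# so no digits are lost before str() sees the value.
typed = -8664354335142703762
print(str(typed))        # -8664354335142703762

# Forcing the value through a 64-bit double (what R's "numeric" is)
# rounds it to the nearest representable value first.
as_double = float(typed)
print(int(as_double))    # -8664354335142704128

# The two literals from the question collapse to the same double.
same = float(-8664354335142703762) == float(-8664354335142704128)
print(same)              # True
```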


Bottom line up front: you must (in this case) read in your large numbers as strings before converting them to 64-bit integers:

bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
# [1] FALSE
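
The same idea can be sketched for a function that receives the value from a caller: accept the digits as a string, and refuse a float that is too large to have survived parsing intact. This is only an illustration in Python; `to_exact_string` and `MAX_EXACT` are hypothetical names, not part of bit64 or any library.

```python
MAX_EXACT = 2 ** 53  # every integer up to this magnitude fits in a double exactly


def to_exact_string(value):
    """Hypothetical helper: return the decimal digits without silent rounding."""
    if isinstance(value, str):
        return str(int(value))   # parse the digits directly; nothing is rounded
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        # A float this large (or non-integral) may already have lost digits.
        if abs(value) > MAX_EXACT or not value.is_integer():
            raise ValueError("float input may have lost digits; pass a string")
        return str(int(value))
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_exact_string("-8664354335142703762"))   # -8664354335142703762

try:
    to_exact_string(float(-8664354335142703762))
except ValueError as err:
    print("rejected:", err)
```

Rejecting the float outright, rather than converting it, makes the loss visible to the caller instead of hiding it.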

Some points about what you've tried:

  • "I suppose toString() converts the number to float", nope, you did it yourself (even if unintentionally). In R, when creating a number, 5 is a float and 5L is an integer. Even if you had tried to create it as an integer, it would have complained and lost precision anyway:

    class(5)
    # [1] "numeric"
    class(5L)
    # [1] "integer"
    class(-8664354335142703762)
    # [1] "numeric"
    class(-8664354335142703762L)
    # Warning: non-integer value 8664354335142703762L qualified with L; using numeric value
    # [1] "numeric"
  • more appropriately, when you type it in as a number and then try to convert it, R processes the inside of the parentheses first. That is, with a call such as bit64::as.integer64(-8664354335142703762), R first has to parse and "understand" everything inside the parentheses before it can be passed to the function. (This is typical of how languages are parsed, not just an R thing.) In this case, it sees what appears to be a (large) negative float, so it creates an object of class numeric (float). Only then does it send this numeric to the function, but by this point the precision has already been lost. Ergo the otherwise-illogical

    bit64::as.integer64(-8664354335142704128) == bit64::as.integer64(-8664354335142703762)
    # [1] TRUE

    In this case, it just happens that the 64-bit version of that number is equal to what you intended.

    bit64::as.integer64(-8664354335142704128)  # ends in 4128
    # integer64
    # [1] -8664354335142704128                 # ends in 4128, yay! (coincidence?)

    If you reduce the magnitude by one, the result is the same integer64:

    bit64::as.integer64(-8664354335142704127)  # ends in 4127
    # integer64
    # [1] -8664354335142704128                 # ends in 4128 ?

    This continues for quite a while, until it finally shifts to the next rounding point

    # integer64
    # [1] -8664354335142704128
    # integer64
    # [1] -8664354335142703104

    It is no coincidence that the difference is 1024, or 2^10. A double has a 53-bit significand, so for magnitudes between 2^62 and 2^63 adjacent representable values are 2^(62-52) = 2^10 apart.

  • fortunately, bit64::as.integer64 has several S3 methods, useful for converting different formats/classes to an integer64:

    # [1] as.integer64.character as.integer64.double    as.integer64.factor   
    # [4] as.integer64.integer   as.integer64.integer64 as.integer64.logical  
    # [7] as.integer64.NULL     

    So, bit64::as.integer64.character can be useful, since precision is not lost when you type it or read it in as a string:

    bit64::as.integer64("-8664354335142704128")
    # integer64
    # [1] -8664354335142704128
    bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
    # [1] FALSE
  • FYI, your number is already near the 64-bit boundary:

    # [1] -2147483647
    # [1] -2147483647
    log(8664354335142704128, 2)
    # [1] 62.9098
    -2^63 # the approximate +/- range of 64-bit integers
    # [1] -9.223372e+18
    # [1] -8.664354e+18
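
The 1024 step between rounding points noted above can be checked directly. A minimal sketch in Python, whose float is the same IEEE 754 double as R's numeric (`math.ulp` and `math.nextafter` assume Python 3.9 or later):

```python
import math

x = 8664354335142704128.0    # magnitude of the value from the question

# Gap between adjacent doubles at this magnitude.
gap = math.ulp(x)
print(gap)                   # 1024.0

# The next double toward zero is exactly one gap away.
neighbour = math.nextafter(-x, 0.0)
print(int(neighbour))        # -8664354335142703104

# Integers on either side of the midpoint (...3616) round to different doubles.
below_mid = int(float(-8664354335142703617))
above_mid = int(float(-8664354335142703615))
print(below_mid)             # -8664354335142704128
print(above_mid)             # -8664354335142703104
```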

