Question
I have a problem converting a long number to a string in R. How can I easily convert a number to a string while preserving precision? I have a simple example below.
a = -8664354335142704128
toString(a)
[1] "-8664354335142704128"
b = -8664354335142703762
toString(b)
[1] "-8664354335142704128"
a == b
[1] TRUE
I expected toString(a) and toString(b) to differ, but I got the same string for both. I suppose toString() converts the number to float or something like that before converting it to a string.
Thank you for your help.
Edit:
> -8664354335142704128 == -8664354335142703762
[1] TRUE
> along = bit64::as.integer64(-8664354335142704128)
> blong = bit64::as.integer64(-8664354335142703762)
> along == blong
[1] TRUE
> blong
integer64
[1] -8664354335142704128
I also tried:
> as.character(blong)
[1] "-8664354335142704128"
> sprintf("%f", -8664354335142703762)
[1] "-8664354335142704128.000000"
> sprintf("%f", blong)
[1] "-0.000000"
Edit 2:
My question at first was whether I can convert a long number to a string without loss. Then I realized that in R it is impossible to get the real value of a long number passed into a function, because R automatically reads the value with a loss of precision.
For example, I have the function:
> my_function <- function(long_number){
+ string_number <- toString(long_number)
+ print(string_number)
+ }
If someone uses it and passes a long number, I cannot find out exactly which number was passed.
> my_function(-8664354335142703762)
[1] "-8664354335142704128"
If I were reading the numbers from a file, it would be easy. But that is not my case. I just need to use whatever value a user passed.
I am not an R expert, so I was just curious why this works in another language and not in R. For example, in Python:
>>> def my_function(long_number):
...     string_number = str(long_number)
...     print(string_number)
...
>>> my_function(-8664354335142703762)
-8664354335142703762
Now I know that the problem is how R reads and stores numbers. Every language can do it differently. I have to change how numbers are passed to the R function, and that solves my problem.
So the correct answer to my question is:
"'I suppose toString() converts the number to float', nope, you did it yourself (even if unintentionally)." - Nope, R did it itself; that is how R reads numbers.
So I marked r2evans's answer as the best answer because this user helped me find the right solution. Thank you!
Answer 1:
Bottom line up front: you must (in this case) read in your large numbers as strings before converting them to 64-bit integers:
bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
# [1] FALSE
Some points about what you've tried:
"I suppose toString() converts the number to float", nope, you did it yourself (even if unintentionally). In R, when creating a number, 5 is a float and 5L is an integer. Even if you had tried to create it as an integer, it would have complained and lost precision anyway:
class(5)
# [1] "numeric"
class(5L)
# [1] "integer"
class(-8664354335142703762)
# [1] "numeric"
class(-8664354335142703762L)
# Warning: non-integer value 8664354335142703762L qualified with L; using numeric value
# [1] "numeric"
More appropriately, when you type it in as a number and then try to convert it, R processes the inside of the parentheses first. That is, with
bit64::as.integer64(-8664354335142704128)
R first has to parse and "understand" everything inside the parentheses before it can be passed to the function. (This is typically a compiler/language-parsing thing, not just an R thing.) In this case, it sees that it appears to be a (large) negative float, so it creates a value of class numeric (float). Only then does it send this numeric to the function, but by this point the precision has already been lost. Ergo the otherwise-illogical
bit64::as.integer64(-8664354335142704128) == bit64::as.integer64(-8664354335142703762)
# [1] TRUE
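The rounding that happens at parse time can be reproduced outside R. The following is a minimal sketch in Python (the language the question uses for comparison), assuming that Python's float and R's numeric are the same 64-bit IEEE 754 double, as they are on common platforms. The variable names are illustrative only.

```python
# The two integers from the question, held exactly as Python ints.
a = -8664354335142704128
b = -8664354335142703762

# float() mimics what happens to a bare numeric literal in R:
# the value is rounded to the nearest 64-bit double.
a_double = float(a)
b_double = float(b)

print(a == b)                # False: the exact integers differ
print(a_double == b_double)  # True: both round to the same double
print(int(b_double))         # -8664354335142704128
```

Once both values have been rounded to the same double, no later conversion can tell them apart, which is why the function receiving the number never sees the original digits.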
In this case, it just happens that the 64-bit version of that number is equal to what you intended.
bit64::as.integer64(-8664254335142704128) # ends in 4128
# integer64
# [1] -8664254335142704128 # ends in 4128, yay! (coincidence?)
If you subtract one, it results in the same effective integer64:
bit64::as.integer64(-8664354335142704127) # ends in 4127
# integer64
# [1] -8664354335142704128 # ends in 4128 ?
This continues for quite a while, until it finally shifts to the next rounding point:
bit64::as.integer64(-8664254335142703617)
# integer64
# [1] -8664254335142704128
bit64::as.integer64(-8664254335142703616)
# integer64
# [1] -8664254335142703104
The difference of 1024, or 2^10, is not a coincidence. An R numeric is a 64-bit IEEE 754 double with a 53-bit significand, so for magnitudes between 2^62 and 2^63 adjacent representable values are 2^(62-52) = 2^10 apart, and every integer in between is rounded to one of them.
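The 1024 gap can be checked directly from the layout of a double. This is a small Python sketch, again assuming the usual 64-bit IEEE 754 double; the names `spacing`, `below` and `beyond` are made up for the illustration.

```python
import math
import sys

# The value from the question; it is exactly representable as a double.
x = 8664354335142704128.0

# frexp gives x = m * 2**exponent with 0.5 <= m < 1, so the gap between
# neighbouring doubles near x is 2**(exponent - number of significand bits).
_, exponent = math.frexp(x)
spacing = 2.0 ** (exponent - sys.float_info.mant_dig)
print(spacing)  # 1024.0

# An integer 511 away still rounds back onto x; one 513 away lands on the
# next representable value, 1024 lower.
below = int(float(8664354335142704128 - 511))
beyond = int(float(8664354335142704128 - 513))
print(below)   # 8664354335142704128
print(beyond)  # 8664354335142703104
```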
Fortunately, bit64::as.integer64 has several S3 methods, useful for converting different formats/classes to an integer64:
library(bit64)
methods(as.integer64)
# [1] as.integer64.character as.integer64.double    as.integer64.factor
# [4] as.integer64.integer   as.integer64.integer64 as.integer64.logical
# [7] as.integer64.NULL
So, bit64::as.integer64.character can be useful, since precision is not lost when you type it or read it in as a string:
bit64::as.integer64("-8664354335142704128")
# integer64
# [1] -8664354335142704128
bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
# [1] FALSE
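The same "take it as a string" idea can be enforced at a function boundary. The sketch below uses Python for illustration; `parse_exact` is a hypothetical helper, not part of bit64 or any library, and it simply refuses anything that is not a string so that a value already rounded to a double cannot slip through.

```python
def parse_exact(text):
    """Parse a decimal string into an exact integer (hypothetical helper)."""
    if not isinstance(text, str):
        # A float argument has already lost digits; reject it loudly.
        raise TypeError("pass the number as a string to keep every digit")
    return int(text)

a = parse_exact("-8664354335142704128")
b = parse_exact("-8664354335142703762")

print(a == b)   # False: the two values stay distinct
print(str(b))   # -8664354335142703762
```

Rejecting non-string input is a design choice for the sketch: it turns a silent loss of precision into an immediate error for the caller.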
FYI, your number is already near the 64-bit boundary:
-.Machine$integer.max
# [1] -2147483647
-(2^31-1)
# [1] -2147483647
log(8664354335142704128, 2)
# [1] 62.9098
-2^63 # the approximate +/- range of 64-bit integers
# [1] -9.223372e+18
-8664354335142704128
# [1] -8.664354e+18
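With exact integer arithmetic the same comparison can be made without scientific notation. This is a short Python sketch; the limit `2 ** 63 - 1` is the standard signed 64-bit maximum and is an assumption here, not a statement about how bit64 handles its extreme values.

```python
import math

n = 8664354335142704128          # magnitude of the number from the question

int32_max = 2 ** 31 - 1          # same value as .Machine$integer.max
int64_max = 2 ** 63 - 1          # standard signed 64-bit maximum (assumed)

print(int32_max)                 # 2147483647
print(n.bit_length())            # 63: the magnitude needs 63 bits
print(round(math.log2(n), 4))    # 62.9098
print(n <= int64_max)            # True: it fits, but with little headroom
```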
Source: https://stackoverflow.com/questions/54681480/r-how-to-convert-long-number-to-string-to-save-precision