Question
In a longer script I have to multiply the length of a vector A (2614) by the number of rows of a data frame B (1456000). If I do that directly with length(A) * nrow(B), I get the message "NAs produced by integer overflow", although there's no problem when I multiply the same numbers:
2614 * 1456000
[1] 3805984000
The only way to get the multiplication to work is round(length(A)) * nrow(B) or length(A) * round(nrow(B)). But the numbers produced by length and nrow must be integers anyhow! Moreover, I tested this with the following function suggested on the help page for is.integer...
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x-round(x)) < tol
... and of course, they ARE integers. So why do I need the crutch "round" here? Very puzzling. Does anybody have an idea what's going on in the background?
Answer 1:
Hopefully a demonstration makes clear what is happening:
2614 * 1456000
#[1] 3805984000
## Numeric literals without an L suffix are stored as doubles
class( 2614 * 1456000 )
#[1] "numeric"
# Force numbers to be integers
2614L * 1456000L
#[1] NA
#Warning message:
#In 2614L * 1456000L : NAs produced by integer overflow
## And the result is an integer with overflow warning
class( 2614L * 1456000L )
#[1] "integer"
#Warning message:
#In 2614L * 1456000L : NAs produced by integer overflow
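2614 * 1456000 is a numeric because both operands are of class numeric. The overflow occurs because both nrow and length return integers, so the result is also an integer, but it exceeds the largest value the integer class can represent (.Machine$integer.max, which is 2147483647, roughly +/-2*10^9). A numeric (double) can hold magnitudes up to about 1.8e+308 and represents whole numbers exactly up to 2^53, so 3805984000 fits easily. Note that is.wholenumber only tests whether a value is numerically whole; it says nothing about whether the value is stored as an integer or as a double. So to solve your problem, just use as.numeric(length(A)) or as.double(length(A)).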
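As a minimal sketch of that fix: the objects A and B below are hypothetical stand-ins built only to match the sizes given in the question (the asker's real data is not shown), and n is just an illustrative name. Converting one operand to double is enough, because R then carries out the whole multiplication in double precision.

```r
# Hypothetical stand-ins for the asker's objects; only their sizes matter here
A <- seq_len(2614)
B <- data.frame(x = integer(1456000))

# length() and nrow() both return values stored as integers
typeof(length(A))      # "integer"
typeof(nrow(B))        # "integer"

# Largest value the integer type can hold
.Machine$integer.max   # 2147483647

# Convert one operand to double before multiplying, so the
# product is computed as a double instead of overflowing
n <- as.numeric(length(A)) * nrow(B)
n                      # 3805984000
typeof(n)              # "double"
```

The same conversion works on either operand, so length(A) * as.numeric(nrow(B)) should behave identically.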
Source: https://stackoverflow.com/questions/17650803/r-simple-multiplication-causes-integer-overflow