NAs produced by integer overflow + R on linux

后端 未结 3 1284
栀梦
栀梦 2021-01-13 18:56

I\'m running an R script on UNIX based system , the script contain multiplication of large numbers , so the results where NAs by integer overflow , but when i run the same s

3条回答
  •  天涯浪人
    2021-01-13 19:12

    As the other answers have pointed out, there is something a bit non-reproducible/strange about your results so far. Nevertheless, if you really must do exact calculations on large integers, you probably need an interface between R and some other system.

    Some of your choices are:

    • the gmp package (see this page and scroll down to R
    • an interface to the bc calculator on googlecode
    • there is a high precision arithmetic page on the R wiki which compares interfaces to Yacas, bc, and MPFR/GMP
    • there is a limited interface to the PARI/GP package in the elliptical package, but this is probably (much) less immediately useful than the preceding three choices

      Most Unix or Cygwin systems should have bc installed already. GMP and Yacas are easy to install on modern Linux systems ...

    Here's an extended example, with a function that can choose among numeric, integer, or bigz computation.

    f1 <- function(ol=1L,oh=1L,N=16569L,type=c("num","int","bigz")) {
      type <- match.arg(type)
      ## convert all values to appropriate type
      if (type=="int") {
        ol <- as.integer(ol)
        oh <- as.integer(oh)
        N <- as.integer(N)
        one <- 1L
        two <- 2L
        six <- 6L
        cc <- as.integer
      } else if (type=="bigz") {
        one <- as.bigz(1)
        two <- as.bigz(2)
        six <- as.bigz(6)
        N <- as.bigz(N)
        ol <- as.bigz(ol)
        oh <- as.bigz(oh)
        cc <- as.bigz
      } else {
        one <- 1
        two <- 2
        six <- 6
        N <- as.numeric(N)
        oh <- as.numeric(oh)
        ol <- as.numeric(ol)
        cc <- as.numeric
      }
      ## if using bigz mode, the ratio needs to be converted back to bigz;
      ## defining cc() as above seemed to be the most transparent way to do it
      N*ol^two + cc(N*(N+one)*(two*N+one)/six) -
        ol*(N*N+one) + two*N*ol*(N-oh+one) +
          (N-oh+one)*N^two + two*N*(oh-N-one)*(oh+N)
    }
    

    I removed a lot of unnecessary parentheses, which actually made it harder to see what was going on. It is indeed true that for the (1,1) case the final result is not bigger than .Machine$integer.max but some of the intermediate steps are ... (for the (1,1) case this actually reduces to $$-1/6*(N+2)*(4*N^2-5*N+3)$$ ...)

    f1()  ##  -3.032615e+12
    f1() > .Machine$integer.max  ## FALSE
    N <- 16569L
    N*(N+1)*(2*N+1) > .Machine$integer.max   ## TRUE
    N*(N+1L)*(2L*N+1L)  ## integer overflow (NA)
    f1(type="int")      ## integer overflow
    f1(type="bigz")     ##  "-3032615078557"
    print(f1(),digits=20)  ##  -3032615078557: no actual loss of precision in this case
    

    PS: you have a (N*N+1) term in your equation. Should that really be N*(N+1), or did you really mean N^2+1?

提交回复
热议问题