Extreme numerical values in floating-point precision in R

名媛妹妹 2020-11-28 13:55

Can somebody please explain the following output to me? I know it has something to do with floating-point precision, but the order of magnitude (a difference of 1e308) surprises me.
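
A sketch of the kind of session at issue (the exact expressions from the original post are assumed; the values match those discussed in the answers below):

    (1 - 1e-16) == 1   # FALSE: still distinguishable from 1
    (1 - 1e-17) == 1   # TRUE:  rounds back to exactly 1
    1e-323 == 0        # FALSE: still representable (as a subnormal)
    1e-324 == 0        # TRUE:  underflows to zero
    # Precision runs out around 1e-16, representability around 1e-324:
    # roughly 308 decimal orders of magnitude apart.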

2 Answers
  • Pascal Cuoq · 2020-11-28 14:25

    R uses IEEE 754 double-precision floating-point numbers.

    Floating-point numbers are denser near zero. This is a consequence of their design: they are meant to compute accurately (to the equivalent of about 16 significant decimal digits, as you have noticed) over a very wide range of magnitudes.

    Perhaps you expected a fixed-point system with uniform absolute precision. In practice, fixed-point is either wasteful, or the range of every intermediate computation must be carefully estimated beforehand, with drastic consequences if the estimate is wrong.

    Positive floating-point numbers look like this, schematically:

    +-+-+-+--+--+--+----+----+----+--------+--------+--------+--
    0
    

    The smallest positive normal double-precision number is 2 raised to the minimal exponent, i.e. 2^-1022. Near one, the double-precision floating-point numbers are already spread quite far apart: there is a distance of 2^-53 from one to the number below it, and a distance of 2^-52 from one to the number above it.
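
    A quick check of that spacing, as a minimal sketch in base R:

    (1 - 2^-53) < 1       # TRUE:  the nearest double below 1 is 1 - 2^-53
    1 + 2^-53 == 1        # TRUE:  2^-53 is half the gap above 1; the tie rounds to even
    1 + 2^-52 == 1        # FALSE: 2^-52 is the full gap above 1
    .Machine$double.eps   # 2.220446e-16, i.e. 2^-52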

  • 2020-11-28 14:28

    As per @PascalCuoq's answer: the FP precision of an R double on an IEEE 754-compliant platform (e.g. an x86) is not quite 16 decimal digits:

    # Machine ULP expressed in decimal digits:
    .Machine$double.ulp.digits * log10(2)
    ## [1] -15.65356

    # Note the direct relationship between ULP digits and eps:
    .Machine$double.ulp.digits                            # -52
    .Machine$double.eps                                   # 2.220446e-16
    .Machine$double.eps == 2^.Machine$double.ulp.digits   # TRUE
    

    Hence 1e-16 is already very close to the ULP below 1 (2^-53, about 1.1e-16), while 1e-17 is under half of it, so 1 - 1e-17 gets rounded back to FP 1.0.

    See the R documentation for .Machine, "Numerical Characteristics of the Machine". In particular, look at the difference between eps and ULP.

    (ULP is defined with respect to the FP number 1. The bigger your FP number gets, the bigger the value of its last bit, and the coarser a rounding operation becomes.)
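
    To see that coarsening directly, a minimal illustrative sketch:

    2^52 + 1 == 2^52   # FALSE: at 2^52 the spacing between doubles is exactly 1
    2^53 + 1 == 2^53   # TRUE:  above 2^53 the spacing is 2, so the +1 rounds away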

    As to where the quantity 1e-323 comes from: you're confusing ULP with the minimum-representable FP value, which is far smaller.

    The minimum-representable normalized positive FP value has a decimal exponent of about -308, as per the IEEE 754 double-precision examples ...

    # Minimum-representable normalized positive FP value:
    .Machine$double.xmin
    ## [1] 2.225074e-308

    # ...which corresponds to this base-2 exponent...
    log10(.Machine$double.xmin) / log10(2)
    ## [1] -1022
    # ...or this base-10 exponent:
    .Machine$double.min.exp * log10(2)
    ## [1] -307.6527
    

    But we can get slightly smaller if we use an UNnormalized (subnormal) FP number, i.e. one whose implicit leading mantissa bit is 0. Hence, as you empirically discovered, the minimum-representable UNnormalized positive FP value lies somewhere between 1e-324 and 1e-323. That's because there are 52 explicitly stored mantissa bits, so the smallest subnormal is 2^52 (about 10^15.65) times smaller than xmin:

    # Exponent of minimum-representable UNnormalized positive FP value
    # (52 explicit mantissa bits below xmin; double.digits counts the implicit bit too):
    log10(.Machine$double.xmin) - (.Machine$double.digits - 1) * log10(2)
    ## [1] -323.3062
    

    (Why between 1e-324 and 1e-323? The exact value is 2^-1074 ≈ 4.94e-324; any decimal literal smaller than about half of that rounds to 0, so 1e-324 underflows while 1e-323 does not.)
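
    An empirical check of the subnormal range, sketched with .Machine constants only:

    .Machine$double.xmin / 2                         # 1.112537e-308: below xmin, yet nonzero (subnormal)
    .Machine$double.xmin * .Machine$double.eps       # 4.940656e-324: the smallest positive double, 2^-1074
    .Machine$double.xmin * .Machine$double.eps / 2   # 0: exactly halfway to zero, the tie rounds to even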

    (Note also the parameter that says the representation is base-2: .Machine$double.base = 2 )
