问题
Is IEEE-754 arithmetic reproducible on different platforms?
I was testing some code written in R, that uses random numbers. I thought that setting the seed of the random number generator on all tested platforms would make the tests reproducible, but this does not seem to be true for rexp()
, which generates exponentially distributed random numbers.
This is what I get on 32 bit Linux:
options(digits=22) ; set.seed(9) ; rexp(1, 5)
# [1] 0.2806184054728815824298
sessionInfo()
# R version 3.0.2 (2013-09-25)
# Platform: i686-pc-linux-gnu (32-bit)
and this is what I get on 64 bit OSX 10.9:
options(digits=22) ; set.seed(9) ; rexp(1, 5)
# [1] 0.2806184054728815269186
sessionInfo()
# R version 3.0.2 (2013-09-25)
# Platform: x86_64-apple-darwin10.8.0 (64-bit)
64 bit Linux gives the same results as 64 bit OSX, so this seems to be a 32 bit vs 64 bit issue.
Let us assume that both R versions were compiled with the same GCC version, and with the same (default R) compilation flags that make the compiler use IEEE-754 arithmetic.
My question is, can this be considered a bug in R? Or is it just a "normal" consequence of using approximate, finite precision floating point arithmetic?
I sent the same question to the R-devel mailing list, but got no answer on the list, and only one answer in private, trying to convince me that this is not a bug and I should live with it.
This is what IEEE-754 says about reproducibility (from Wikipedia):
The IEEE 754-1985 allowed many variations in implementations (such as the encoding of some values and the detection of certain exceptions). IEEE 754-2008 has tightened up many of these, but a few variations still remain (especially for binary formats). The reproducibility clause recommends that language standards should provide a means to write reproducible programs (i.e., programs that will produce the same result in all implementations of a language), and describes what needs to be done to achieve reproducible results.
And this is under "Recommendations".
My (subjective) opinion is that this is a bug, because the whole point of the IEEE-754 standard is having reproducible, platform-independent floating point arithmetic.
回答1:
There are issues with reproducibility of even elementary floating-point operations in high-level languages, but they are usually controllable with various platform-specific operations such as setting compiler switches, using custom code to set floating-point controls and modes, or, if necessary, writing essential operations in assembly. As developed in comments, the specific problem you encountered may be that different C implementations use different precision to evaluate intermediate floating-point expressions. Often this can be controlled by compiler switches or by including casts and assignments in the expressions to require rounding to the nominal type (thus discarding excess precision).
However, the more complicated functions, such as exp
and cos
, are not routinely reproducible on different platforms. Although the 2008 IEEE-754 standard recommends that these functions be implemented with correct rounding, this task has not been completed for any math library with run-time known to be bounded. Nobody in the world has done the mathematics to accomplish this.
The CRlibm project has implemented some of the functions with known run-time bounds, but the work is incomplete. (Per Pascal Cuoq’s comment, when CRlibm does not have a proven run-time bound for correct rounding, it falls back to a result highly likely to be correctly rounded due to being computed with very high precision.) Figuring out how to deliver a correctly-rounded result in a bounded time and proving it is difficult for many functions. (Consider how you might prove that no value of cos(x)
, where x
is any double
value, is closer than some small distance e
from the midpoint between two representable values. The midpoint is important because it is where rounding must change from returning one result to returning another, and e
tells you how accurately and precisely you must calculate an approximation in order to provide correct rounding.)
The current state of affairs is that many of the functions in the math library are approximated, some accuracy looser than correct rounding is delivered, and different vendors use different implementations with different approximations. I am supposing that R
uses some of these functions in its rexp
implementation, and that it uses the native libraries of its target platforms, so it gets different results on different platforms.
To remedy this, you might consider using a common math library on the platforms you target (possibly CRlibm).
来源:https://stackoverflow.com/questions/21212326/floating-point-arithmetic-and-reproducibility