问题
I wanted to calculate the machine Epsilon, the smallest possible number e
that gives 1 + e > 1
using different data types of C++: float
, double
and long double
.
Here's my code:
#include <cstdio>
template<typename T>
T machineeps() {
T epsilon = 1;
T expression;
do {
epsilon = epsilon / 2;
expression = 1 + epsilon;
} while(expression > 1);
return epsilon;
}
int main() {
auto epsf = machineeps<float>();
auto epsd = machineeps<double>();
auto epsld = machineeps<long double>();
std::printf("epsilon float: %22.17e\nepsilon double: %22.17e\nepsilon long double: %Le\n", epsf, epsd, epsld);
return 0;
}
But I get this strange output:
epsilon float: 5.96046447753906250e-008
epsilon double: 1.11022302462515650e-016
epsilon long double: -0.000000e+000
The values for float
and double
are what I was expecting, but, I cannot explain the long double
behavior.
Can somebody tell me what went wrong?
回答1:
I cannot reproduce your results. I get:
epsilon long double: 5.421011e-20
Anyway, logically, the code should be something like:
template<typename T>
T machineeps() {
T epsilon = 1, prev;
T expression;
do {
prev = epsilon;
epsilon = epsilon / 2;
expression = 1 + epsilon;
} while (expression > 1);
return prev; // <-- `1+prev` yields a result different from one
}
On my platform it produces values similar to std::numeric_limits::epsilon
:
epsilon float: 1.19209289550781250e-07
epsilon double: 2.22044604925031308e-16
epsilon long double: 1.084202e-19
(note the different order of magnitude)
回答2:
There are several things going on here.
First, floating-point math is often done at the maximum available precision, regardless of the actual declared type of the floating-point variable. So, for example, arithmetic on float
s is usually done with 80 bits of precision on Intel hardware (Java originally banned this, requiring all floating-point math to be done at the exact precision of the type; this killed floating-point performance, and they quickly abandoned that rule). Storing the result of a floating-point calculation is supposed to truncate the value to the appropriate type, but by default most compilers ignore this. You can tell your compiler not to allow that; the switch for that depends on the compiler. As is, you can’t rely on the result that’s being calculated here.
Second, the loop in the code terminates when the value of 1 + epsilon
is not greater than 1, so the returned value will be less than the true value of epsilon.
Third, coupled with the second one, some floating-point implementations don’t do subnormal values; once the exponent becomes smaller than the smallest that can be represented, the value is 0. That may be what you’re seeing here with the long double
value. IEEE floating-point handles zeros less abruptly — once you hit that minimum exponent, smaller values gradually lose precision. There are quite a few values between the smallest normalized value and 0.
来源:https://stackoverflow.com/questions/47133026/machine-epsilon-long-double-in-c