Question
I'm building a program to convert double values into scientific notation (mantissa, exponent). Then I noticed the following:
369.7900000000000 -> 3.6978999999999997428
68600000 -> 6.8599999999999994316
I noticed the same pattern for several other values as well. The maximum fractional error is
0.000 000 000 000 001 = 1e-15
I know about the inaccuracy of representing double values in a computer. Can it be concluded that the maximum fractional error we would get is 1e-15? What is significant about this?
I went through most of the questions on floating-point precision problems on Stack Overflow, but I didn't see any about the maximum fractional error in 64 bits.
To be clear about the computation I do, here is my code snippet:
double norm = 68600000;
int exp = 0;
if (norm != 0.0)
{
    while (norm >= 10.0)
    {
        norm /= 10.0;
        exp++;
    }
    while (norm < 1.0)
    {
        norm *= 10.0;
        exp--;
    }
}
Now I get
norm = 6.8599999999999994316;
exp = 7
Answer 1:
The number you are getting is related to the machine epsilon for the double data type.
A double is 64 bits long, with 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa fraction. A double's value is given by
1.mmmmm... * (2^exp)
With only 52 bits for the mantissa, the gap between 1.0 and the next representable double is 2^-52. In binary, 1.0 + 2^-52 would be
1.000...00 + 0.000...01 = 1.000...01
Anything at or below 2^-53 is rounded away entirely and does not change the value of 1.0. You can verify for yourself that 1.0 + 2^-53 == 1.0 in a program.
This number 2^-52 = 2.22e-16 is called the machine epsilon, and it is an upper bound on the relative error introduced by round-off in a single floating-point operation on double values.
Similarly, float has 23 bits in its mantissa, so its machine epsilon is 2^-23 = 1.19e-7.
The reason you are getting 1e-15 may be that errors accumulate as you perform many arithmetic operations, but I can't say for sure because I don't know the exact calculations you are doing.
EDIT: I've looked into the relative error for your problem with 68600000.
First off, you may be interested to know that round-off error can change the result of your computation if you break it into steps:
686.0/10.0 = 68.59999999999999431566
686.0/10.0/10.0 = 6.85999999999999943157
686.0/100.0 = 6.86000000000000031974
In the first line, the closest double to 68.6 is lower than the actual value, but in the third line we see that the closest double to 6.86 is greater.
If we look at the absolute error e_abs = abs(v - v_approx) of your program, we see that it is
6.8600000 - 6.85999999999999943156581139192 ~= 5.684e-16
However, the relative error e_rel = abs((v - v_approx) / v) = e_abs / abs(v) would be
5.684e-16 / 6.86 ~= 8.286e-17
which is indeed below our machine epsilon of 2.22e-16.
If you want to know all the details of floating-point arithmetic, the famous paper to read is Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
Source: https://stackoverflow.com/questions/28846793/double-precision-error-when-converting-to-scientific-notation