I am having a bit of trouble understanding how the precision of these doubles affects the outcome of arithmetic operations in Matlab. I thought that since both a & b are
"Floating" point means just that--the precision is relative to the scale of the number itself.
In the specific example you gave, 1.22e-45 can be represented on its own because the exponent can be adjusted to a scale of about 10^-45, which is roughly 2^-150.
On the other hand, 1.0 is represented in binary with scale 2^0 (i.e., 1).
To add these two values, you need to align their decimal points (er...binary points), meaning that all of the significant bits of 1.22e-45 are shifted about 150 bits to the right.
Of course, IEEE double precision floating point values only have 53 bits of mantissa (precision), so those shifted bits fall far below the last bit that can be kept, and at the scale of 1.0, 1.22e-45 is effectively zero.
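As a rough check of the alignment argument, here is a small sketch that uses the two-output form of log2 to read off the binary exponents of both values. The variable names (fa, ea, shift, fits, same) are only illustrative and do not come from the question.

```matlab
a = 1.22e-45;
b = 1;

% [f, e] = log2(x) splits x into f * 2^e with 0.5 <= abs(f) < 1
[fa, ea] = log2(a);   % ea is -149, i.e. a is about 1.74 * 2^-150
[fb, eb] = log2(b);   % eb is 1,    i.e. b is 1.0 * 2^0

shift = eb - ea;      % binary places a must move to line up with b: 150
fits  = shift < 53;   % false: a lies entirely below the 53-bit mantissa of b

c = a + b;
same = (c == b);      % true: the sum is exactly 1
```

Since the shift is far larger than 53, none of the bits of a survive the addition.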
64-bit IEEE-754 floating point numbers have enough precision (with a 53 bit mantissa) to represent about 16 significant decimal digits. But it would take about 45 significant decimal digits to tell the difference between (1+a) = 1.00...00122 and 1 in your example.
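To put numbers on the digit counts, a minimal sketch follows. The test values 1e-15 and 1e-17 are picked here purely for illustration, and the variable names are made up.

```matlab
% Decimal digits carried by a 53-bit mantissa
sig_digits = 53 * log10(2);        % about 15.95

% A perturbation inside those digits is visible, one beyond them is not
above = (1 + 1e-15) > 1;           % true
below = (1 + 1e-17) == 1;          % true

% Decimal place at which the leading digit of a would appear in 1 + a
a = 1.22e-45;
first_place = ceil(-log10(a));     % 45
```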
To add to what the other answers have said, you can use the MATLAB function EPS to visualize the precision issue you are running into. For a given double-precision floating-point number, the function EPS will tell you the distance from it to the next larger representable floating-point number:
>> a = 1.22e-45;
>> b = 1;
>> eps(b)
ans =
2.2204e-016
Note that the next floating point number that is larger than 1 is 1.00000000000000022204..., and the value of a doesn't even come close to half the distance between the two numbers. Hence a+b ends up staying 1.
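The "half the distance" rule can be tried out directly. This is only a sketch, assuming the default IEEE round-to-nearest behaviour; the multipliers 0.5 and 0.75 are arbitrary test points chosen for illustration.

```matlab
a = 1.22e-45;
b = 1;
gap = eps(b);                              % distance from 1 to the next larger double

steps_up  = (b + gap) > b;                 % true: a full gap reaches the next number
tie_stays = (b + 0.5*gap) == b;            % true: an exact tie rounds back to 1
rounds_up = (b + 0.75*gap) == (b + gap);   % true: past the halfway point rounds up

a_is_lost = (b + a) == b;                  % true: a is far below gap/2
ratio = a / (gap/2);                       % about 1.1e-29
```

So anything added to 1 has to exceed about 1.11e-16 before the result changes, and a is roughly 29 orders of magnitude short of that.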
Incidentally, you can also see why a is considered non-zero even though it is so small by comparing it with the smallest positive normalized double-precision floating-point value, which the function REALMIN returns:
>> realmin
ans =
2.2251e-308 %# MUCH smaller than a!
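The same comparison can be made in code. Again this is just a sketch with made-up variable names, using only realmin and eps:

```matlab
a = 1.22e-45;

is_nonzero = (a ~= 0);            % true: a is stored as a distinct value
is_normal  = (a > realmin);       % true: a is well inside the normal range
orders     = log10(a / realmin);  % about 262.7 orders of magnitude above realmin

% The spacing between doubles near a is itself tiny compared with a
gap_at_a = eps(a);                % 2^-202, about 1.56e-61
```

In other words, a has its full 53 bits of precision at its own scale. It only vanishes when it is added to something as large as 1.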