Question
This seems basic but I am having a lot of trouble answering the following question:
Give two numbers X and Y represented in the IEEE 754 format such that computing X-Y will result in underflow.
To my understanding every operation can potentially result in underflow, but for the life of me I can't find an example for subtraction.
PLEASE HELP!!! thanks
Answer 1:
When default exception handling is in effect, a subtraction that produces a tiny (in the subnormal interval [1]) non-zero result conceptually causes an underflow exception, but there is no observable effect, because:
- A subtraction that produces a tiny result is necessarily exact, due to characteristics of the floating-point format (there are no significand bits lower than the bits in a subnormal value, and subtraction, unlike multiplication, cannot mathematically have any lower bits than there are in the inputs).
- The IEEE 754-2008 standard says that when there is underflow with default exception handling and the result is exact, no flag (including the underflow flag) is raised. And, since default exception handling is in effect, there is no trap (exceptional change of program control).
For a homework assignment, you may perform a subtraction that has a tiny result and legitimately claim that an underflow exception has occurred, even though no flag is raised and no trap occurred.
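As a concrete pair for the 64-bit format, here is a minimal sketch in Python. It assumes Python's `float` is an IEEE 754 binary64 value and that subnormal results are not flushed to zero, which is the usual default. The particular values of `x` and `y` are just one illustrative choice; the check with `Fraction` confirms that the tiny result is exact.

```python
import sys
from fractions import Fraction

# Smallest positive normal double, 2**-1022.
MIN_NORMAL = sys.float_info.min

x = 1.5 * MIN_NORMAL  # 1.5 * 2**-1022, a normal number
y = MIN_NORMAL        # 2**-1022, a normal number
d = x - y             # 2**-1023, which lies below the normal range

# "Tiny" here means non-zero with magnitude below the smallest normal number.
is_tiny = 0.0 < abs(d) < MIN_NORMAL

# Compare against exact rational arithmetic to show no rounding happened.
is_exact = Fraction(d) == Fraction(x) - Fraction(y)

print(d, is_tiny, is_exact)
```

Both inputs are normal numbers, yet their difference is subnormal and exactly representable, which is the situation described in the two bullet points above.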
To create observable effects of the underflow exception, you would need to change the handling of the underflow exception from the default to something else, such as enabling a trap when an underflow occurs. The means for doing this are language dependent.
[1] In the 32-bit binary format, a non-zero number is tiny if its magnitude is less than 2^-126. In the 64-bit binary format, a non-zero number is tiny if its magnitude is less than 2^-1022. The IEEE 754 standard permits tininess to be determined either before or after the result has been rounded to the normal significand length.
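The tininess boundaries in the footnote can be checked by looking at the biased exponent field of the encoded value: the smallest normal number has exponent field 1, and anything smaller in magnitude (but non-zero) has exponent field 0. The following is a rough sketch using Python's `struct` module; the helper names `exponent_field_32` and `exponent_field_64` are made up for this illustration.

```python
import math
import struct

def exponent_field_32(value):
    """Biased exponent field (8 bits) of value encoded as IEEE 754 binary32."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return (bits >> 23) & 0xFF

def exponent_field_64(value):
    """Biased exponent field (11 bits) of value encoded as IEEE 754 binary64."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return (bits >> 52) & 0x7FF

min_normal_32 = math.ldexp(1.0, -126)   # smallest normal binary32 value
min_normal_64 = math.ldexp(1.0, -1022)  # smallest normal binary64 value

# An exponent field of 0 with a non-zero value marks a subnormal number.
print(exponent_field_32(min_normal_32), exponent_field_32(min_normal_32 / 2))
print(exponent_field_64(min_normal_64), exponent_field_64(min_normal_64 / 2))
```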
Answer 2:
The only possibility I see for getting an underflow on subtraction is to disable denormalized numbers. If you could do that, there would be pairs of distinct doubles whose difference would be too small to represent as a non-zero double.
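Python has no standard switch for disabling subnormal numbers, so the sketch below only simulates the idea in software. It assumes that "disabling denormalized numbers" means a flush-to-zero mode in which any subnormal result is replaced by a zero of the same sign; the helper `subtract_ftz` is hypothetical and is not a real hardware or library mode.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # 2**-1022, smallest positive normal double

def subtract_ftz(a, b):
    """Hypothetical helper: a - b, with a subnormal result flushed to signed zero."""
    d = a - b
    if d != 0.0 and abs(d) < MIN_NORMAL:
        # Simulated flush-to-zero: drop the subnormal result, keep its sign.
        return math.copysign(0.0, d)
    return d

x = 1.5 * MIN_NORMAL
y = MIN_NORMAL

# x and y are distinct, and with subnormals their difference is non-zero,
# but under the simulated flush-to-zero rule the difference becomes 0.0.
print(x != y, x - y, subtract_ftz(x, y))
```

Under this simulated rule the result is no longer exact, so the underflow would be observable as a loss of the entire value.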
Source: https://stackoverflow.com/questions/19053681/causing-underflow-in-ieee-754-floating-point-format-using-subtraction