While investigating floating-point exception status flags, I came across the curious case of the status flag FE_UNDERFLOW being set when not expected.
Although not intended as a self answer, input from various commenters @John Bollinger, @nwellnhof and further research leads to:
Can the floating-point status flag FE_UNDERFLOW be set when the result is not sub-normal?
Yes, in narrow situations. See below.
"Underflow" occurs when:
The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type. C11 7.12.1 Treatment of error conditions
The z = y/2; above is 1) inexact (due to rounding) and 2) may be considered "too small".
Math
The z = y/2; can be thought of as going through 2 stages: dividing and rounding. The mathematical quotient, with unlimited precision, is less than the smallest normal number FLT_MIN and more than the greatest sub-normal number nextafterf(FLT_MIN,0). Depending on the rounding mode, the final answer is one of those two. With FE_TONEAREST, z is assigned FLT_MIN, a normal number.
Spec
The C spec below and IEC 60559 indicate:
The "underflow" floating-point exception is raised whenever a result is tiny (essentially subnormal or zero) and suffers loss of accuracy. [358] C11 §F.10 7
[358] IEC 60559 allows different definitions of underflow. They all result in the same values, but differ on when the floating-point exception is raised.
and
Two definitions were allowed for the determination of the 'tiny' condition: before or after rounding the infinitely precise result to working precision, with unbounded exponent.
Annex U of 754r recommended that only tininess after rounding and inexact as loss of accuracy be a cause for underflow signal. wiki reference
(My emphasis)
Q & A
1 If __STDC_IEC_559__ is defined, what is the correct state of underflow in this case?
The underflow flag may be set or left alone in this case. Either complies. There is a preference though, for not setting the underflow flag.
2 Lacking a defined __STDC_IEC_559__, am I stuck with "Implementations that do not define __STDC_IEC_559__ are not required to conform to these specifications." C11, or is there some C specification that indicates this result is incorrect?
Setting the underflow flag is not incorrect. The FP spec allows this behavior. It also allows the underflow flag not to be set.
3 Since this is certainly a result of my hardware (processor), your result may differ and that would be interesting to know.
On another platform, where __STDC_IEC_559__ is not defined and FLT_EVAL_METHOD = 0, the FE_INEXACT and FE_UNDERFLOW flags were both set, just like in the above test case. The issue applies to float, double and long double.
If the mathematical answer lies in the grey "Between" zone below, it will be rounded down to a sub-normal double or up to the normal double DBL_MIN, depending on its value and the rounding mode. If rounded down, then FE_UNDERFLOW is certainly set. If rounded up, then FE_UNDERFLOW may or may not be set, depending on when the determination of the 'tiny' condition is applied.