Context: double-double arithmetic
“Double-double” is a representation of numbers as the sum of two double-precision numbers without overlap in the significands. This representation takes advantage of existing double-precision hardware implementations for “near quadruple-precision” computations.
One typical low-level C function in a double-double implementation may take two double-precision numbers a
and b
with |a| ≥ |b|
and compute the double-double number (s, e)
that represents their sum:
s = a + b;
e = b - (s - a);
(Adapted from this article.)
These implementations typically assume round-to-nearest-even mode.
In the above computation, (s, e)
is a normalized double-double only because of this assumption. Without it, with a == 0x1.0p60
, b == 1
, in round-upward mode, s
is computed as 0x1.0000000000001p60
and e
a bit above -0x0.0000000000001p60
. Their sum is equal to the mathematical sum of a
and b
but their significands overlap.
Take a == 0x1.0p120
and the mathematical sums of a
and b
on the one hand and s
and e
on the other hand do not even coincide any more.
Question
Is there a way to build a double-double-like library with the same properties that a typical double-double library has in round-to-nearest-even (that is, relatively fast and relatively accurate), but that works whatever the rounding mode happens to be?
Does such a library already exist?
More general context: correctly rounded elementary functions
Implementations of the double-double sort are used for intermediate computations in the implementation of libraries of correctly rounded elementary functions. As a result, libraries implemented this way tend to fail spectacularly when a function is called while the FPU is not in round-to-nearest-even mode. Changing the rounding mode inside the function is not very palatable, for performance reasons and because a signal arriving while the function is executing would leave the FPU in round-to-nearest-even mode. The simplest way I see to have fast, correctly rounded elementary functions that work in any rounding mode would be if one could somehow rely on a double-double-kind of arithmetic that worked in any rounding mode.
The article referred to by njuffa offers the function below, with very similar notations to those of my question, except that what is denoted fl (a+b)
there is simply denoted a+b
in my question:
Two−Sum−toward−zero2 (a, b)
if (|a| < |b|)
swap (a , b)
s = fl (a + b)
d = fl (s − a)
e = fl (b − d)
if(|2 ∗ b|<|d|)
s = a, e = b
return (s, e)
This is a very neat fix for this particular elementary computation when in round-toward-zero mode. It makes one hope that something would be possible for implementing a correctly rounded elementary function, at the very least by testing the rounding mode early and picking separate algorithms, or perhaps by writing very careful code that works for all rounding modes.
来源:https://stackoverflow.com/questions/15853317/double-double-implementation-resilient-to-fpu-rounding-mode