I have one double, and one int64_t. I want to know if they hold exactly the same value, and if converting one type into the other does not lose any information.
My curre
OP's code has a dependency that can be avoided.
For a successful compare, `d` must be a whole number, and `round(d) == d` takes care of that. Even a NaN `d` fails that test.
`d` must be mathematically in the range [`INT64_MIN` ... `INT64_MAX`], and if the `if` conditions properly ensure that, then the final `i == (int64_t)d` completes the test.
So the question comes down to comparing the `INT64` limits with the `double` `d`.
Let us assume `FLT_RADIX == 2`, but not necessarily IEEE 754 binary64.
`d >= INT64_MIN` is not a problem, as `-INT64_MIN` is mathematically a power of 2 and `INT64_MIN` converts exactly to a `double` of the same value, so the `>=` is exact.
Code would like to do the mathematical `d <= INT64_MAX`, but that may not work, and so is a problem. `INT64_MAX` is a "power of 2 - 1" and may not convert exactly. Whether it does depends on the precision of the `double` being at least 63 bits, which renders the compare unclear. A solution is to halve the comparison: `d/2` suffers no precision loss and `INT64_MAX/2 + 1` converts exactly to a power-of-2 `double`.
    d/2 < (INT64_MAX/2 + 1)

[Edit]

    // or simply
    d < ((double)(INT64_MAX/2 + 1))*2
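To make the difference concrete, here is a small sketch contrasting the naive upper-bound test with the halved form. The function names are hypothetical, and the first spot check assumes `double` is IEEE 754 binary64 with round-to-nearest conversion, which is an assumption beyond what the answer requires.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

#if FLT_RADIX != 2
#error "this sketch assumes FLT_RADIX == 2"
#endif

/* Naive form: (double)INT64_MAX may round up to 2^63 when double has
   fewer than 63 bits of precision, so 2^63 itself slips through. */
int naiveUpperCheck(double d) {
    return d <= (double)INT64_MAX;
}

/* Halved form: INT64_MAX/2 + 1 is 2^62, which converts exactly;
   doubling it gives exactly 2^63. */
int safeUpperCheck(double d) {
    return d < ((double)(INT64_MAX / 2 + 1)) * 2;
}

/* Spot checks; the first one assumes binary64 with round-to-nearest. */
void upperCheckExamples(void) {
    assert(naiveUpperCheck(0x1p63));   /* wrongly accepted: 2^63 > INT64_MAX */
    assert(!safeUpperCheck(0x1p63));   /* correctly rejected */
    assert(safeUpperCheck(0x1p62));    /* in range */
}
```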
Thus, if code does not want to rely on the `double` having less precision than `int64_t` (a concern that likely applies with `long double`), a more portable solution would be:
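    #include <math.h>    // round()
    #include <stdint.h>  // int64_t, INT64_MIN, INT64_MAX

    // Returns nonzero when i and d hold exactly the same value.
    int int64EqualsDouble(int64_t i, double d) {
        return (d >= INT64_MIN)                     // lower bound, exact
            && (d < ((double)(INT64_MAX/2 + 1))*2)  // upper bound: (d/2 < (INT64_MAX/2 + 1))
            && (round(d) == d)                      // whole number
            && (i == (int64_t)d);                   // d is in range, so the cast is defined
    }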
Note: No rounding mode issues.
[Edit] Deeper limit explanation
Ensuring that, mathematically, `INT64_MIN <= d <= INT64_MAX` can be restated as `INT64_MIN <= d < (INT64_MAX + 1)`, as we are dealing with whole numbers. Since the raw application of `(double)(INT64_MAX + 1)` in code overflows the signed integer addition (undefined behavior) and cannot be relied on to yield 2^63, an alternative is `((double)(INT64_MAX/2 + 1))*2`. This can be extended, for rare machines whose `double` uses a radix that is a higher power of 2, to `((double)(INT64_MAX/FLT_RADIX + 1))*FLT_RADIX`. With the comparison limits being exact powers of 2, conversion to `double` suffers no precision loss and `(lo_limit <= d) && (d < hi_limit)` is exact, regardless of the precision of the floating point. Note that a rare floating point with `FLT_RADIX == 10` is still a problem.