For any finite floating point value, is it guaranteed that x - x == 0?

Submitted by 谁说胖子不能爱 on 2019-12-06 22:51:43

Question


Floating point arithmetic is inexact, which is why we should rarely use strict numerical equality in comparisons. For example, in Java this prints false (as seen on ideone.com):

System.out.println(.1 + .2 == .3);
// false

Usually the correct way to compare the results of floating point calculations is to check whether the absolute difference from the expected value is less than some tolerated epsilon.

System.out.println(Math.abs(.1 + .2 - .3) < .00000000000001);
// true

The question is about whether or not some operations can yield exact results. We know that for any non-finite floating point value x (i.e. either NaN or an infinity), x - x is ALWAYS NaN.
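For instance, a quick check confirms the non-finite case; both of these print NaN:

double nan = Double.NaN;
double inf = Double.POSITIVE_INFINITY;
System.out.println(nan - nan); // NaN
System.out.println(inf - inf); // NaN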

But if x is finite, is any of this guaranteed?

  1. x * -1 == -x
  2. x - x == 0

(In particular I'm most interested in Java behavior, but discussions for other languages are also welcome.)
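As a quick sanity check of the two identities (non-exhaustive, and of course not a proof), something along these lines prints "true true" for every sample value:

double[] samples = { 0.0, -0.0, 1.0, -1.5, 0.1, Math.PI,
        Double.MIN_VALUE, Double.MIN_NORMAL, Double.MAX_VALUE, -Double.MAX_VALUE };
for (double x : samples) {
    // Both identities from the question, checked empirically.
    System.out.println((x * -1 == -x) + " " + (x - x == 0));
}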


For what it's worth, I think (and I may be wrong here) the answer is YES! I think it boils down to whether or not, for any finite IEEE-754 floating point value, its additive inverse is always computable exactly. Since e.g. float and double have one dedicated bit just for the sign, this seems to be the case: finding the additive inverse only requires flipping the sign bit (the exponent and significand are left intact).
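A small sketch of that bit-level view, using Double.doubleToRawLongBits to look at the raw IEEE-754 encoding (the value 3.75 is just an arbitrary example):

double x = 3.75;
long bitsX = Double.doubleToRawLongBits(x);
long bitsNegX = Double.doubleToRawLongBits(-x);
// XOR of the two encodings keeps only the bits that differ: just the sign bit (bit 63).
System.out.println(Long.toBinaryString(bitsX ^ bitsNegX));
// 1000000000000000000000000000000000000000000000000000000000000000
System.out.println((bitsX ^ bitsNegX) == Long.MIN_VALUE); // true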

Related questions

  • Correct Way to Obtain The Most Negative Double
  • How many double numbers are there between 0.0 and 1.0?

Answer 1:


Although x - x may give you -0 rather than true 0, -0 compares as equal to 0, so you will be safe with your assumption that any finite number minus itself will compare equal to zero.
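To make that concrete: -0.0 and 0.0 are distinct values (you can tell them apart by dividing into them), yet == treats them as equal:

System.out.println(0.0 == -0.0);   // true: the two zeros compare equal
System.out.println(1.0 / 0.0);     // Infinity
System.out.println(1.0 / -0.0);    // -Infinity, so the zeros are distinguishable
System.out.println(Double.compare(0.0, -0.0)); // positive: Double.compare considers 0.0 greater than -0.0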

See Is there a floating point value of x, for which x-x == 0 is false? for more details.




Answer 2:


Both equalities are guaranteed with IEEE 754 floating-point, because the results of both x - x and x * -1 are representable exactly as floating-point numbers of the same precision as x. In this case, regardless of the rounding mode, the exact values have to be returned by a compliant implementation.

EDIT: Comparing to the .1 + .2 example.

You can't add .1 and .2 in IEEE 754, because they cannot be represented exactly to pass to +. Addition, subtraction, multiplication, division and square root return the unique floating-point value which, depending on the rounding mode, is immediately below, immediately above, nearest with a rule to handle ties, ..., the exact result of the operation on the same arguments in ℝ. Consequently, when the result (in ℝ) happens to be representable as a floating-point number, that number is automatically the result regardless of the rounding mode.
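To contrast with the .1 + .2 example: 0.5, 0.25 and 0.75 all happen to be exactly representable as doubles, so their sum is exact, whereas 0.1 and 0.2 have already been rounded before + ever sees them:

System.out.println(0.5 + 0.25 == 0.75); // true: operands and exact sum are all representable
System.out.println(0.1 + 0.2 == 0.3);   // false: the operands are rounded approximations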

The fact that your compiler lets you write 0.1 as shorthand for a different, representable number without a warning is orthogonal to the definition of these operations. When you write - (0.1) for instance, the - is exact: it returns exactly the opposite of its argument. On the other hand, its argument is not 0.1, but the double that your compiler uses in its place.
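One way to see the double that actually stands in for the literal 0.1 is the BigDecimal(double) constructor, which preserves the exact value it is given:

System.out.println(new java.math.BigDecimal(0.1));
// 0.1000000000000000055511151231257827021181583404541015625
System.out.println(new java.math.BigDecimal(-(0.1)));
// same digits with a leading minus sign: the negation is exact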

In short, another part of the reason why the operation x * (-1) is exact is that -1 can be represented as a double.



Source: https://stackoverflow.com/questions/3599579/for-any-finite-floating-point-value-is-it-guaranteed-that-x-x-0
