If operator< works properly for floating-point types, why can't we use it for equality testing?

借酒劲吻你 2021-01-01 16:55

Properly testing two floating-point numbers for equality is something that a lot of people, including me, don't fully understand. Today, however, I thought about how some s

5 answers
  •  走了就别回头了
    2021-01-01 17:12

    When using floating-point numbers, the relational operators have meanings, but their meanings don't necessarily align with how actual numbers behave.

    If floating-point values are used to represent actual numbers (their normal purpose), the operators tend to behave as follows:

    • x > y and x >= y both imply that the numeric quantity which x is supposed to represent is likely greater than y, and at worst probably not much less than y.

    • x < y and x <= y both imply that the numeric quantity which x is supposed to represent is likely less than y, and is at worst probably not much greater than y.

    • x == y implies that the numeric quantities which x and y represent are indistinguishable from each other.

    Note that if x is of type float, and y is of type double, the above meanings will be achieved if the double argument is cast to float. In the absence of a specific cast, however, C and C++ (and also many other languages) will convert a float operand to double before performing a comparison. Such conversion will greatly reduce the likelihood that the operands will be reported "indistinguishable", but will greatly increase the likelihood that the comparison will yield a result contrary to what the intended numbers actually indicate. Consider, for example,

    float f = 16777217;
    double d = 16777216.5;
    

    If both operands are cast to float, the comparison will indicate that the values are indistinguishable. If they are cast to double, the comparison will indicate that d is larger even though the value f is supposed to represent is slightly bigger. As a more extreme example:

    float f = 1E20f;
    float f2 = f*f;
    double d = 1E150;
    double d2 = d*d;
    

    Float f2 contains the best float representation of 1E40, which is positive infinity because 1E40 exceeds the range of float. Double d2 contains the best double representation of 1E300. The numerical quantity represented by d2 is hundreds of orders of magnitude greater than that represented by f2, but (double)f2 > d2. By contrast, converting both operands to float would yield f2 == (float)d2, correctly reporting that the values are indistinguishable.
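Both examples can be checked with a short sketch. It assumes IEEE 754 float and double (enforced by the static_assert), so that overflow and out-of-range narrowing produce infinity. The helper names less_after_widening, equal_after_narrowing, first_example_holds and second_example_holds are made up for illustration and do not come from the answer.

```cpp
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
              std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 float and double");

// Hypothetical helper: what a bare `f < d` does in C and C++.
// The float operand is widened to double before the comparison.
bool less_after_widening(float f, double d) {
    return static_cast<double>(f) < d;
}

// Hypothetical helper: compare at float precision by narrowing the double.
bool equal_after_narrowing(float f, double d) {
    return f == static_cast<float>(d);
}

// First example: 16777217 is 2^24 + 1, which float cannot represent,
// so f actually holds 16777216.
bool first_example_holds() {
    float f = static_cast<float>(16777217);
    double d = 16777216.5;
    // Indistinguishable as floats, yet f compares smaller after widening.
    return equal_after_narrowing(f, d) && less_after_widening(f, d);
}

// Second example: f2 overflows float, while d2 stays finite in double.
bool second_example_holds() {
    float f = 1E20f;
    float f2 = f * f;    // 1E40 is out of float range: +infinity
    double d = 1E150;
    double d2 = d * d;   // about 1E300, finite in double
    // Widening makes f2 look larger; narrowing reports them as equal.
    return static_cast<double>(f2) > d2 && equal_after_narrowing(f2, d2);
}
```

On such a platform both first_example_holds() and second_example_holds() should return true, matching the behaviour described above.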

    PS--I am well aware that IEEE standards require that calculations be performed as though floating-point values represent precise power-of-two fractions, but few people see the code float f2 = f1 / 10.0; as meaning "Set f2 to the representable power-of-two fraction which is closest to being one tenth of the one in f1". The purpose of the code is to make f2 be a tenth of f1. Because of imprecision, the code cannot fulfill that purpose perfectly, but in most cases it's more helpful to regard floating-point numbers as representing actual numerical quantities than to regard them as power-of-two fractions.
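To see how far the stored result can sit from the intended tenth, here is a minimal sketch, again assuming IEEE 754 types. The functions tenth_of and stored_tenth_error are hypothetical names used only for this illustration. The double quotient serves as a stand-in for the real-number tenth, so the reported error is itself only approximate.

```cpp
// Hypothetical helper mirroring `float f2 = f1 / 10.0;` from the answer.
// The division is carried out in double, then rounded to float.
float tenth_of(float f1) {
    return static_cast<float>(f1 / 10.0);
}

// Hypothetical helper: gap between the stored float and the double
// quotient, which approximates the intended real-number tenth.
double stored_tenth_error(float f1) {
    return static_cast<double>(tenth_of(f1)) - static_cast<double>(f1) / 10.0;
}
```

For example, stored_tenth_error(1.0f) is small but nonzero, because one tenth has no exact binary representation, while stored_tenth_error(5.0f) is exactly zero, because 0.5 is a power of two.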
