double and float comparison [duplicate]

According to this post, when comparing a float and a double, the float should be treated as a double. The following program does not seem to follow this rule, and its behaviour looks quite unpredictable. Here is my program:

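#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    double a = 1.1;  // 1.5
    float b = 1.1;   // 1.5
    // %a prints hex floating-point notation; %X expects an unsigned int,
    // so passing a double to it is undefined behaviour.
    printf("%a  %a\n", a, b);
    if (a == b)
        cout << "success" << endl;
    else
        cout << "fail" << endl;
}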

When I run the above program, it displays "fail".
However, when I change a and b to 1.5, it displays "success".

I have also printed the values in hex notation. They are different in both cases. My compiler is Visual Studio 2005.

Can you explain this output? Thanks.

float f = 1.1;
double d = 1.1;
if (f == d)

In this comparison, the value of f is promoted to type double. The problem you're seeing isn't in the comparison, but in the initialization. 1.1 can't be represented exactly as a binary floating-point value, so the values stored in f and d are the nearest values that each type can represent. But float and double are different sizes, so they have different numbers of significant bits. When the value in f is promoted to double, there's no way to get back the extra bits that were lost when the value was stored, so you end up with all zeros in the extra bits. Those zero bits don't match the bits in d, so the comparison is false. The comparison succeeds with 1.5 because 1.5 can be represented exactly as both a float and a double; it has a bunch of zeros in its low bits, so when the promotion adds zeros the result is the same as the double representation.

ffhaddad

I found a decent explanation of the problem you are experiencing as well as some solutions.

See How dangerous is it to compare floating point values?

Just a side note: remember that some values cannot be represented EXACTLY in IEEE 754 floating point. Your same example using a value of, say, 1.5 would compare as you expect, because 1.5 has an exact representation with no loss of data. However, the 32-bit and 64-bit representations of 1.1 are in fact different values, because 1.1 cannot be represented exactly in IEEE 754 binary floating point.

See http://www.binaryconvert.com

double a = 1.1 --> 0x3FF199999999999A

Approximate representation = 1.10000000000000008881784197001

float  b = 1.1 --> 0x3f8ccccd

Approximate representation = 1.10000002384185791015625

As you can see, the two values are different.

Also, unless you are working in a memory-constrained environment, it's somewhat pointless to use floats. Just use doubles and save yourself the headaches.

If you are not clear on why some values cannot be accurately represented, consult a tutorial on how to convert a decimal number to floating point.

Here's one: http://class.ece.iastate.edu/arun/CprE281_F05/ieee754/ie5.html

I would regard code which directly performs a comparison between a float and a double without a typecast to be broken; even if the language spec says that the float will be implicitly converted, there are two different ways that the comparison might sensibly be performed, and neither is sufficiently dominant to really justify a "silent" default behavior (i.e. one which compiles without generating a warning). If one wants to perform a conversion by having both operands evaluated as double, I would suggest adding an explicit type cast to make one's intentions clear. In most cases other than tests to see whether a particular double->float conversion will be reversible without loss of precision, however, I suspect that comparison between float values is probably more appropriate.

Fundamentally, when comparing floating-point values X and Y of any sort, one should regard comparisons as indicating that X or Y is larger, or that the numbers are "indistinguishable". A comparison which shows X is larger should be taken to indicate that the number that Y is supposed to represent is probably smaller than X or close to X. A comparison that says the numbers are indistinguishable means exactly that. If one views things in such fashion, comparisons performed by casting to float may not be as "informative" as those done with double, but are less likely to yield results that are just plain wrong. By comparison, consider:

double x, y;
float f = x;

If one compares f and y, it's possible that what one is interested in is how y compares with the value of x rounded to a float, but it's more likely that what one really wants to know is whether, knowing only the rounded value of x, one can say anything about the relationship between x and y. If x is 0.1 and y is 0.2, f will have enough information to say whether x is larger than y; if y is 0.100000001, it will not. In the latter case, if both operands are cast to double, the comparison will erroneously imply that x was larger; if they are both cast to float, the comparison will report them as indistinguishable. Note that comparison results when casting both operands to double may be erroneous not only when values are within a part per million; they may be off by hundreds of orders of magnitude, such as if x=1e40 and y=1e300, where x is too large for a float and f typically ends up as infinity. Compare f and y as float and they'll compare indistinguishable; compare them as double and the smaller value will erroneously compare larger.

Adrian G

The rounding error occurs with 1.1 and not with 1.5 because of the number of bits required to represent a number like 0.1 in binary floating-point format. In fact, an exact representation is not possible with any finite number of bits.

See How To Represent 0.1 In Floating Point Arithmetic And Decimal for an example, particularly the answer by @paxdiablo.

Source: https://stackoverflow.com/questions/17343155/double-and-float-comparison

Tags: c++, floating-point, double, floating-point-precision