Float vs Double

-上瘾入骨i 2021-01-17 23:56

Is there ever a case where a comparison (equals()) between two floating-point values would return false if you compare them as DOUBLE?

7 Answers
  • 2021-01-18 00:07

    Would I ever get an incorrect result if I promote 2 floats to double and do a 64bit comparison rather than a 32bit comparison?

    No.

    If you start with two floats, which could be float variables (float x = foo();) or float constants (1.234234234f) then you can compare them directly, of course. If you convert them to double and then compare them then the results will be identical.

    This works because double is a superset of float. That is, every value that can be stored in a float can be stored in a double. The ranges of the exponent and the mantissa are both increased. There are billions of values that can be stored in a double but not in a float, but there are zero values that can be stored in a float but not in a double.
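    A minimal sketch of this claim, assuming IEEE 754 float and double. The function names `promotion_agrees` and `check_promotion` are made up for illustration and do not come from the answer.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Returns true when comparing a and b as float gives the same answer
   as comparing them after promotion to double. */
bool promotion_agrees(float a, float b) {
    bool as_float  = (a == b);
    bool as_double = ((double)a == (double)b);
    return as_float == as_double;
}

/* Spot-check a few pairs, including 1.0f and its nearest neighbour above. */
void check_promotion(void) {
    float one  = 1.0f;
    float next = 1.0f + FLT_EPSILON;   /* smallest float greater than 1.0f */

    assert(promotion_agrees(one, one));
    assert(promotion_agrees(one, next));
    assert(promotion_agrees(0.1f, 0.1f));
    assert(promotion_agrees(-0.0f, 0.0f));
}
```

    Because the float-to-double conversion is exact, the two comparisons cannot disagree, whichever pair of floats is passed in.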

    As discussed in my float comparison article, it can be tricky to do a meaningful comparison between float or double values, because rounding errors may have crept in. But converting both numbers from float to double does not change this. All of the mentions of epsilons (which are often but not always needed) are completely orthogonal to the question.

    On the other hand, comparing a float to a double is madness. 1.1 (a double) is not equal to 1.1f (a float) because 1.1 cannot be exactly represented in either, so each type rounds it to a different nearby value.
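    As a small illustration, assuming IEEE 754 types, the sketch below compares a float with a double directly and then at float precision. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Mixed comparison: f is implicitly promoted to double before ==. */
bool equal_as_double(float f, double d) {
    return f == d;
}

/* Narrow the double first so both sides carry float precision. */
bool equal_as_float(float f, double d) {
    return f == (float)d;
}

void check_mixed_comparison(void) {
    /* 1.1 rounds to different values in float and in double. */
    assert(!equal_as_double(1.1f, 1.1));
    /* Rounded to float, both literals land on the same value. */
    assert(equal_as_float(1.1f, 1.1));
    /* 0.5 is exactly representable in both, so nothing goes wrong. */
    assert(equal_as_double(0.5f, 0.5));
}
```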

  • 2021-01-18 00:09

    I'm perhaps not answering the OP's question, but rather responding to some more or less fuzzy advice that requires clarification.

    Comparing two floating-point values for equality is absolutely possible and can be done. Whether the type is single or double precision is often of less importance.

    Having said that, the steps leading up to the comparison itself require great care and a thorough understanding of floating-point dos and don'ts, whys and why nots.

    Consider the following C statements:

    result = a * b / c;
    result = (a * b) / c;
    result = a * (b / c);
    

    In naive floating-point programming they are often seen as "equivalent", i.e. as producing the "same" result. In the real world of floating-point they may or may not be. The first two are in fact equivalent (the second just spells out C's evaluation rules, i.e. operators of the same precedence associate left to right). The third may or may not be equivalent to the first two.

    Why is this?

    "a * b / c" or "b / c * a" may raise the "inexact" exception, i.e. an intermediate result, the final result, or both are not exactly representable in the floating-point format. If this is the case, the results will be more or less subtly different. This may or may not lead to the end results being amenable to an equality comparison. Being aware of this and single-stepping through operations one at a time, noting intermediate results, will allow the patient programmer to "beat the system", i.e. construct a quality floating-point comparison for practically any situation.
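    A concrete case, sketched under the assumption of IEEE 754 double arithmetic. The function names are invented for this example, and the `volatile` intermediates are only there so that each step is rounded to double even on targets that evaluate in extended precision.

```c
#include <assert.h>

/* (a * b) / c, with the product stored (and therefore rounded) first. */
double mul_then_div(double a, double b, double c) {
    volatile double product = a * b;
    volatile double result  = product / c;
    return result;
}

/* a * (b / c), with the quotient stored (and therefore rounded) first. */
double div_then_mul(double a, double b, double c) {
    volatile double quotient = b / c;
    volatile double result   = a * quotient;
    return result;
}

void check_evaluation_order(void) {
    /* 0.1 * 3.0 is inexact and rounds up; dividing by 3.0 does not undo it. */
    assert(mul_then_div(0.1, 3.0, 3.0) != 0.1);
    /* 3.0 / 3.0 is exactly 1.0, so the product is exactly the double 0.1. */
    assert(div_then_mul(0.1, 3.0, 3.0) == 0.1);
}
```

    Both expressions are mathematically 0.1, yet only the second one compares equal to the literal, because only its intermediate result is exact.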

    For everyone else, passing over the equality comparison for floating-point numbers is good, solid advice.

    It's really a bit ironic because most programmers know that integer math results in predictable truncations in various situations. When it comes to floating-point almost everyone is more or less thunderstruck that results are not exact. Go figure.

  • 2021-01-18 00:10

    If you're converting doubles to floats and the difference between them is beyond the precision of the float type, you can run into trouble.

    For example, say you have the two double values:

    9.876543210
    9.876543211
    

    and that the precision of a float was only six decimal digits. That would mean that both float values would be 9.87654, hence equal, even though the double values themselves are not equal.

    However, if you're talking about floats being cast to doubles, then identical floats will give you identical doubles. If the floats are different, the doubles will be distinct as well, because the conversion from float to double is exact.
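    The narrowing case can be checked with the two values from this answer. This is a sketch assuming IEEE 754 types (where float really has about seven decimal digits rather than six), and the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>

/* True when x and y differ as doubles but become equal once narrowed to float. */
bool doubles_collide_as_float(double x, double y) {
    return x != y && (float)x == (float)y;
}

void check_narrowing(void) {
    /* The values differ by 1e-9; floats near 9.87 are about 9.5e-7 apart. */
    assert(doubles_collide_as_float(9.876543210, 9.876543211));
    /* Values that are far apart stay distinct after narrowing. */
    assert(!doubles_collide_as_float(1.0, 2.0));
}
```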

  • 2021-01-18 00:11

    As long as you are not mixing promoted floats and natively calculated doubles in your comparison you should be ok, but take care:

    Comparing floats (or doubles) for equality is difficult - see this lengthy but excellent discussion.

    Here are some highlights:

    1. You can't use ==, because of problems with the limited precision of floating point formats

    2. float(0.1) and double(0.1) are different values (0.100000001490116119384765625 and 0.1000000000000000055511151231257827021181583404541015625, respectively). In your case, this means that comparing two floats (by converting to double) will probably be ok, but be careful if you want to compare a float with a double.

    3. It's common to use an epsilon, a small tolerance value, for the comparison (floats a and b are considered equal if fabs(a - b) < epsilon). In C, float.h defines FLT_EPSILON, the gap between 1.0f and the next representable float, which is often used as a starting point. However, a fixed epsilon like this doesn't work where a and b are both very small, or both very large.

    4. You can address this by using a scaled-relative-to-the-sizes-of-a-and-b epsilon, but this breaks down in some cases (like comparisons to zero).

    5. You can compare the integer representations of the floating-point numbers to find out how many representable floats there are between them. This is called the ULP difference, for "Units in the Last Place" difference. (Java's Float.equals() also works on the integer bit patterns, via floatToIntBits(), but it only tests whether they are identical; it does not measure a ULP distance.) It's generally good, but also breaks down when comparing against zero.
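    The epsilon and ULP ideas from the list above can be sketched as follows. This assumes 32-bit IEEE 754 floats; the function names are hypothetical, and `ulp_distance` is deliberately limited to finite values of the same sign.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Absolute tolerance: equal if |a - b| <= eps. */
bool close_abs(float a, float b, float eps) {
    return fabsf(a - b) <= eps;
}

/* Relative tolerance: equal if |a - b| <= rel * max(|a|, |b|). */
bool close_rel(float a, float b, float rel) {
    float fa = fabsf(a), fb = fabsf(b);
    float largest = (fa > fb) ? fa : fb;
    return fabsf(a - b) <= rel * largest;
}

/* Distance in ULPs: how many steps through the representable floats
   separate a and b. Assumes both are finite and have the same sign. */
int32_t ulp_distance(float a, float b) {
    int32_t ia, ib;
    memcpy(&ia, &a, sizeof ia);   /* copy the bits; avoids aliasing problems */
    memcpy(&ib, &b, sizeof ib);
    int32_t d = ia - ib;
    return d < 0 ? -d : d;
}

void check_tolerances(void) {
    float big      = 1000000.0f;
    float big_next = 1000000.0625f;   /* adjacent float: spacing here is 2^-4 */

    assert(ulp_distance(big, big_next) == 1);
    assert(!close_abs(big, big_next, FLT_EPSILON));   /* fixed epsilon too strict */
    assert(close_rel(big, big_next, FLT_EPSILON));

    /* Tiny values: a fixed epsilon calls them equal although one is twice the other. */
    assert(close_abs(1e-10f, 2e-10f, FLT_EPSILON));
    assert(!close_rel(1e-10f, 2e-10f, FLT_EPSILON));

    /* Against zero, the relative test rejects every non-zero value. */
    assert(!close_rel(0.0f, 1e-30f, FLT_EPSILON));
}
```

    The checks show each method failing where the list says it does: the fixed epsilon at very large and very small magnitudes, and the relative test at zero.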

    The article concludes:

    Know what you’re doing

    There is no silver bullet. You have to choose wisely.

    • If you are comparing against zero, then relative epsilons and ULPs based comparisons are usually meaningless. You’ll need to use an absolute epsilon, whose value might be some small multiple of FLT_EPSILON and the inputs to your calculation. Maybe.
    • If you are comparing against a non-zero number then relative epsilons or ULPs based comparisons are probably what you want. You’ll probably want some small multiple of FLT_EPSILON for your relative epsilon, or some small number of ULPs. An absolute epsilon could be used if you knew exactly what number you were comparing against.
    • If you are comparing two arbitrary numbers that could be zero or non-zero then you need the kitchen sink. Good luck and God speed.
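    One common way to approximate the "kitchen sink" is to try an absolute tolerance first and fall back to a relative one. The sketch below is only one possible combination, not the article's code; the name `nearly_equal` and the tolerance values used in the checks are assumptions that would need tuning for a real calculation.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Equal if the values are within abs_eps of each other (covers results
   near zero) or within rel_eps of the larger magnitude (covers the rest). */
bool nearly_equal(float a, float b, float abs_eps, float rel_eps) {
    float diff = fabsf(a - b);
    if (diff <= abs_eps) {
        return true;
    }
    float fa = fabsf(a), fb = fabsf(b);
    float largest = (fa > fb) ? fa : fb;
    return diff <= rel_eps * largest;
}

void check_nearly_equal(void) {
    const float abs_eps = 1e-6f;              /* hypothetical absolute tolerance */
    const float rel_eps = 4.0f * FLT_EPSILON; /* hypothetical relative tolerance */

    assert(nearly_equal(0.0f, 1e-9f, abs_eps, rel_eps));               /* near zero */
    assert(nearly_equal(1000000.0f, 1000000.0625f, abs_eps, rel_eps)); /* adjacent floats */
    assert(!nearly_equal(1.0f, 1.001f, abs_eps, rel_eps));             /* clearly different */
}
```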

    So, to answer your question:

    • If you are downgrading doubles to floats, then you might lose precision, and incorrectly report two different doubles as equal (as paxdiablo points out.)
    • If you are upgrading identical floats to double, then the added precision won't be a problem unless you are comparing a float with a double. (Say you have 1.234 in a float with only 4 decimal digits of accuracy; then the double 1.2345 MIGHT represent the same value as the float. In this case you'd probably be better off doing the comparison at the precision of the float or, more generally, at the error level of the least accurate representation in the comparison.)
    • If you know the number you'll be comparing with, you can follow the advice quoted above.
    • If you're comparing arbitrary numbers (which could be zero or non-zero), there's no way to compare them correctly in all cases - pick one comparison and know its limitations.

    A couple of practical considerations (since this sounds like it's for an assignment):

    • The epsilon comparison mentioned by most is probably fine (but include a discussion of the limitations in the write up). If you're ever planning to compare doubles to floats, try to do it in float, but if not, try to do all comparisons in double. Even better, just use doubles everywhere.

    • If you want to totally ace the assignment, include a write-up of the issues when comparing floats and the rationale for why you chose any particular comparison method.

  • 2021-01-18 00:18

    For the comparison between a float f and a double d, you can calculate the difference of f and d. If abs(f-d) is less than some threshold, you can treat them as equal. The threshold can be either absolute or relative, depending on your application's requirements. There are some good solutions here. I hope this helps.
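    A minimal sketch of the relative-threshold variant, assuming IEEE 754 types. The name `float_matches_double` is hypothetical, and using FLT_EPSILON as the tolerance is one plausible choice for the precision of the less accurate operand, not a universal rule.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Equal if |f - d| <= rel_tol * max(|f|, |d|), computed in double. */
bool float_matches_double(float f, double d, double rel_tol) {
    double wide = (double)f;                 /* exact conversion */
    double diff = fabs(wide - d);
    double scale = (fabs(wide) > fabs(d)) ? fabs(wide) : fabs(d);
    return diff <= rel_tol * scale;
}

void check_float_matches_double(void) {
    /* Tolerance at float precision: the two roundings of 1.1 match. */
    assert(float_matches_double(1.1f, 1.1, FLT_EPSILON));
    /* Tolerance at double precision: the float's rounding error shows. */
    assert(!float_matches_double(1.1f, 1.1, DBL_EPSILON));
    /* Genuinely different values are rejected either way. */
    assert(!float_matches_double(1.1f, 1.2, FLT_EPSILON));
}
```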

  • 2021-01-18 00:21

    I don't understand why you're doing this at all. The == operator already caters for all possible types on both sides, with extensive rules on type coercion and widening which are already specified in the relevant language standards. All you have to do is use it.
