Practices to limit floating point accuracy problems

执笔经年 2021-01-22 10:24

As programmers, most (if not all) of us know that floating point numbers have a tendency to not be very accurate. I know that this problem can't be avoided entirely, but I am wondering what practices can help limit the problems caused by floating point inaccuracy.

2 Answers
  • 2021-01-22 10:42

    Use fixed-point arithmetic where you can get by with a known, limited precision.

    As an example, the Rockbox music player firmware uses almost entirely fixed-point media codecs.
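
    A minimal sketch of the idea in Python, using integers to count cents (the cent scale here is just an assumed example):

    # 0.1 and 0.2 have no exact binary representation, so the error is immediate:
    print(0.1 + 0.2 == 0.3)                    # False

    # Fixed point: keep amounts as integer cents, convert only for display.
    price_cents = 10 + 20                      # 10 cents + 20 cents
    print(price_cents == 30)                   # True
    print(f"total: {price_cents / 100:.2f}")   # total: 0.30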

    If you must be perfectly accurate, use an arbitrary-precision type such as those provided by the GMP library.
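
    GMP itself is a C library; as a rough stand-in, Python's built-in fractions module (or the gmpy2 package, which wraps GMP) gives the same kind of exact arithmetic. A small sketch:

    from fractions import Fraction

    # Exact rational arithmetic: one third times three really is one.
    print(Fraction(1, 3) * 3 == 1)                                 # True
    print(Fraction("0.1") + Fraction("0.2") == Fraction("0.3"))    # True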

    If you're just trying to cut down on your errors, try to work as close to zero as possible, where IEEE floating point values are most densely spaced. Reorder your operations to avoid letting your intermediate absolute values grow too large.
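
    As a small illustration (assuming ordinary IEEE doubles), feeding tiny values into an already-huge accumulator throws their contribution away, while summing the tiny values first preserves it:

    big = 1e16
    small = [1.0] * 10

    # At this magnitude adjacent doubles are 2.0 apart, so each
    # individual +1.0 rounds straight back to big and is lost.
    acc = big
    for x in small:
        acc += x
    print(acc == big)                # True -- the ten 1.0s vanished

    # Summing the small values first keeps their contribution
    # (math.fsum(small + [big]) would also get this right).
    print(big + sum(small) - big)    # 10.0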

  • 2021-01-22 10:57

    Floating point accuracy is a large subject, and some of the brightest computer scientists have been working on it for many years. If you haven't studied floating point accuracy, haven't thoroughly analyzed your numerical problem, or can't rely on teammates who fully understand it, just stick with doubles rather than 32-bit floats, unless you're doing computer graphics or the project specifically calls for singles.

    Some operations, such as multiplication, are fairly insensitive to how you group them. For example, using Python:

    >>> a = 1.0167   # 'a' is assumed here; the original transcript does not show its definition
    >>> a*a*a*a*a*a
    1.1044776737696922
    >>> (a*a*a)*(a*a*a)
    1.104477673769692
    >>> (a*a)*(a*a)*(a*a)
    1.104477673769692

    The results agree to within one unit in the last place because the exponents are added exactly, and each multiplication of the mantissas (the 1.fraction... parts) loses at most a tiny rounding error.

    On the other hand, if we perform the subtraction and multiplication in the wrong order, we can get measurably different results. Algebraically, b*(b-1) and b*b - b are identical, but in floating point they are not:

    >>> b = 1.00016789
    >>> b*(b-1)
    0.00016791818705204833
    >>> b*b - b
    0.00016791818705197414

    Even though this looks fine, if you look closely you'll see that only about 11 decimal digits are correct. To view it another way, ((b*(b-1)) - (b*b - b))/b should be zero algebraically, but it comes out to 7.417408056593443e-17. That may seem like a small error, but floating point errors like this tend to accumulate. Had we used a single-precision float (float b = 1.00016789; in C syntax), the problem would be much worse: after even this small set of operations you would have only a few reliable decimal digits left.
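
    A rough sketch of that last point, round-tripping b through IEEE single precision with Python's struct module (the numbers in the comments are approximate):

    import struct

    b = 1.00016789

    # Pack/unpack with format 'f' rounds b to the nearest 32-bit float.
    b32 = struct.unpack('f', struct.pack('f', b))[0]

    print(b)      # 1.00016789
    print(b32)    # roughly 1.0001678 -- only about 7 significant digits survive
    # The cancellation in b*b - b then wipes out the leading "1.000...",
    # costing roughly four more digits, so only a few reliable ones remain.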
