Floating-point optimizations - guideline

一个人的身影 2021-02-02 00:08

The majority of scientific computing problems that we need to solve by implementing a particular algorithm in C/C++ demand accuracy much lower than double precision. For

1 answer
  •  故里飘歌
    2021-02-02 00:58

    Compiler makers justify -ffast-math-style optimizations by asserting that their effect on numerically stable algorithms is minimal.

    Therefore, if you want to write code that is robust against these optimizations, a sufficient condition is to write only numerically stable code.

    Now your question may be, “How do I write numerically stable code?” This is where your question may be a bit broad: there are entire books dedicated to the subject. The Wikipedia page I already linked to has a good example, and here is another good one. I cannot recommend a book in particular, as this is not my area of expertise.

    Note 1: The desirability of numerical stability goes beyond compiler optimization. If you have the choice, write numerically stable code even if you do not plan to use -ffast-math-style optimizations. Numerically unstable code may produce wrong results even when compiled with strict IEEE 754 floating-point semantics.

    Note 2: You cannot expect external libraries to work when compiled with -ffast-math-style flags. These libraries, written by floating-point experts, may need to play subtle tricks with the properties of IEEE 754 computations. -ffast-math optimizations may break such tricks, yet the tricks improve performance more than you could expect the compiler to, even if you let it. For floating-point computations, an expert with domain knowledge beats the compiler every time. One example among many is the triple-double implementation found in CRlibm. This code breaks if it is not compiled with strict IEEE 754 semantics. Another, more elementary algorithm that compiler optimizations break is Kahan summation: when compiled with unsafe optimizations, c = (t - sum) - y is optimized to c = 0. This, of course, defeats the purpose of the algorithm completely.
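
For reference, a minimal C sketch of Kahan summation, showing where the expression c = (t - sum) - y sits. The function names `naive_sum` and `kahan_sum` are illustrative choices of mine, and the code is a sketch rather than a tuned implementation. It relies on strict IEEE 754 evaluation. In exact real arithmetic the compensation term is always zero, which is why a compiler that is allowed to reassociate floating-point operations may remove it.

```c
#include <stddef.h>

/* Plain left-to-right summation, for comparison (illustrative name). */
double naive_sum(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Kahan (compensated) summation (illustrative name). */
double kahan_sum(const double *x, size_t n)
{
    double sum = 0.0;
    double c = 0.0;              /* running estimate of the rounding error */
    for (size_t i = 0; i < n; i++) {
        double y = x[i] - c;     /* apply the correction from the last step */
        double t = sum + y;      /* low-order bits of y may be lost here */
        c = (t - sum) - y;       /* recover what was lost; algebraically 0 */
        sum = t;
    }
    return sum;
}
```

A quick way to see the difference is to sum 1.0 followed by many values around 1e-16: each small term is too small to change a running sum near 1.0 on its own, so the naive loop tends to drop them, whereas the compensated loop carries them forward in c.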
