C++ handling of excess precision

喜你入骨 提交于 2019-12-28 05:35:26

问题


I'm currently looking at code which does multi-precision floating-point arithmetic. To work correctly, that code requires values to be reduced to their final precision at well-defined points. So even if an intermediate result was computed to an 80 bit extended precision floating point register, at some point it has to be rounded to 64 bit double for subsequent operations.

The code uses a macro INEXACT to describe this requirement, but doesn't have a perfect definition. The gcc manual mentions -fexcess-precision=standard as a way to force well-defined precision for cast and assignment operations. However, it also writes:

‘-fexcess-precision=standard’ is not implemented for languages other than C

Now I'm thinking about porting those ideas to C++ (comments welcome if anyone knows an existing implementation). So it seems I can't use that switch for C++. But what is the g++ default behavior in absence of any switch? Are there more C++-like ways to control the handling of excess precision?

I guess that for my current use case, I'll probably use -mfpmath=sse in any case, which should not incur any excess precision as far as I know. But I'm still curious.


回答1:


Are there more C++-like ways to control the handling of excess precision?

The C99 standard defines FLT_EVAL_METHOD, a compiler-set macro that defines how excess precision should happen in a C program (many C compilers still behave in a way that does not exactly conform to the most reasonable interpretation of the value of FP_EVAL_METHOD that they define: older GCC versions generating 387 code, Clang when generating 387 code, …). Subtle points in relation with the effects of FLT_EVAL_METHOD were clarified in the C11 standard.

Since the 2011 standard, C++ defers to C99 for the definition of FLT_EVAL_METHOD (header cfloat).

So GCC should simply allow -fexcess-precision=standard for C++, and hopefully it eventually will. The same semantics as that of C are already in the C++ standard, they only need to be implemented in C++ compilers.


I guess that for my current use case, I'll probably use -mfpmath=sse in any case, which should not incur any excess precision as far as I know.

That is the usual solution.

Be aware that C99 also defines FP_CONTRACT in math.h that you may want to look at: it relates to the same problem of some expressions being computed at a higher precision, striking from a completely different side (the modern fused-multiply-add instruction instead of the old 387 instruction set). This is a pragma to decide whether the compiler is allowed to replace source-level additions and multiplications with FMA instructions (this has the effect that the multiplication is virtually computed at infinite precision, because this is how this instruction works, instead of being rounded to the precision of the type as it would be with separate multiplication and addition instructions). This pragma has apparently not been incorporated in the C++ standard (as far as I can see).

The default value for this option is implementation-defined and some people argue for the default to be to allow FMA instructions to be generated (for C compilers that otherwise define FLT_EVAL_METHOD as 0). You should, in C, future-proof your code with:

#include <math.h>
#pragma STDC FP_CONTRACT off

And the equivalent incantation in C++ if your compiler documents one.


what is the g++ default behavior in absence of any switch?

I am afraid that the answer to this question is that GCC's behavior, say, when generating 387 code, is nonsensical. See the description of the situation that motivated Joseph Myers to fix the situation for C. If g++ does not implement -fexcess-precision=standard, it probably means that 80-bit computations are randomly rounded to the precision of the type when the compiler happened to have to spill some floating-point registers to memory, leading the program below to print "foo" in some circumstances outside the programmer's control:

if (x == 0.0) return;
... // code that does not modify x
if (x == 0.0) printf("foo\n");

… because the code in the ellipsis caused x, that was held in an 80-bit floating-point register, to be spilt to a 64-bit slot on the stack.




回答2:


But what is the g++ default behavior in absence of any switch?

I found one answer myself via an experiment, using the following code:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
  double a = atof("1.2345678");
  double b = a*a;
  printf("%.20e\n", b - 1.52415765279683990130);
  return 0;
}

If b is rounded (-fexcess-precision=standard), then the result is zero. Otherwise (-fexcess-precision=fast) it is something like 8e-17. Compiling with -mfpmath=387 -O3, I could reproduce both cases for gcc-4.8.2. For g++-4.8.2 I get an error for -fexcess-precision=standard if I try that, and without a flag I get the same behavior as -fexcess-precision=fast gives for C. Adding -std=c++11 does not help. So now the suspicion already voiced by Pascal is official: g++ does not necessarily round everywhere it should.



来源:https://stackoverflow.com/questions/20869904/c-handling-of-excess-precision

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!