How to use GCC 4.6.0 libquadmath and __float128 on x86 and x86_64

野的像风 2021-02-01 22:48

I have a medium-sized C99 program which uses the long double type (80-bit) for floating-point computation. I want to improve precision with the new GCC 4.6 extension __fl

1 answer
    北恋 (original poster)
    2021-02-01 23:24

    How should I convert my program from the classic 80-bit long double to 128-bit quad floats with software emulation of full precision? What do I need to change: compiler flags, sources?

    You need recent software: a GCC version with support for the __float128 type (4.6 or newer) and libquadmath (supported only on x86 and x86_64 targets; on IA64 and HPPA with newer GCC). You should add the linker flag -lquadmath (the linker error "cannot find -lquadmath" shows that libquadmath is not installed).

    • Add #include <quadmath.h> to get the macro and function definitions.
    • You should modify all long double variable definitions to __float128.
    • Complex variables may be changed to __complex128 type (quadmath.h) or directly with typedef _Complex float __attribute__((mode(TC))) _Complex128;
    • All simple arithmetic operations are automatically handled by GCC (converted to calls of helper functions like __*tf3()).
    • If you use any macros like LDBL_*, replace them with FLT128_* (full list http://gcc.gnu.org/onlinedocs/libquadmath/Typedef-and-constants.html#Typedef-and-constants)
    • If you need some specific constants like pi (M_PI) or e (M_E) with quadruple precision, use predefined constants with q suffix (M_*q), like M_PIq and M_Eq (full list http://gcc.gnu.org/onlinedocs/libquadmath/Typedef-and-constants.html#Typedef-and-constants)
    • User-defined constants may be written with Q suffix, like 1.3000011111111Q
    • All math function calls should be replaced with *q versions, like sqrtq(), sinq() (full list http://gcc.gnu.org/onlinedocs/libquadmath/Math-Library-Routines.html#Math-Library-Routines)
    • Reading a quad float from a string should be done with __float128 strtoflt128 (const char *s, char **sp) - http://gcc.gnu.org/onlinedocs/libquadmath/strtoflt128.html#strtoflt128 (Warning: older versions of libquadmath may have bugs in strtoflt128, so double-check the results)
    • Printing a __float128 is done with the help of the quadmath_snprintf function. On Linux distributions with a recent glibc, libquadmath may also register the function automatically to handle the Q (possibly also q) length modifier of the a, A, e, E, f, F, g, G conversion specifiers in all printf/sprintf calls, the way L works for long double; test this on your system before relying on it. Example: printf ("%Qe", 1.2Q), http://gcc.gnu.org/onlinedocs/libquadmath/quadmath_005fsnprintf.html#quadmath_005fsnprintf

    You should also know that, since 4.6, Gfortran uses the __float128 type for DOUBLE PRECISION if the option -fdefault-real-8 is given and the option -fdefault-double-8 is not. This may be a problem, since 128-bit floating point is much slower than the standard long double on many platforms because it is computed in software. (Thanks to the post by glennklockwood http://glennklockwood.blogspot.com/2014/02/linux-perf-libquadmath-and-gfortrans.html)
