sizeof long double and precision not matching?

Consider the following C code:

#include <stdio.h>
int main(int argc, char* argv[]) 
{
    const long double ld = 0.12345678901234567890123456789012345L;
    printf("%zu %.36Lf\n", sizeof(ld), ld);  /* %zu is the portable conversion for size_t */
    return 0;
}

Compiled with gcc 4.8.1 under Ubuntu x64 13.04, it prints:

16 0.123456789012345678901321800735590983

This tells me that a long double occupies 16 bytes, but the decimals seem to be correct only up to about the 20th place. How is that possible? 16 bytes corresponds to a quad, and a quad would give me between 33 and 36 decimal digits.

The long double format in your C implementation uses an Intel format with a one-bit sign, a 15-bit exponent, and a 64-bit significand (ten bytes total). The compiler allocates 16 bytes for it, which wastes space but helps with things such as alignment. However, the 64 bits provide only log₁₀(2⁶⁴) ≈ 19.3 decimal digits of significance, which is why the output is correct only to about 20 digits.

C implementations may differ in the range and precision of long double. The result of sizeof hints at the underlying floating-point format, but does not specify it. A long double is not required to have 33 to 36 decimal digits. It could even have exactly the same representation as a double.

To avoid hard-coding the precision while still using all of the available precision without overdoing it, I recommend:

/* needs <float.h> for LDBL_DIG, <math.h> for nextafterl, <stdio.h> for printf */
const long double ld = 0.12345678901234567890123456789012345L;
printf("%.*Le\n", LDBL_DIG + 3, ld);
printf("%.*Le\n", LDBL_DIG + 3, nextafterl(ld, ld*2));

On my machine (Eclipse, Intel 64-bit) this prints the following; your output may differ:

1.234567890123456789013e-01
1.234567890123456789081e-01

[Edit]

On review, +2 is sufficient. Better still, use LDBL_DECIMAL_DIG. See "Printf width specifier to maintain precision of floating-point value".

printf("%.*Le\n", (LDBL_DIG + 3) - 1, ld);
printf("%.*Le\n", LDBL_DECIMAL_DIG - 1, ld);

The format on your computer is indeed the Intel double extended-precision format, 80 bits wide, with a 15-bit exponent and a 64-bit mantissa.

Only 10 consecutive bytes of the storage are actually used. The Intel manuals (Intel® 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4) say the following:

When storing floating-point values in memory, half-precision values are stored in 2 consecutive bytes in memory; single-precision values are stored in 4 consecutive bytes in memory; double-precision values are stored in 8 consecutive bytes; and double extended-precision values are stored in 10 consecutive bytes.

However, the x86-64 Linux ABI specifies that a full 16 bytes are actually consumed. This is possibly because a 10-byte value could only have a fundamental alignment requirement of 2 in arrays, which can cause peculiar issues.

Also, array indexing is easier with multiples of 16.

Most of the time this is a non-issue, as long doubles are usually used to minimize error in intermediate calculations, and the result is then rounded back to a double.

The sizeof operator returns the size in bytes of the data type. The precision of a floating-point format cannot be inferred directly from that byte size, other than that a bigger size usually means better precision.

Source: https://stackoverflow.com/questions/17382879/sizeof-long-double-and-precision-not-matching

Tags: floating-point, floating-point-precision, quad, long-double