Are IEEE float and double guaranteed to be the same size on any OS?

北恋 2020-12-11 11:00

I'm working on an OS-portable database system. I want our database files to be OS-portable so that customers can move their database files to other kinds of OSes at their d

4 answers
  • 2020-12-11 11:13

    If both are IEEE 754 types, they are: "float" (binary32) will be 32 bits and "double" (binary64) will be 64 bits. The byte ordering might differ between platforms, exactly as it does for 32-bit and 64-bit integers.

    If you need extended precision, it may or may not be available as "long double". The x87 extended-precision format uses 80 bits, but "long double" may have additional padding bits.
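
    A minimal sketch of how these size assumptions could be enforced at build time rather than trusted. The helper name print_long_double_layout is hypothetical; the macros come from the standard <cfloat> and <climits> headers.

```cpp
#include <cfloat>
#include <climits>
#include <cstdio>

// Compile-time guards: the build fails on a platform where these do not hold.
static_assert(CHAR_BIT == 8, "expected 8-bit bytes");
static_assert(sizeof(float) * CHAR_BIT == 32, "float is not 32 bits wide");
static_assert(sizeof(double) * CHAR_BIT == 64, "double is not 64 bits wide");

// Hypothetical helper: reports how long double is laid out on this platform.
// sizeof includes any padding bytes, so an 80-bit extended value can still
// occupy more than 10 bytes of storage.
void print_long_double_layout() {
    std::printf("sizeof(long double) = %zu bytes, significand bits = %d\n",
                sizeof(long double), LDBL_MANT_DIG);
}
```

    Calling print_long_double_layout() on each target platform shows whether "long double" is safe to write to a file; the output is platform-dependent by design.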

  • 2020-12-11 11:18

    std::numeric_limits<T>::is_iec559

    Determines if a given type follows IEC 559, which is another name for IEEE 754.

    This serves as further evidence that IEEE 754 conformance is optional in C++, and it offers a way for you to check whether it is in use.

    C++11 N3337 standard draft 18.3.2.4 numeric_limits members:

    static constexpr bool is_iec559;

    56 True if and only if the type adheres to IEC 559 standard. (217)

    57 Meaningful for all floating point types.

    (217) International Electrotechnical Commission standard 559 is the same as IEEE 754.

    Sample code:

    #include <iostream>
    #include <limits>
    
    int main() {
        std::cout << std::numeric_limits<float>::is_iec559 << std::endl;
        std::cout << std::numeric_limits<double>::is_iec559 << std::endl;
        std::cout << std::numeric_limits<long double>::is_iec559 << std::endl;
    }
    

    Outputs:

    1
    1
    1
    

    on Ubuntu 16.04 x86-64.

    __STDC_IEC_559__ is an analogous macro for C: https://stackoverflow.com/a/31967139/895245
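
    If the file format depends on IEEE 754, the same trait can be turned into a build-time guard instead of a runtime printout. This is only a sketch; the function name platform_has_ieee_floats is hypothetical.

```cpp
#include <limits>

// Refuse to compile on a platform whose float or double is not IEEE 754.
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEEE 754");
static_assert(std::numeric_limits<double>::is_iec559, "double is not IEEE 754");

// Hypothetical helper for code that prefers to branch to a fallback
// instead of failing the build.
constexpr bool platform_has_ieee_floats() {
    return std::numeric_limits<float>::is_iec559 &&
           std::numeric_limits<double>::is_iec559;
}
```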

    Rationale

    This is an interesting article that describes the rationale behind not fixing sizes, and how to get around it: http://yosefk.com/blog/consistency-how-to-defeat-the-purpose-of-ieee-floating-point.html

  • 2020-12-11 11:19

    C++ says almost nothing about the representation of floating point types.

    [basic.fundamental]/8 says (Emphasis mine):

    There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template std::numeric_limits (18.3) shall specify the maximum and minimum values of each arithmetic type for an implementation.

    If you just write C++ code using float, double and long double, you have virtually no guarantees, apart from those given in the documentation for your particular compiler, and those that can be implied from std::numeric_limits.

    On the other hand, IEEE 754 provides exact definitions of the behaviour and binary representation of its floating point types. These definitions are not quite enough to guarantee identical behaviour on all IEEE 754 platforms, since (for example) IEEE 754 sometimes allows multiple operations to be folded together when the result would be more precise than performing the two operations separately. This is likely to be unimportant to your specific case, since you just want the files to be portable, and probably do not care quite as much about identical queries creating identical changes to the files on different platforms as you do about identical files being loaded in identical ways on different platforms.
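
    One common instance of such folding is the fused multiply-add, which rounds once instead of twice. The sketch below assumes IEEE 754 binary64 doubles with the default round-to-nearest mode; the function names are hypothetical, while std::fma and std::ldexp are standard <cmath> functions.

```cpp
#include <cassert>
#include <cmath>

// Computes a*b + c with the product rounded to double before the addition.
// The volatile store keeps the compiler from contracting the two operations.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Computes a*b + c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Shows one input where the two approaches disagree.
void check_fused_differs() {
    const double eps = std::ldexp(1.0, -27);
    const double a = 1.0 + eps;
    const double b = 1.0 - eps;  // a*b is exactly 1 - 2^-54
    // The product alone rounds to 1.0, so the separate version yields 0.
    assert(mul_then_add(a, b, -1.0) == 0.0);
    // The fused version keeps the exact product and yields -2^-54.
    assert(fused_mul_add(a, b, -1.0) == -std::ldexp(1.0, -54));
}
```

    Both results are valid IEEE 754 behaviour, which is why computed values can differ between platforms even when the stored format is identical.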

    So the question is: "how do I get a portable IEEE 754 implementation for C++?".

    The answer to this question is somewhat tricky. Most C++ compilers for reasonable platforms will provide at least float and double that approximately match IEEE 754's binary32 and binary64 specifications (although you will need to read the documentation for each individual compiler to be sure).
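
    Reading the documentation can be supplemented with a compile-time check of the parameters that std::numeric_limits exposes. This is a sketch, not a full conformance test: it only compares width, radix, significand size and exponent range against binary32 and binary64, and the helper name matches_ieee_binary is hypothetical.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical helper: true if T has the width, radix, significand size and
// maximum exponent of the given IEEE 754 binary interchange format.
template <typename T>
constexpr bool matches_ieee_binary(std::size_t total_bits,
                                   int significand_bits,
                                   int max_exponent) {
    return std::numeric_limits<T>::is_iec559 &&
           std::numeric_limits<T>::radix == 2 &&
           sizeof(T) * CHAR_BIT == total_bits &&
           std::numeric_limits<T>::digits == significand_bits &&
           std::numeric_limits<T>::max_exponent == max_exponent;
}

// binary32: 32 bits, 24-bit significand (including the implicit bit).
static_assert(matches_ieee_binary<float>(32, 24, 128),
              "float does not look like IEEE 754 binary32");
// binary64: 64 bits, 53-bit significand (including the implicit bit).
static_assert(matches_ieee_binary<double>(64, 53, 1024),
              "double does not look like IEEE 754 binary64");
```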

    Alternatively, you can use a software floating point implementation or wrapper such as FLIP, libgcc's soft-float, SoftFloat, or STREFLOP. These libraries sometimes still make assumptions about the implementation that are not completely portable according to the C++ standard, so use at your own risk.

  • 2020-12-11 11:28

    Never mind: https://stackoverflow.com/a/24157568/2422450 provides a better explanation of the float sizes.

    If, however, you are thinking about storing these floats in binary data files, make sure you handle the byte order (endianness) correctly. Different systems store the bytes in different orders, so if you dump raw floats on one system, reinterpreting the 4 bytes you read back as a float on another might give some surprising results.
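
    One way to sidestep the problem is to fix the byte order of the file format and convert explicitly on every read and write. The sketch below picks little-endian arbitrarily and assumes that double is IEEE 754 binary64 and that the host stores doubles in the same byte order as 64-bit integers; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes IEEE 754 binary64 doubles");

// Writes value to out[0..7] in little-endian order, whatever the host order.
void store_double_le(double value, unsigned char out[8]) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bit pattern safely
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFF);
}

// Reads a double from in[0..7], which must hold little-endian bytes.
double load_double_le(const unsigned char in[8]) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Small self-check: a value must survive a store/load round trip unchanged.
void check_round_trip(double value) {
    unsigned char buf[8];
    store_double_le(value, buf);
    assert(load_double_le(buf) == value);
}
```

    Because the shifts operate on the integer value rather than on memory, the same code produces the same file bytes on little-endian and big-endian hosts.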
