Why does Clang std::ostream write a double that std::istream can't read?

Question

I am using an application that uses std::stringstream to read a matrix of space separated doubles from a text file. The application uses code a little like:

std::ifstream file {"data.dat"};
const auto header = read_header(file);
const auto num_columns = header.size();
std::string line;
while (std::getline(file, line)) {
    std::istringstream ss {line}; 
    double val;
    std::size_t tokens {0};
    while (ss >> val) {
        // do stuff
        ++tokens;
    }
    if (tokens < num_columns) throw std::runtime_error {"Bad data matrix..."};
}

Pretty standard stuff. I diligently wrote some code to make the data matrix (data.dat), using the following method for each data line:

void write_line(const std::vector<double>& data, std::ostream& out)
{
    std::copy(std::cbegin(data), std::prev(std::cend(data)),
              std::ostream_iterator<double> {out, " "});
    out << data.back() << '\n';
}

That is, each line is written through std::ostream. However, the application failed to read the data file produced this way (throwing the exception above). In particular, it failed to read 7.0552574226130007e-321.

I wrote the following minimal test case which shows the behaviour:

// iostream_test.cpp

#include <iostream>
#include <string>
#include <sstream>

int main()
{
    constexpr double x {1e-320};
    std::ostringstream oss {};
    oss << x;
    const auto str_x = oss.str();
    std::istringstream iss {str_x};
    double y;
    if (iss >> y) {
        std::cout << y << std::endl;
    } else {
        std::cout << "Nope" << std::endl;
    }
}

I tested this code on LLVM 10.0.0 (clang-1000.11.45.2):

$ clang++ --version
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin17.7.0 
$ clang++ -std=c++14 -o iostream_test iostream_test.cpp
$ ./iostream_test
Nope

I also tried compiling with Clang 6.0.1, 6.0.0, 5.0.1, 5.0.0, 4.0.1, and 4.0.0, but got the same result.

Compiling with GCC 8.2.0, the code works as I would expect:

$ g++-8 -std=c++14 -o iostream_test iostream_test.cpp
$ ./iostream_test
9.99989e-321

Why is there a difference between Clang and GCC? Is this a Clang bug, and if not, how should one use C++ streams to write portable floating-point I/O?

Answer 1:

I believe Clang is conformant here. The answer to "std::stod throws out_of_range error for a string that should be valid" says:

The C++ standard allows conversions of strings to double to report underflow if the result is in the subnormal range even though it is representable.

7.63918•10^-313 is within the range of double, but it is in the subnormal range. The C++ standard says stod calls strtod and then defers to the C standard to define strtod. The C standard indicates that strtod may underflow, about which it says “The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type.” That is awkward phrasing, but it refers to the rounding errors that occur when subnormal values are encountered. (Subnormal values are subject to larger relative errors than normal values, so their rounding errors might be said to be extraordinary.)

Thus, a C++ implementation is allowed by the C++ standard to underflow for subnormal values even though they are representable.
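To make the subnormal range concrete, here is a minimal sketch that classifies a double using std::fpclassify and std::numeric_limits. The helper names is_subnormal and smallest_normal are illustrative and not part of the original post, and the sketch assumes IEEE 754 doubles without flush-to-zero compiler options.

```cpp
#include <cmath>
#include <limits>

// Illustrative helper: true if x is a nonzero value smaller in magnitude
// than the smallest normal double, i.e. a subnormal number.
inline bool is_subnormal(double x)
{
    return std::fpclassify(x) == FP_SUBNORMAL;
}

// Illustrative helper: the smallest positive normal double.
// For IEEE 754 binary64 this is about 2.2250738585072014e-308.
inline double smallest_normal()
{
    return std::numeric_limits<double>::min();
}
```

Under those assumptions, both 1e-320 from the test case and 7.0552574226130007e-321 from the data file fall below smallest_normal(), so they are exactly the kind of value for which a conversion is allowed to report underflow.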

We can confirm that stream extraction relies on strtod from [facet.num.get.virtuals]p3.3.4:

For a double value, the function strtod.

We can test this with a small program:

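#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <string>

// Convert the string with strtod and report errno before and after the call.
void check(const char* p)
{
    std::string str{p};

    std::printf("errno before: %d\n", errno);
    double val = std::strtod(str.c_str(), nullptr);
    std::printf("val: %g\n", val);
    std::printf("errno after: %d\n", errno);
    std::printf("ERANGE value: %d\n", ERANGE);
}

int main()
{
    check("9.99989e-321");
}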

which gives the following result:

errno before: 0
val: 9.99989e-321
errno after: 34
ERANGE value: 34
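Since strtod here returns the converted value and also sets errno to ERANGE, one possible way to read such tokens portably is to call strtod directly and only reject overflow. The following is a minimal sketch, not the original author's method: the helper name parse_double_allow_underflow is hypothetical, it assumes each token is a single whitespace-free field, and it assumes the implementation returns the subnormal value on underflow (the C standard also permits a smaller value such as zero).

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse one token as a double with strtod.
// An ERANGE caused by underflow is tolerated; overflow and malformed
// input are rejected. Returns true on success and stores the value in out.
inline bool parse_double_allow_underflow(const std::string& token, double& out)
{
    const char* begin = token.c_str();
    char* end = nullptr;

    errno = 0;  // strtod does not clear errno, so reset it first
    const double value = std::strtod(begin, &end);

    if (end == begin) return false;   // no conversion was performed
    if (*end != '\0') return false;   // trailing characters after the number
    if (errno == ERANGE && std::fabs(value) == HUGE_VAL) {
        return false;                 // overflow: strtod returned +/-HUGE_VAL
    }

    out = value;                      // normal value, or underflowed result
    return true;
}
```

In the reading loop from the question, each line could be split into tokens with ss >> token (a std::string) and each token passed to this helper instead of extracting with ss >> val, so that the result no longer depends on how the standard library's num_get treats an underflow ERANGE.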

C11 in 7.22.1.3p10 tells us:

The functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value overflows and default rounding is in effect (7.12.1), plus or minus HUGE_VAL, HUGE_VALF, or HUGE_VALL is returned (according to the return type and sign of the value), and the value of the macro ERANGE is stored in errno. If the result underflows (7.12.1), the functions return a value whose magnitude is no greater than the smallest normalized positive number in the return type; whether errno acquires the value ERANGE is implementation-defined.

POSIX uses that convention: