Why does the buffering of std::ifstream “break” std::getline when using LLVM?

Submitted by 对着背影说爱祢 on 2020-07-06 07:04:23

Question


I have a simple C++ application which is supposed to read lines from a POSIX named pipe:

#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream pipe;
    pipe.open("in");

    std::string line;
    while (true) {
        std::getline(pipe, line);
        if (pipe.eof()) {
            break;
        }
        std::cout << line << std::endl;
    }
}

Steps:

  • I create a named pipe: mkfifo in.

  • I compile & run the C++ code using g++ -std=c++11 test.cpp && ./a.out.

  • I feed data to the in pipe:

sleep infinity > in &  # keep pipe open, avoid EOF
echo hey > in
echo cats > in
echo foo > in
kill %1                # this closes the pipe, C++ app stops on EOF

When doing this under Linux, the application successfully displays output after each echo command as expected (g++ 8.2.1).

When trying this whole process on macOS, output is only displayed after closing the pipe (i.e. after kill %1). I started suspecting some sort of buffering issue, so I've tried disabling it like so:

std::ifstream pipe;
pipe.rdbuf()->pubsetbuf(0, 0);
pipe.open("in");

With this change, the application outputs nothing after the first echo, then prints the first message ("hey") after the second echo, and keeps doing so, always lagging one message behind and displaying the message of the previous echo instead of the one just executed. The last message is only displayed after closing the pipe.

I found out that on macOS g++ is basically clang++, as g++ --version yields: "Apple LLVM version 10.0.1 (clang-1001.0.46.3)". After installing the real g++ using Homebrew, the example program works, just like it did on Linux.

I am building a simple IPC library on top of named pipes for various reasons, so this working correctly is pretty much a requirement for me at this point.

What is causing this weird behaviour when using LLVM? (update: this is caused by libc++)

Is this a bug?

Is the way this works on g++ guaranteed by the C++ standard in some way?

How could I make this code snippet work properly using clang++?

Update:

This seems to be caused by the libc++ implementation of getline(). Related links:

  • Why does libc++ getline block when reading from pipe, but libstdc++ getline does not?
  • https://bugs.llvm.org/show_bug.cgi?id=23078

The questions still stand though.


Answer 1:


I have worked around this issue by wrapping POSIX getline() in a simple C API and simply calling that from C++. The code is something like this:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

typedef struct pipe_reader {
    FILE* stream;
    char* line_buf;
    size_t buf_size;
} pipe_reader;

pipe_reader new_reader(const char* pipe_path) {
    pipe_reader preader;
    preader.stream = fopen(pipe_path, "r");
    preader.line_buf = NULL;
    preader.buf_size = 0;
    return preader;
}

bool check_reader(const pipe_reader* preader) {
    if (!preader || preader->stream == NULL) {
        return false;
    }
    return true;
}

const char* recv_msg(pipe_reader* preader) {
    if (!check_reader(preader)) {
        return NULL;
    }
    ssize_t read = getline(&preader->line_buf, &preader->buf_size, preader->stream);
    if (read > 0) {
        preader->line_buf[read - 1] = '\0';
        return preader->line_buf;
    }
    return NULL;
}

void close_reader(pipe_reader* preader) {
    if (!check_reader(preader)) {
        return;
    }
    fclose(preader->stream);
    preader->stream = NULL;
    if (preader->line_buf) {
        free(preader->line_buf);
        preader->line_buf = NULL;
    }
}

This works well against libc++ or libstdc++.




Answer 2:


As discussed separately, a boost::asio solution would be best, but your question is specifically about how getline is blocking, so I will speak to that.

The problem here is that std::ifstream is not really made for a FIFO file type. In the case of getline(), the stream does a buffered read: initially it decides the buffer does not hold enough data to reach the delimiter ('\n'), so it calls underflow() on the underlying streambuf, which does a plain read for a buffer-length amount of data. This works great for regular files, because a file's length at any point in time is knowable: the read can return EOF if there's not enough data to fill the buffer, or simply return the filled buffer if there is. With a FIFO, however, running out of data does not necessarily mean EOF, so the read doesn't return until the writing process closes its end (which your infinite sleep command is holding open).

A more typical way to handle this is for the writer to open and close the FIFO around each write. This is obviously a waste of effort when something more functional like poll()/epoll() is available, but I'm answering the question you're asking.



Source: https://stackoverflow.com/questions/55495932/why-does-the-buffering-of-stdifstream-break-stdgetline-when-using-llvm
