Question
I am using POSIX named pipes (FIFOs) to send records from one or more threads to another thread; only one thread does the reading. However, the 83rd of 100 records is simply dropped. The client core calls write, and the return value is correctly reported as the length of the record (720 bytes), so the client (writer) core confirms that the record was sent. But when I switch to the reader core in gdb with scheduler-locking on and cycle through reading the few previous records, the next read fails: there is no record in the pipe, even though the client (writer) core confirmed the write.
The pipe capacity is 65,536 bytes (by default in Linux). I assume the pipe contents are reduced by 1 record for each record read, so at the point where the 83rd record is dropped I have about 5 prior records in the pipe, or 3600 bytes -- not enough to fill the pipe.
I opened the pipes in nonblocking mode because when I opened them in blocking mode both ends froze. According to the man pages at http://man7.org/linux/man-pages/man7/fifo.7.html, "The FIFO must be opened on both ends (reading and writing) before data can be passed. Normally, opening the FIFO blocks until the other end is opened also." My problem is that both ends block and won't go further. It also says, "Under Linux, opening a FIFO for read and write will succeed both in blocking and nonblocking mode. POSIX leaves this behavior undefined."
The code on each end is simple:
#include <stdint.h>
#include <unistd.h>

int64_t fifo_write(int fd, const void *buf, size_t count) {
    int status_write = write(fd, buf, count);
    return status_write;
}
int64_t fifo_read(int fd, void *buf, size_t count) {
    int status_read = read(fd, buf, count);
    return status_read;
}
The C functions are called from my NASM program:
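; reader side: read one 720-byte record into fifo_buffer
mov rdi,[fifo_read_fd]
lea rsi,[fifo_buffer]
mov rdx,720
call fifo_read wrt ..plt

; writer side: write one 720-byte record from the pointer stored at [rbp-24]
mov rdi,[fifo_write_fd]
mov rsi,[rbp-24]
mov rdx,720 ; bytes
push r11
push rcx
call fifo_write wrt ..plt
pop rcx
pop r11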
My questions are:
1. What could cause the dropped record? It does not look like pipe capacity unless the pipe is not emptied with the read of each record -- even all 83 records would take 59760 bytes, below the 65,536-byte pipe capacity in Linux. It could be due to nonblocking mode, but if the pipe is not full there would be no reason to block.
2. How can I open both ends in blocking mode (given that both ends freeze, each waiting for the other), and what problems would I have with blocking mode?
3. I could open both ends in read/write mode, because my code only writes from one or more threads on one end and reads from one thread (only) on the other end. While "POSIX leaves this behavior undefined", are there any reasons not to open both ends in read/write mode in this situation?
I have not posted any other code with this question (except as above) because I'm only looking for ideas on the best way to handle the problem of a dropped record in the case I described.
Answer 1:
You have multiple writers using one FIFO sending messages of 720 bytes. POSIX only requires writes of at most PIPE_BUF bytes to be atomic, and it only guarantees that PIPE_BUF is at least 512 bytes (Linux defines it as 4096). On a system where PIPE_BUF is smaller than 720, your writes can get interleaved with writes from other threads and get corrupted.
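Whether a 720-byte write is atomic on a particular system can be checked at run time instead of assumed. The following is a minimal sketch, assuming `fd` refers to an open pipe or FIFO; `write_is_atomic` is a hypothetical helper name, not a library function.

```c
#include <unistd.h>   /* fpathconf, _PC_PIPE_BUF, size_t */

/* Hypothetical helper: returns 1 if a single write() of record_size bytes
 * to the pipe/FIFO fd is covered by the PIPE_BUF atomicity guarantee,
 * 0 if it is not, and -1 if the limit could not be determined. */
int write_is_atomic(int fd, size_t record_size) {
    long limit = fpathconf(fd, _PC_PIPE_BUF);
    if (limit < 0)
        return -1;
    return record_size <= (size_t)limit ? 1 : 0;
}
```

Calling `write_is_atomic(fifo_write_fd, 720)` once after opening the FIFO would tell you whether concurrent writers can interleave inside a record on that system.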
Regardless of PIPE_BUF size, pipes are streams and they don't have a notion of a message, and that means you need to delimit messages yourself, which your code doesn't do. In other words, your reader code cannot possibly recover the individual messages when there are multiple writers.
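For fixed-size records, one way to delimit messages on a byte stream is to keep reading until a whole record has arrived, instead of treating one read call as one record. The sketch below assumes every record is exactly `len` bytes and that `fd` may be in nonblocking mode; `read_record` is a hypothetical name. It does not repair records that writers have already interleaved.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from fd.
 * Returns 1 when a full record was read, 0 on end-of-file before any byte
 * of the record arrived, and -1 on error (including end-of-file in the
 * middle of a record). */
int read_record(int fd, void *buf, size_t len) {
    unsigned char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {
            got += (size_t)n;              /* short read: keep going */
        } else if (n == 0) {
            return got == 0 ? 0 : -1;      /* no writer has the pipe open */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Nonblocking fd with no data yet: wait until it is readable. */
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
        } else if (errno != EINTR) {
            return -1;                     /* real error */
        }
    }
    return 1;
}
```

In the reader, `read_record(fifo_read_fd, fifo_buffer, 720)` would replace the single `read` call. A return of -1 from a plain nonblocking `read` with `errno` set to `EAGAIN` only means that no data has arrived yet, which is easy to mistake for a lost record.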
You may like to use a Unix datagram socket instead. Each message into a Unix datagram socket is an atomic message and it gets written and read completely in one syscall (sendto and recvfrom).
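A minimal sketch of the datagram approach, assuming all writer threads and the reader thread live in one process so that a connected pair from `socketpair` is enough. Because the pair is already connected, the sketch uses `send` and `recv` in place of `sendto` and `recvfrom`. `RECORD_SIZE` and the `record_*` names are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

enum { RECORD_SIZE = 720 };   /* assumed fixed record length */

/* Create a connected pair of Unix datagram sockets.
 * By convention here, sv[0] is the reader end and sv[1] is shared by writers.
 * Returns 0 on success, -1 on error. */
int record_channel_open(int sv[2]) {
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
}

/* Send one record as one datagram. Returns 0 on success, -1 on error. */
int record_send(int fd, const void *rec) {
    ssize_t n = send(fd, rec, RECORD_SIZE, 0);
    return n == RECORD_SIZE ? 0 : -1;
}

/* Receive exactly one datagram into rec, which must hold RECORD_SIZE bytes.
 * Returns 0 if a full record arrived, -1 otherwise. */
int record_recv(int fd, void *rec) {
    ssize_t n = recv(fd, rec, RECORD_SIZE, 0);
    return n == RECORD_SIZE ? 0 : -1;
}
```

Each `record_send` produces one datagram and each `record_recv` consumes one, so message boundaries are preserved even with several writer threads. Unrelated processes would instead bind a socket to a filesystem path and use `sendto` and `recvfrom`.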
Source: https://stackoverflow.com/questions/61547609/posix-named-pipe-fifo-drops-record-in-nonblocking-mode