I have a generic problem I am looking to solve, where chunks of binary data sent from a standard input or regular file stream to an application, which in turn converts that bina
I will start by saying that I feel pthreads conditions and mutexes were not really necessary here, nor was non-blocking I/O the best reaction to the problems you describe.
In my opinion, the problems you describe with your condition- and mutex-less version are symptoms of forgetting to close()
assiduously the ends of your pipes, with the result that a copy of the writing-end file descriptor of the pipe feeding the child process's stdin
leaked (into that child or others) alive.
Then, given that a writing-end corresponding to stdin's reading-end still existed, the system did not give EOF but instead blocked indefinitely.
In your case, you did prevent the pipe-end file descriptors from leaking to the spawned child (with the correct close()
calls on the child-side of the fork()
within your popen3()
, although you forgot to close()
the wrong-end pipe ends on the parent-side). However, you did not prevent this leakage to all other children! If you call popen3()
twice, the leakage of the set of three descriptors into the child is prevented, but as the parent still owns them, when the next call to popen3()
happens, after the fork()
there are now 6 file descriptors to close (The old set of three and and the new set of three you just created).
In your case, therefore, you should set the close-on-exec flag on those pipe ends, thusly:
fcntl(fdIn [PIPEWR], F_SETFD, fcntl(fdIn [PIPEWR], F_GETFD) | FD_CLOEXEC);
fcntl(fdOut[PIPERD], F_SETFD, fcntl(fdOut[PIPERD], F_GETFD) | FD_CLOEXEC);
fcntl(fdErr[PIPERD], F_SETFD, fcntl(fdErr[PIPERD], F_GETFD) | FD_CLOEXEC);
Here is code that spawns 6 threads and 3 processes, and passes its input unmodified to the output, after internally compressing then decompressing it. It effectively implements gzip -c - | XOR 0x55 | XOR 0x55 | gunzip -c - | cat
, where:
gzip
by thread srcThrd
.gzip
's output is read by thread a2xor0Thrd
and fed to thread xor0Thrd
.xor0Thrd
XORs its input with 0x55
before passing it on to thread xor1Thrd
.xor1Thrd
XORs its input with 0x55
before passing it on to thread xor22BThrd
.xor22BThrd
feeds its input to process gunzip
.gunzip
feeds its output directly (without going through a thread) to cat
cat
's output is read by thread dstThrd
and printed to standard output.Compression is done by inter-process pipe communication, while XORing is done by intra-process pipe communication. No mutexes or condition variables are used. main()
is extremely easy to understand. This code should be easy to extend to your situation.
/* Includes */
#include
#include
#include
#include
#include
/* Defines */
#define PIPERD 0
#define PIPEWR 1
/* Data structures */
typedef struct PIPESET{
int Ain[2];
int Aout[2];
int Aerr[2];
int xor0[2];
int xor1[2];
int xor2[2];
int Bin[2];
int BoutCin[2];
int Berr[2];
int Cout[2];
int Cerr[2];
} PIPESET;
/* Function Implementations */
/**
* Source thread main method.
*
* Slurps from standard input and feeds process A.
*/
void* srcThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char c;
while(read(0, &c, 1) > 0){
write(pipeset->Ain[PIPEWR], &c, 1);
}
close(pipeset->Ain[PIPEWR]);
pthread_exit(NULL);
}
/**
* A to XOR0 thread main method.
*
* Manually pipes from standard output of process A to input of thread XOR0.
*/
void* a2xor0ThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char buf[65536];
ssize_t bytesRead;
while((bytesRead = read(pipeset->Aout[PIPERD], buf, 65536)) > 0){
write(pipeset->xor0[PIPEWR], buf, bytesRead);
}
close(pipeset->xor0[PIPEWR]);
pthread_exit(NULL);
}
/**
* XOR0 thread main method.
*
* XORs input with 0x55 and outputs to input of XOR1.
*/
void* xor0ThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char c;
while(read(pipeset->xor0[PIPERD], &c, 1) > 0){
c ^= 0x55;
write(pipeset->xor1[PIPEWR], &c, 1);
}
close(pipeset->xor1[PIPEWR]);
pthread_exit(NULL);
}
/**
* XOR1 thread main method.
*
* XORs input with 0x55 and outputs to input of process B.
*/
void* xor1ThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char c;
while(read(pipeset->xor1[PIPERD], &c, 1) > 0){
c ^= 0x55;
write(pipeset->xor2[PIPEWR], &c, 1);
}
close(pipeset->xor2[PIPEWR]);
pthread_exit(NULL);
}
/**
* XOR2 to B thread main method.
*
* Manually pipes from input (output of XOR1) to input of process B.
*/
void* xor22BThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char buf[65536];
ssize_t bytesRead;
while((bytesRead = read(pipeset->xor2[PIPERD], buf, 65536)) > 0){
write(pipeset->Bin[PIPEWR], buf, bytesRead);
}
close(pipeset->Bin[PIPEWR]);
pthread_exit(NULL);
}
/**
* Destination thread main method.
*
* Manually copies the standard output of process C to the standard output.
*/
void* dstThrdMain(void* arg){
PIPESET* pipeset = (PIPESET*)arg;
char c;
while(read(pipeset->Cout[PIPERD], &c, 1) > 0){
write(1, &c, 1);
}
pthread_exit(NULL);
}
/**
* Set close on exec flag on given descriptor.
*/
void setCloExecFlag(int fd){
fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);
}
/**
* Set close on exec flag on given descriptor.
*/
void unsetCloExecFlag(int fd){
fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) & ~FD_CLOEXEC);
}
/**
* Pipe4.
*
* Create a pipe with some ends possibly marked close-on-exec.
*/
#define PIPE4_FLAG_NONE (0U)
#define PIPE4_FLAG_RD_CLOEXEC (1U << 0)
#define PIPE4_FLAG_WR_CLOEXEC (1U << 1)
int pipe4(int fd[2], int flags){
int ret = pipe(fd);
if(flags&PIPE4_FLAG_RD_CLOEXEC){setCloExecFlag(fd[PIPERD]);}
if(flags&PIPE4_FLAG_WR_CLOEXEC){setCloExecFlag(fd[PIPEWR]);}
return ret;
}
/**
* Pipe4 explicit derivatives.
*/
#define pipe4_cloexec(fd) pipe4((fd), PIPE4_FLAG_RD_CLOEXEC|PIPE4_FLAG_WR_CLOEXEC)
/**
* Popen4.
*
* General-case for spawning a process and tethering it with cloexec pipes on stdin,
* stdout and stderr.
*
* @param [in] cmd The command to execute.
* @param [in/out] pin The pointer to the cloexec pipe for stdin.
* @param [in/out] pout The pointer to the cloexec pipe for stdout.
* @param [in/out] perr The pointer to the cloexec pipe for stderr.
* @param [in] flags A bitwise OR of flags to this function. Available
* flags are:
*
* POPEN4_FLAG_NONE:
* Explicitly specify no flags.
* POPEN4_FLAG_NOCLOSE_PARENT_STDIN,
* POPEN4_FLAG_NOCLOSE_PARENT_STDOUT,
* POPEN4_FLAG_NOCLOSE_PARENT_STDERR:
* Don't close pin[PIPERD], pout[PIPEWR] and perr[PIPEWR] in the parent,
* respectively.
* POPEN4_FLAG_CLOSE_CHILD_STDIN,
* POPEN4_FLAG_CLOSE_CHILD_STDOUT,
* POPEN4_FLAG_CLOSE_CHILD_STDERR:
* Close the respective streams in the child. Ignores pin, pout and perr
* entirely. Overrides a NOCLOSE_PARENT flag for the same stream.
*/
#define POPEN4_FLAG_NONE (0U)
#define POPEN4_FLAG_NOCLOSE_PARENT_STDIN (1U << 0)
#define POPEN4_FLAG_NOCLOSE_PARENT_STDOUT (1U << 1)
#define POPEN4_FLAG_NOCLOSE_PARENT_STDERR (1U << 2)
#define POPEN4_FLAG_CLOSE_CHILD_STDIN (1U << 3)
#define POPEN4_FLAG_CLOSE_CHILD_STDOUT (1U << 4)
#define POPEN4_FLAG_CLOSE_CHILD_STDERR (1U << 5)
pid_t popen4(const char* cmd, int pin[2], int pout[2], int perr[2], int flags){
/********************
** FORK PROCESS **
********************/
pid_t ret = fork();
if(ret < 0){
/**
* Error in fork(), still in parent.
*/
fprintf(stderr, "fork() failed!\n");
return ret;
}else if(ret == 0){
/**
* Child-side of fork
*/
if(flags & POPEN4_FLAG_CLOSE_CHILD_STDIN){
close(0);
}else{
unsetCloExecFlag(pin [PIPERD]);
dup2(pin [PIPERD], 0);
}
if(flags & POPEN4_FLAG_CLOSE_CHILD_STDOUT){
close(1);
}else{
unsetCloExecFlag(pout[PIPEWR]);
dup2(pout[PIPEWR], 1);
}
if(flags & POPEN4_FLAG_CLOSE_CHILD_STDERR){
close(2);
}else{
unsetCloExecFlag(perr[PIPEWR]);
dup2(perr[PIPEWR], 2);
}
execl("/bin/sh", "sh", "-c", cmd, NULL);
fprintf(stderr, "exec() failed!\n");
exit(-1);
}else{
/**
* Parent-side of fork
*/
if(~flags & POPEN4_FLAG_NOCLOSE_PARENT_STDIN &&
~flags & POPEN4_FLAG_CLOSE_CHILD_STDIN){
close(pin [PIPERD]);
}
if(~flags & POPEN4_FLAG_NOCLOSE_PARENT_STDOUT &&
~flags & POPEN4_FLAG_CLOSE_CHILD_STDOUT){
close(pout[PIPEWR]);
}
if(~flags & POPEN4_FLAG_NOCLOSE_PARENT_STDERR &&
~flags & POPEN4_FLAG_CLOSE_CHILD_STDERR){
close(perr[PIPEWR]);
}
return ret;
}
/* Unreachable */
return ret;
}
/**
* Main Function.
*
* Sets up the whole piping scheme.
*/
int main(int argc, char* argv[]){
pthread_t srcThrd, a2xor0Thrd, xor0Thrd, xor1Thrd, xor22BThrd, dstThrd;
pid_t gzip, gunzip, cat;
PIPESET pipeset;
pipe4_cloexec(pipeset.Ain);
pipe4_cloexec(pipeset.Aout);
pipe4_cloexec(pipeset.Aerr);
pipe4_cloexec(pipeset.Bin);
pipe4_cloexec(pipeset.BoutCin);
pipe4_cloexec(pipeset.Berr);
pipe4_cloexec(pipeset.Cout);
pipe4_cloexec(pipeset.Cerr);
pipe4_cloexec(pipeset.xor0);
pipe4_cloexec(pipeset.xor1);
pipe4_cloexec(pipeset.xor2);
/* Spawn processes */
gzip = popen4("gzip -c -", pipeset.Ain, pipeset.Aout, pipeset.Aerr, POPEN4_FLAG_NONE);
gunzip = popen4("gunzip -c -", pipeset.Bin, pipeset.BoutCin, pipeset.Berr, POPEN4_FLAG_NONE);
cat = popen4("cat", pipeset.BoutCin, pipeset.Cout, pipeset.Cerr, POPEN4_FLAG_NONE);
/* Spawn threads */
pthread_create(&srcThrd, NULL, srcThrdMain, &pipeset);
pthread_create(&a2xor0Thrd, NULL, a2xor0ThrdMain, &pipeset);
pthread_create(&xor0Thrd, NULL, xor0ThrdMain, &pipeset);
pthread_create(&xor1Thrd, NULL, xor1ThrdMain, &pipeset);
pthread_create(&xor22BThrd, NULL, xor22BThrdMain, &pipeset);
pthread_create(&dstThrd, NULL, dstThrdMain, &pipeset);
pthread_join(srcThrd, (void**)NULL);
pthread_join(a2xor0Thrd, (void**)NULL);
pthread_join(xor0Thrd, (void**)NULL);
pthread_join(xor1Thrd, (void**)NULL);
pthread_join(xor22BThrd, (void**)NULL);
pthread_join(dstThrd, (void**)NULL);
return 0;
}
There are many issues with your code, most of which have nothing to do with threading.
close()
the file descriptor d->gunzip_ptr->in
. This means that gunzip
can never know that no more input is forthcoming on its stdin
, so it will never exit.gunzip
doesn't ever exit, it will never close()
its stdout, and thus a blocking read()
at the other end will never unblock. A non-blocking read will instead always give -1
, with errno == EAGAIN
.popen3()
doesn't close()
p_stdin[POPEN3_READ]
, p_stdout[POPEN3_WRITE]
or p_stderr[POPEN3_WRITE]
on the parent side of the fork()
. Only the child should have those descriptors. Failing to close these means that when the parent itself tries to read the stdout and stderr of the child, it will never see EOF, again for the same reasons as above: Because it itself still owns a write-end pipe in which it could write, making new data appear to the read-end.gunzip
writing out at least one byte for every 1024 you write in. There is no guarantee that this will be the case, since gunzip
may, at its leisure, buffer internally.
BUF_LENGTH_VALUE
bytes into d->in_buf
. You then assign the number of bytes you read through fread()
to d->n_in_bytes
. This same d->n_in_bytes
is used in your write()
call to write to gunzip
's stdin. You then signal for consume_gunzip_chunk()
to wake up, then pthread_cond_wait()
's for the next gzip-compressed chunk. But this gzip-compressed chunk may never come, since there is no guarantee that gunzip
will be able to unpack useful output from just the first 1024 bytes of input, nor even a guarantee that it will write()
it out instead of buffering it until it has, say, 4096 bytes (a full page) of output. Therefore, the read()
call in consume_gunzip_chunk()
may never succeed (or even return, if read()
was blocking). And if read()
never returns, then consume_gunzip_chunk()
doesn't signal d->in_cond
, and so all three threads get stuck. And even if read()
is non-blocking, the last block of output from gzip may never come, since gzip
's input is never closed, so it doesn't flush out its buffers by write()
'ing them out, so read()
on the other end will never get useful data and no amount of pleading will elicit it without a close()
.POSSIBLE (LIKELY?) CAUSE OF BUG: d->n_out_bytes_read_from_gunzip
, once it becomes non-0
, will never become 0
again. This means that the extremely baffling
while (d->n_in_bytes != 0 || d->n_out_bytes_read_from_gunzip != 0)
pthread_cond_wait(&d->in_cond, &d->in_lock);
within produce_gzip_chunk()
will, once entered with d->n_out_bytes_read_from_gunzip != 0
, forever remain stuck. By calling sleep(1)
within consume_gunzip_chunk()
, which sets d->n_out_bytes_read_from_gunzip
, you may have defused the problem by reading all input before consume_gunzip_chunk()
could lock up the system by setting d->n_out_bytes_read_from_gunzip
to a non-zero value.
pthread_cond_wait(&d->in_cond, &d->in_lock);
, these being produce_gzip_chunk()
and consume_gzip_chunk()
. There is absolutely no guarantee that when consume_gunzip_chunk()
calls pthread_cond_signal(&d->in_cond);
, that the "correct" thread (whichever it is in your design) will receive the signal. To ensure that all of them will, use pthread_cond_broadcast()
, but then you expose yourself to the thundering herd problem. Needing to use pthread_cond_broadcast()
in this situation is, again, a symptom of a bad design in my opinion.pthread_cond_signal(&d->in_cond)
within a thread (indeed, a function) in which you call pthread_cond_wait(&d->in_cond, &d->in_lock)
. What purpose does that serve?d->in_lock
for too many disparate purposes, exposing yourself to the possibility of deadlock, or low performance due to excessive protection. In particular you use it as the protection for both d->in_cond
and d->out_cond
. This is too strong a protection – the output of gunzip
into d->in_line
should be able to happen simultaneously with the input of gunzip
being written into and out of d->in_buf
.Within consume_gunzip_chunk()
, you have
while (d->n_in_bytes_written_to_gunzip == 0) {
pthread_cond_wait(&d->out_cond, &d->in_lock);
}
if (d->n_in_bytes_written_to_gunzip) {
...
This if
can never fail! Is there a case you may have in mind?
struct pthread_data
volatile (or at least those integer elements which are used by multiple threads), since the compiler might decide to optimize out loads and stores that should, in fact, remain.So as to not sound too negative, I would like to say that in general your problems were not due to misuse of the pthreads API but due to erroneous consumer-producer logic and lack of close()
s. Additionally, you appear to understand that pthread_cond_wait()
may wake up spuriously, and so you have wrapped it up in a loop that checks the invariant.
I would use pipes, even between threads. This absolves you from needing to implement your own consumer-producer scheme; The kernel has solved it for you already, and provides you with the pipe()
, read()
and write()
primitives, which are all you need to take advantage of this ready-made solution. It also makes the code cleaner and void of mutexes and condition variables. One must simply be diligent in closing the ends, and one must be supremely careful around pipes in the presence of fork()
. The rules are simple:
read()
on an open read-end will not give EOF but will block or EAGAIN
.read()
on an open read-end will give EOF.write()
to any of its write-ends will cause SIGPIPE
.fork()
duplicates the entire process, including all descriptors (modulo maybe crazy stuff in pthread_atfork()
)!