I'm testing subprocess pipelines with Python. I'm aware that I can do what the programs below do in Python directly, but that's not the point. I just want to test the pipel
I found out how to do it.
It is not about threads, and not about select().
When I run the first process (grep), it creates two low-level file descriptors, one for each pipe. Let's call those a and b.
When I run the second process, b gets passed to cut's stdin. But there is a brain-dead default on Popen: close_fds=False.
The effect of that is that cut also inherits a. So grep can't die even if I close a, because grep's stdin is still open in cut's process (cut ignores it).
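The rule behind this can be seen in isolation, without any child processes: a pipe reader only gets EOF once every copy of the write end is closed. The following is a minimal sketch, assuming a Unix system and Python 3.5+ (for os.set_blocking); os.dup() is only a stand-in for the copy of a that cut inherits, and the variable names are made up for illustration.

```python
import os

# One pipe; write_end plays the role of descriptor "a" held by the parent.
read_end, write_end = os.pipe()
# Stand-in (assumption) for the copy of "a" that the second child inherits.
inherited_copy = os.dup(write_end)

os.write(write_end, b"data\n")
os.close(write_end)  # the parent closes its own copy

first = os.read(read_end, 1024)  # the data written above

# The reader still gets no EOF, because inherited_copy is open.
os.set_blocking(read_end, False)  # so the demo raises instead of hanging
try:
    os.read(read_end, 1024)
    blocked = False
except BlockingIOError:
    blocked = True

os.close(inherited_copy)  # close the last remaining write end
eof = os.read(read_end, 1024)  # an empty result now signals EOF
os.close(read_end)
```

With a blocking descriptor, the second read would simply hang, which is what happens to grep while cut holds the extra copy.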
The following code now runs perfectly.
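from subprocess import Popen, PIPE

# grep drops lines containing "not"; cut keeps the first 10 characters.
p1 = Popen(["grep", "-v", "not"], stdin=PIPE, stdout=PIPE)
# close_fds=True keeps cut from inheriting the write end of grep's stdin pipe.
p2 = Popen(["cut", "-c", "1-10"], stdin=p1.stdout, stdout=PIPE, close_fds=True)
p1.stdin.write(b'Hello World\n')  # bytes literal, so this also runs on Python 3
p1.stdin.close()  # grep now sees EOF on its stdin
result = p2.stdout.read()
assert result == b"Hello Worl\n"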
close_fds=True SHOULD BE THE DEFAULT on Unix systems. On Windows it closes all fds, so it prevents piping.
EDIT:
PS: For people with a similar problem reading this answer: as pooryorick said in a comment, this could also block if the data written to p1.stdin is bigger than the pipe buffers. In that case you should chunk the data into smaller pieces and use select.select() to know when to read and write. The code in the question should give a hint on how to implement that.
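A rough sketch of that chunked approach, assuming a Unix system with grep and cut available. The function name run_pipeline and the chunk size are my own choices, not something from the question; the chunk is capped at select.PIPE_BUF so that a write which select() reported as possible should not block.

```python
import os
import select
from subprocess import Popen, PIPE

# Hypothetical chunk size: a write this small fits once select() says "writable".
CHUNK = getattr(select, "PIPE_BUF", 512)

def run_pipeline(data):
    """Feed bytes through grep | cut without deadlocking on full pipe buffers."""
    p1 = Popen(["grep", "-v", "not"], stdin=PIPE, stdout=PIPE)
    p2 = Popen(["cut", "-c", "1-10"], stdin=p1.stdout, stdout=PIPE, close_fds=True)
    p1.stdout.close()  # the parent does not read grep's output itself

    in_fd = p1.stdin.fileno()
    out_fd = p2.stdout.fileno()
    output = []
    offset = 0
    writing = True
    while True:
        # Only wait for writability while there is still input to send.
        wlist = [in_fd] if writing else []
        readable, writable, _ = select.select([out_fd], wlist, [])
        if writable:
            offset += os.write(in_fd, data[offset:offset + CHUNK])
            if offset >= len(data):
                p1.stdin.close()  # all input sent, let grep see EOF
                writing = False
        if readable:
            chunk = os.read(out_fd, 65536)
            if not chunk:  # EOF from cut: the pipeline is finished
                break
            output.append(chunk)
    p2.stdout.close()
    p1.wait()
    p2.wait()
    return b"".join(output)
```

Calling run_pipeline(b"Hello World\n" * 50000) pushes far more data than a pipe buffer holds. Error handling (for example a child exiting early) is left out to keep the sketch short.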
EDIT2: Found another solution, with more help from pooryorick: instead of using close_fds=True and closing ALL fds, one could close the fds that belong to the first process when executing the second, and it will work. The closing must be done in the child, so the preexec_fn argument of Popen comes in very handy for just that. When executing p2 you can do:
p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE, stderr=devnull, preexec_fn=p1.stdin.close)
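For completeness, here is how that line might fit into a whole script. This is a sketch under assumptions: cmd1 and cmd2 are filled in with the grep and cut commands from the earlier example, devnull is opened from os.devnull, and close_fds=False is passed explicitly to reproduce the old default. On Python 3.4 and later the pipe descriptors are non-inheritable by default, so the extra close mostly matters on older versions.

```python
import os
from subprocess import Popen, PIPE

# Assumed commands, taken from the earlier example in this answer.
cmd1 = ["grep", "-v", "not"]
cmd2 = ["cut", "-c", "1-10"]

with open(os.devnull, "w") as devnull:
    p1 = Popen(cmd1, stdin=PIPE, stdout=PIPE)
    # preexec_fn runs in the child after fork and before exec, so only the
    # child's copy of grep's stdin pipe is closed; the parent keeps its own.
    p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE, stderr=devnull,
               close_fds=False, preexec_fn=p1.stdin.close)
    p1.stdin.write(b"Hello World\n")
    p1.stdin.close()  # grep sees EOF because no other copy is left open
    result = p2.stdout.read()
    p1.wait()
    p2.wait()
```

Note that the Python documentation warns that preexec_fn is not safe to use in programs with threads, so close_fds=True is usually the simpler choice.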