Question
We recently discovered that a script we developed fails under Python 3.x (but not Python 2.x) when its input files are supplied via process substitution, e.g.:
script.py <(cat *.gz)
We've also tested with commands other than gzip, such as cat, to see whether we get a similar error. They all complain that /dev/fd/63 (or /dev/fd/63.gz) does not exist. Here's the (simplified) relevant bit of code:
def open_gzip_in(infile):
    '''Opens a gzip file for reading, using external gzip if available'''
    # Determine whether to use the gzip command line tool or not
    if exeExists('gzip'):
        cmd = ['gzip', '-dc', infile]
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=-1,
                             universal_newlines=True)
        if sys.version.startswith("2"):
            with p.stdout:
                for line in iter(p.stdout.readline, b''):
                    yield line
        else:
            with p:
                for line in p.stdout:
                    yield line
        exit_code = p.wait()
        if exit_code != 0:
            raise subprocess.CalledProcessError(
                p.returncode, subprocess.list2cmdline(cmd), 'Ungzip failed')
    else:
        with io.TextIOWrapper(io.BufferedReader(gzip.open(infile))) as f:
            for line in f:
                yield(line)
Incidentally, we spawn the external process simply because the command-line gzip is significantly faster than gzip.open, and our script is a long-running worker: the difference is multiple hours.
We are implementing a workaround for this issue, but would like to understand why it doesn't work in Python 3 but does work in Python 2.
Answer 1:
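This is a side effect of the new default close_fds=True for the Popen() family of functions: since Python 3.2, subprocess on POSIX closes every file descriptor other than 0, 1 and 2 in the child by default, whereas Python 2 left them open. Bash replaces <(...) with a path such as /dev/fd/63, which names a descriptor that is open in the script's own process, so once that descriptor has been closed the gzip child finds nothing at that path. You can explicitly override this with close_fds=False, and your inherited file descriptors will be passed through to the child process (subject to configuration via os.set_inheritable()).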
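To see the effect of close_fds in isolation, here is a minimal sketch, assuming a POSIX system and Python 3.5 or later. The helper names child_sees_fd and demo are made up for illustration and are not part of the original script; a pipe marked inheritable stands in for the descriptor that bash hands over.

```python
import os
import subprocess
import sys


def child_sees_fd(fd, close_fds):
    """Return True if a child process can still use descriptor number `fd`."""
    # The child just tries to fstat() the descriptor number it is given;
    # it exits non-zero if that descriptor is not open in the child.
    probe = "import os, sys; os.fstat(int(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", probe, str(fd)],
        close_fds=close_fds,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def demo():
    """Probe one inheritable descriptor with close_fds=True, then False."""
    read_fd, write_fd = os.pipe()
    try:
        # Descriptors created by Python 3.4+ are non-inheritable by default;
        # mark this one inheritable so it behaves like one inherited from bash.
        os.set_inheritable(read_fd, True)
        return (child_sees_fd(read_fd, close_fds=True),
                child_sees_fd(read_fd, close_fds=False))
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(demo())
```

On a POSIX system, demo() should return (False, True): the child loses the descriptor under the Python 3 default and keeps it once close_fds=False is passed.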
Similarly, on Python 3.2 and later, you can use the pass_fds list, as in pass_fds=[0,1,2,63], to make stdin, stdout, stderr, and FD #63 available to the invoked subprocess.
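Rather than hard-coding 63, the descriptor number can be read from the path bash supplies. The following is a rough sketch, not the asker's actual code: fd_from_path and open_decompressor are hypothetical helpers, and the cmd parameter is an assumption added so that the command can be swapped out. It assumes a POSIX system where /dev/fd is available.

```python
import re
import subprocess


def fd_from_path(path):
    """Return N for a path of the exact form /dev/fd/N, otherwise None."""
    match = re.fullmatch(r"/dev/fd/(\d+)", path)
    return int(match.group(1)) if match else None


def open_decompressor(infile, cmd=("gzip", "-dc")):
    """Start `cmd infile`, keeping a process-substitution descriptor open."""
    fd = fd_from_path(infile)
    # pass_fds keeps the listed descriptors open in the child. Descriptors
    # 0, 1 and 2 are never closed by close_fds, so they are not listed here.
    pass_fds = (fd,) if fd is not None else ()
    return subprocess.Popen(
        list(cmd) + [infile],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        pass_fds=pass_fds,
    )
```

With this, a call such as open_decompressor(infile) leaves close_fds at its default, so every other descriptor is still closed in the child and only the one named in the path is passed through.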
Source: https://stackoverflow.com/questions/57098553/why-does-script-py-cat-gz-work-with-subprocess-popen-in-python-2-but-not