Why does `script.py <(cat *.gz)` work with subprocess.Popen in python 2 but not python 3?

Posted by 一世执手 on 2021-01-29 03:20:48

Question


We recently discovered that a script we developed fails under Python 3.x (but not Python 2.x) when its input files are supplied via process substitution, e.g.:

script.py <(cat *.gz)

To check whether the problem is specific to gzip, we also ran other commands, such as cat, in the subprocess. They all complain that /dev/fd/63 (or /dev/fd/63.gz) does not exist. Here's the (simplified) relevant bit of code:

import gzip
import io
import subprocess
import sys


def open_gzip_in(infile):
    '''Opens a gzip file for reading, using external gzip if available'''

    # Determine whether to use the gzip command line tool or not.
    # exeExists() is assumed to be a helper defined elsewhere in the script (not shown).
    if exeExists('gzip'):
        cmd = ['gzip', '-dc', infile]
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=-1,
                             universal_newlines=True)
        if sys.version_info[0] == 2:
            # Python 2: Popen is not a context manager, so only the pipe is managed
            with p.stdout:
                for line in iter(p.stdout.readline, b''):
                    yield line
        else:
            # Python 3: leaving the Popen context closes the pipe and waits for gzip
            with p:
                for line in p.stdout:
                    yield line
        exit_code = p.wait()
        if exit_code != 0:
            raise subprocess.CalledProcessError(
                p.returncode, subprocess.list2cmdline(cmd), 'Ungzip failed')
    else:
        # Fallback: decompress in-process with the gzip module
        with io.TextIOWrapper(io.BufferedReader(gzip.open(infile))) as f:
            for line in f:
                yield line

Incidentally, we do the fork simply because the command line gzip is significantly faster than using gzip.open and our script is a long-running worker; the difference is multiple hours.

We are implementing a work-around for this issue, but would like to understand why it doesn't work in Python 3 but does work in Python 2.


Answer 1:


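This is a side effect of the changed default for the close_fds argument of the Popen() family. Since Python 3.2 it defaults to True, so on POSIX every file descriptor other than 0, 1 and 2 is closed in the child before the command runs. Process substitution hands your script a path such as /dev/fd/63, which names descriptor 63 of whichever process opens it; with that descriptor closed in the gzip child, the path does not exist there. Python 2 defaulted to close_fds=False, so the descriptor was inherited and the path resolved. You can explicitly override the default with close_fds=False, and your inherited file descriptors will be passed through to the child process (subject to configuration via os.set_inheritable()).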
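
The effect can be reproduced without a shell. The following is a minimal sketch, assuming a POSIX system and Python 3.4 or later; the names CHILD, child_sees_fd and demo are made up for this illustration, and a pipe duplicated onto a high descriptor number stands in for the descriptor that process substitution would hand down.

```python
import fcntl
import os
import subprocess
import sys

# Child program: exits 0 if the descriptor number in argv[1] is open, 1 otherwise.
CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def child_sees_fd(fd, close_fds):
    """Return True if a child started with the given close_fds setting can use fd."""
    cmd = [sys.executable, "-c", CHILD, str(fd)]
    return subprocess.call(cmd, close_fds=close_fds) == 0


def demo():
    """Return (visible with close_fds=True, visible with close_fds=False)."""
    read_end, write_end = os.pipe()
    # Duplicate onto the lowest free descriptor >= 60, mimicking the
    # high-numbered descriptor that the shell passes for <(...).
    high_fd = fcntl.fcntl(read_end, fcntl.F_DUPFD, 60)
    try:
        # Make sure the descriptor is marked inheritable, as one handed down by the shell is.
        os.set_inheritable(high_fd, True)
        return child_sees_fd(high_fd, True), child_sees_fd(high_fd, False)
    finally:
        for fd in (read_end, write_end, high_fd):
            os.close(fd)


if __name__ == "__main__":
    print(demo())
```

Under those assumptions, demo() is expected to return (False, True): the child loses the descriptor with the Python 3 default and keeps it with close_fds=False.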

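Alternatively, on Python 3.2 and later, you can keep close_fds=True and list the descriptors to preserve in pass_fds, as in pass_fds=[0,1,2,63], which makes stdin, stdout, stderr, and FD #63 available to the subprocess invoked. Descriptors 0, 1 and 2 are never closed by close_fds, so pass_fds=[63] is sufficient.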
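
Because the descriptor number chosen by the shell is not guaranteed to be 63, it may be safer to read it from the path than to hard-code it. The following is a rough sketch of that idea, not part of the original answer; fd_from_path, popen_kwargs_for and open_gzip_proc are hypothetical helper names, and the sketch assumes the input path has exactly the form /dev/fd/N.

```python
import re
import subprocess
import sys


def fd_from_path(path):
    """Return N if path has the form /dev/fd/N, otherwise None."""
    match = re.match(r'/dev/fd/(\d+)\Z', path)
    return int(match.group(1)) if match else None


def popen_kwargs_for(infile):
    """Build extra Popen keyword arguments that keep infile's descriptor open."""
    fd = fd_from_path(infile)
    if fd is not None and sys.version_info >= (3, 2):
        # pass_fds exists only on Python 3.2+; Python 2 inherits the descriptor anyway.
        return {'pass_fds': (fd,)}
    return {}


def open_gzip_proc(infile):
    """Start gzip -dc on infile, preserving a /dev/fd/N descriptor if needed."""
    return subprocess.Popen(['gzip', '-dc', infile], stdout=subprocess.PIPE,
                            universal_newlines=True, **popen_kwargs_for(infile))
```

For an ordinary file name, popen_kwargs_for returns an empty dict, so the Popen call behaves exactly as before.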



Source: https://stackoverflow.com/questions/57098553/why-does-script-py-cat-gz-work-with-subprocess-popen-in-python-2-but-not
