Alternatives to Python Popen.communicate() memory limitations?

感动是毒 2020-12-17 09:46

I have the following chunk of Python code (running v2.7) that results in MemoryError exceptions being thrown when I work with large (several GB) files:
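A minimal sketch of the communicate()-based pattern in question (the original snippet is not reproduced here; myCmd is a placeholder for the actual command):

    import sys
    from subprocess import Popen, PIPE

    myProcess = Popen(myCmd, shell=True, stdout=PIPE, stderr=PIPE)
    # communicate() buffers the child's entire stdout and stderr in memory
    # before returning, which raises MemoryError on multi-GB output
    (myStdout, myStderr) = myProcess.communicate()
    sys.stdout.write(myStdout)
    sys.stderr.write(myStderr)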

2 Answers
  • 2020-12-17 10:21

    I think I found a solution:

    import sys
    from subprocess import Popen, PIPE

    myProcess = Popen(myCmd, shell=True, stdout=PIPE, stderr=PIPE)
    # Stream the pipes line by line instead of buffering everything
    # with communicate(), so only one line is in memory at a time
    for ln in myProcess.stdout:
        sys.stdout.write(ln)
    for ln in myProcess.stderr:
        sys.stderr.write(ln)
    

    This seems to get my memory usage down enough to get through the task.

    Update

    I have recently found a more flexible way of handling data streams in Python, using threads (a rough sketch follows). It's interesting that Python is so poor at something that shell scripts can do so easily!
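    A minimal sketch of that threaded approach, assuming the same myCmd as above. Note that draining stdout completely before stderr (as in the snippet above) can deadlock if the child fills the unread stderr pipe; a thread per pipe avoids that:

    import sys
    import threading
    from subprocess import Popen, PIPE

    def pump(src, dst):
        # Copy one pipe to a destination stream, line by line
        for line in iter(src.readline, ''):
            dst.write(line)
        src.close()

    myProcess = Popen(myCmd, shell=True, stdout=PIPE, stderr=PIPE)
    threads = [threading.Thread(target=pump, args=(myProcess.stdout, sys.stdout)),
               threading.Thread(target=pump, args=(myProcess.stderr, sys.stderr))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()      # wait until both pipes are drained
    myProcess.wait()  # then reap the child's exit status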

  • 2020-12-17 10:34

    What I would probably do instead, if I needed to read the stdout for something that large, is send it to a file on creation of the process.

    from subprocess import Popen

    with open(my_large_output_path, 'w') as fo:
        with open(my_large_error_path, 'w') as fe:
            # The child writes straight to disk, so nothing accumulates in memory
            myProcess = Popen(myCmd, shell=True, stdout=fo, stderr=fe)
    

    Edit: If you need to stream, you could try passing a file-backed object to stdout and stderr and reading (querying) from it as it is being written. (I haven't tried this, though. Note that Popen's stdout and stderr need a real file descriptor via fileno(), so a purely in-memory file-like object such as StringIO won't work.) A rough sketch follows.
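    A minimal sketch under that assumption, reusing my_large_output_path from above; consume() is a hypothetical callback for whatever processing is needed:

    import time
    from subprocess import Popen

    fo = open(my_large_output_path, 'w')
    myProcess = Popen(myCmd, shell=True, stdout=fo)
    with open(my_large_output_path, 'r') as reader:
        while myProcess.poll() is None:   # child still running
            chunk = reader.read()         # whatever the child has flushed so far
            if chunk:
                consume(chunk)            # hypothetical consumer
            else:
                time.sleep(0.1)           # avoid busy-waiting
        consume(reader.read())            # drain the remainder after exit
    fo.close()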
