Writing a large amount of data to stdin

伪装坚强ぢ 2020-12-06 23:25

I am writing a large amount of data to the stdin of a child process.

How do I ensure that the write does not block?

p=subprocess.Popen([path],stdout=subprocess.PIPE,stdin=subpr         


        
2 Answers
  • 2020-12-06 23:53

    You may have to use Popen.communicate().

    If you write a large amount of data to the child's stdin while the child is producing output on stdout, the child's stdout pipe buffer can fill up before the child has consumed all of your input. The child then blocks on its write to stdout (because you are not reading it), and you block on your write to stdin: a deadlock.

    Popen.communicate() writes to stdin and reads stdout/stderr at the same time, which avoids this problem.
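
    A minimal sketch of this in Python 3, assuming the whole input and output fit in memory. The child is a stand-in (a Python one-liner that upper-cases each line it reads), and `run_with_input` is an illustrative helper name, not part of `subprocess`:

```python
import subprocess
import sys

# Stand-in child for illustration: echoes each stdin line in upper case,
# so it produces output while input is still arriving.
CHILD_CMD = [sys.executable, "-c",
             "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]

def run_with_input(cmd, data):
    """Send `data` (bytes) to the child's stdin and return its stdout (bytes)."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes stdin and reads stdout concurrently, closes stdin
    # once the input is exhausted, and waits for the child to exit.
    out, _ = proc.communicate(input=data)
    if proc.returncode != 0:
        raise RuntimeError("child exited with status %d" % proc.returncode)
    return out

if __name__ == "__main__":
    payload = b"some line of input\n" * 200000  # a few MB, far more than a pipe buffer
    result = run_with_input(CHILD_CMD, payload)
    print("received %d bytes" % len(result))
```

    Writing the same payload with a plain `proc.stdin.write()` and only reading afterwards is the pattern that can deadlock with a child like this one.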

    Note: Popen.communicate() is suitable only when the input and output data fit in memory (that is, they are not too large).
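
    When the data does not fit in memory, one alternative is to redirect the child's stdin and stdout to files instead of pipes, so the parent buffers nothing. A minimal Python 3 sketch, assuming the input already lives in a file; `run_via_files` and the stand-in child command are illustrative names, not from the question:

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child for illustration: copies stdin to stdout in upper case.
CHILD_CMD = [sys.executable, "-c",
             "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]

def run_via_files(cmd, in_path, out_path):
    """Run `cmd` with stdin/stdout redirected to files; return its exit status."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        # The child reads and writes the files directly, so the parent never
        # touches a pipe and cannot block on one.
        return subprocess.call(cmd, stdin=src, stdout=dst)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        in_path = os.path.join(tmp, "input.txt")
        out_path = os.path.join(tmp, "output.txt")
        with open(in_path, "wb") as f:
            for _ in range(100000):
                f.write(b"some line of input\n")
        status = run_via_files(CHILD_CMD, in_path, out_path)
        print("exit status:", status)
```

    This trades disk I/O for simplicity. It only helps when the output is not needed until the child has finished.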

    Update: If you decide to hack around with threads, here is an example parent and child process implementation that you can tailor to your needs:

    parent.py:

    #!/usr/bin/env python2
    import os
    import sys
    import subprocess
    import threading
    import Queue
    
    
    class MyStreamingSubprocess(object):
        def __init__(self, *argv):
            self.process = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
            self.stdin_queue = Queue.Queue()
            self.stdout_queue = Queue.Queue()
            self.stdin_thread = threading.Thread(target=self._stdin_writer_thread)
            self.stdout_thread = threading.Thread(target=self._stdout_reader_thread)
            self.stdin_thread.start()
            self.stdout_thread.start()
    
        def process_item(self, item):
            self.stdin_queue.put(item)
            return self.stdout_queue.get()
    
        def terminate(self):
            self.stdin_queue.put(None)
            self.process.terminate()
            self.stdin_thread.join()
            self.stdout_thread.join()
            return self.process.wait()
    
        def _stdin_writer_thread(self):
            while 1:
                item = self.stdin_queue.get()
                if item is None:
                    # signaling the child process that the end of the
                    # input has been reached: some console progs handle
                    # the case when reading from stdin returns empty string
                    self.process.stdin.close()
                    break
                try:
                    self.process.stdin.write(item)
                except IOError:
                    # making sure that the current self.process_item()
                    # call doesn't deadlock
                    self.stdout_queue.put(None)
                    break
    
        def _stdout_reader_thread(self):
            while 1:
                try:
                    output = self.process.stdout.readline()
                except IOError:
                    output = None
                self.stdout_queue.put(output)
                # output is empty string if the process has
                # finished or None if an IOError occurred
                if not output:
                    break
    
    
    if __name__ == '__main__':
        child_script_path = os.path.join(os.path.dirname(__file__), 'child.py')
        process = MyStreamingSubprocess(sys.executable, '-u', child_script_path)
        try:
            while 1:
                item = raw_input('Enter an item to process (leave empty and press ENTER to exit): ')
                if not item:
                    break
                result = process.process_item(item + '\n')
                if result:
                    print('Result: ' + result)
                else:
                    print('Error processing item! Exiting.')
                    break
        finally:
            print('Terminating child process...')
            process.terminate()
            print('Finished.')
    

    child.py:

    #!/usr/bin/env python2
    import sys
    
    while 1:
        item = sys.stdin.readline()
        if not item:
            # readline() returns '' at EOF, i.e. when the parent closed stdin
            break
        sys.stdout.write('Processed: ' + item)
    

    Note: IOError is caught in the reader and writer threads to handle the cases where the child process exits, crashes, or is killed.
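
    This failure mode can be shown in isolation. A minimal Python 3 sketch, assuming a child that exits without reading its input; `write_all` is an illustrative helper name. In Python 3, `IOError` is an alias of `OSError`, and on POSIX systems a write to a pipe whose reader is gone raises `BrokenPipeError`, a subclass of it:

```python
import subprocess
import sys

def write_all(proc, data):
    """Write `data` to proc.stdin and close it.

    Returns True on success, False if the child stopped reading
    (for example because it exited, crashed, or was killed).
    """
    try:
        proc.stdin.write(data)
        proc.stdin.close()  # flushes buffered data and signals EOF
        return True
    except OSError:  # includes BrokenPipeError
        try:
            proc.stdin.close()
        except OSError:
            pass  # a pending flush can fail again; nothing more to do
        return False

if __name__ == "__main__":
    # Stand-in child that exits immediately without reading stdin.
    proc = subprocess.Popen([sys.executable, "-c", "pass"],
                            stdin=subprocess.PIPE)
    proc.wait()  # make sure the read end of the pipe is really gone
    ok = write_all(proc, b"x" * (1 << 20))
    print("all data written:", ok)
```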

  • 2020-12-07 00:04

    To avoid the deadlock in a portable way, write to the child in a separate thread:

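    #!/usr/bin/env python
    import sys
    from subprocess import Popen, PIPE
    from threading import Thread
    
    def pump_input(pipe, lines):
        # runs in a background thread: feed the child, then close its
        # stdin (via the with-statement) so that it sees EOF
        with pipe:
            for line in lines:
                pipe.write(line)
    
    # `path` is the command from the question; `lines` is an iterable of
    # newline-terminated text lines to send to the child
    p = Popen(path, stdin=PIPE, stdout=PIPE, bufsize=1, universal_newlines=True)
    Thread(target=pump_input, args=[p.stdin, lines]).start()
    with p.stdout:
        for line in iter(p.stdout.readline, ''): # read output until EOF
            sys.stdout.write(line)
    p.wait()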
    

    See "Python: read streaming input from subprocess.communicate()".
