Reading stdout process in real time

Submitted by 久未见 on 2019-12-31 12:31:10

Question


Let's consider this snippet:

from subprocess import Popen, PIPE, CalledProcessError


def execute(cmd):
    with Popen(cmd, shell=True, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            print(line, end='')

    if p.returncode != 0:
        raise CalledProcessError(p.returncode, p.args)

base_cmd = [
    "cmd", "/c", "d:\\virtual_envs\\py362_32\\Scripts\\activate",
    "&&"
]
cmd1 = " ".join(base_cmd + ['python -c "import sys; print(sys.version)"'])
cmd2 = " ".join(base_cmd + ["python -m http.server"])

If I run execute(cmd1), the output is printed without any problems.

However, if I run execute(cmd2) instead, nothing is printed. Why is that, and how can I fix it so that I can see http.server's output in real time?

Also, how is for line in p.stdout evaluated internally? Is it some sort of endless loop until it reaches stdout EOF?

This topic has already been addressed a few times here on SO, but I haven't found a Windows solution yet. The above snippet is in fact code from this answer, and I am trying to run http.server from a virtualenv (Python 3.6.2 32-bit on Windows 7).


Answer 1:


If you want to read continuously from a running subprocess, you have to make that process's output unbuffered. Since your subprocess is a Python program, this can be done by passing -u to the interpreter:

python -u -m http.server

This is how it looks on a Windows box.
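As a sketch of why -u matters: the following spawns a short-lived child in unbuffered mode and streams its lines as they arrive. The child script here is invented for the demonstration; when a Python child's stdout is a pipe rather than a terminal, it is block-buffered unless the interpreter is started with -u.

```python
import sys
from subprocess import Popen, PIPE

# Made-up child program: prints a few lines with pauses. Without -u,
# a piped child would hold all of them in its buffer until exit.
CHILD = "import time\nfor i in range(3):\n    print('tick', i)\n    time.sleep(0.1)\n"

def stream(cmd):
    """Collect the child's stdout lines as they become readable."""
    lines = []
    with Popen(cmd, stdout=PIPE, universal_newlines=True, bufsize=1) as p:
        for line in p.stdout:
            lines.append(line.rstrip())
    return lines

# With -u each line reaches the parent as soon as it is printed;
# without it, all three would arrive only when the child exits.
unbuffered = stream([sys.executable, '-u', '-c', CHILD])
print(unbuffered)
```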




Answer 2:


With this code, you can't see the real-time output because of buffering:

for line in p.stdout:
    print(line, end='')

But if you use p.stdout.readline() it should work:

while True:
    line = p.stdout.readline()
    if not line:
        break
    print(line, end='')
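Run end-to-end against a short-lived child, the pattern looks like the sketch below; the child command is an assumption made for the demo.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child for the demo: prints three numbered lines.
cmd = [sys.executable, '-u', '-c', "for i in range(3): print(i)"]

collected = []
with Popen(cmd, stdout=PIPE, universal_newlines=True, bufsize=1) as p:
    while True:
        line = p.stdout.readline()
        if not line:          # readline() returns '' only at EOF
            break
        collected.append(line.rstrip())
        print(line, end='')
```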

See the corresponding Python bug discussion for details.

UPD: here you can find an almost identical problem with various solutions on Stack Overflow.




Answer 3:


How is for line in p.stdout evaluated internally? Is it some sort of endless loop until it reaches stdout EOF?

p.stdout is a (blocking) buffer. When you read from an empty buffer, you are blocked until something is written to it. Once something is in it, you get the data and execute the body of the loop.

Think of how tail -f works on Linux: it waits until something is written to the file, and then echoes the new data to the screen. What happens when there is no data? It waits. So when your program gets to this line, it waits for data and then processes it.
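The blocking behaviour can be observed directly. In this sketch (the child command is invented for the demo) the child sleeps before printing, and the parent's read cannot return until the data exists:

```python
import sys
import time
from subprocess import Popen, PIPE

# Child sleeps for half a second, then prints one line.
cmd = [sys.executable, '-u', '-c',
       "import time; time.sleep(0.5); print('ready')"]

start = time.monotonic()
with Popen(cmd, stdout=PIPE, universal_newlines=True) as p:
    line = p.stdout.readline()   # blocks here until the child writes
elapsed = time.monotonic() - start

# The read could not complete before the child's sleep finished.
print(line.strip(), elapsed)
```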

Since your code works with cmd1 but not when running a module, it has to be related to this somehow. The http.server module probably buffers its output. Try adding the -u parameter to Python to run the process unbuffered:

-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x see man page for details on internal buffering relating to '-u'

Also, you might want to try changing your loop to for line in iter(lambda: p.stdout.read(1), ''):, as this reads one character at a time (the pipe is in text mode) before processing.


Update: The full loop code is

for line in iter(lambda: p.stdout.read(1), ''):
    sys.stdout.write(line)
    sys.stdout.flush()
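A self-contained version of that character-at-a-time loop is sketched below; the child command is invented for the demo. In text mode, read(1) returns one character and '' is the EOF sentinel that stops iter().

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child for the demo: prints a single word.
cmd = [sys.executable, '-u', '-c', "print('ok')"]

chars = []
with Popen(cmd, stdout=PIPE, universal_newlines=True) as p:
    # read(1) yields one character at a time; '' signals EOF
    for ch in iter(lambda: p.stdout.read(1), ''):
        sys.stdout.write(ch)
        sys.stdout.flush()
        chars.append(ch)
```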

Also, you pass your command as a string. Try passing it as a list, with each element in its own slot:

cmd = ['python', '-m', 'http.server', ..]
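A minimal sketch of the list form (using run() for brevity, with a throwaway child command): the list skips the shell entirely, so no quoting or && chaining is involved.

```python
import sys
from subprocess import run, PIPE

# List form: each argument in its own slot, no shell parsing.
result = run([sys.executable, '-c', "print('hello')"],
             stdout=PIPE, universal_newlines=True)
print(result.stdout, end='')
```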



Answer 4:


I think the main problem is that http.server logs its output to stderr. Here is an example with asyncio that reads data from either stdout or stderr.

My first attempt was to use asyncio, a nice API which has existed since Python 3.4. Later I found a simpler solution, so you can choose; both of them should work.

asyncio as a solution

In the background, asyncio uses IOCP, a Windows API for asynchronous I/O.

# inspired by https://pymotw.com/3/asyncio/subprocesses.html

import asyncio
import sys
import time

if sys.platform == 'win32':
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)

async def run_webserver():
    buffer = bytearray()

    # start the webserver without buffering (-u), with stdout and stderr as pipes
    print('launching process')
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-u', '-mhttp.server',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )

    print('process started {}'.format(proc.pid))
    while 1:
        # wait either for stderr or stdout and loop over the results
        for line in asyncio.as_completed([proc.stderr.readline(), proc.stdout.readline()]):
            print('read {!r}'.format(await line))

event_loop = asyncio.get_event_loop()
try:
    event_loop.run_until_complete(run_webserver())
finally:
    event_loop.close()

Redirecting stderr to stdout

Based on your example, this is a really simple solution: it just redirects stderr to stdout, and only stdout is read.

from subprocess import Popen, PIPE, CalledProcessError, run, STDOUT
import os

def execute(cmd):
    with Popen(cmd, stdout=PIPE, stderr=STDOUT, bufsize=1) as p:
        while 1:
            print('waiting for a line')
            print(p.stdout.readline())

cmd2 = ["python", "-u", "-m", "http.server"]

execute(cmd2)
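That http.server writes to stderr is the key observation here. The merge itself can be verified with a short-lived child that writes only to stderr (the child code is invented for the demo): with stderr=STDOUT, its message shows up on the stdout pipe.

```python
import sys
from subprocess import Popen, PIPE, STDOUT

# Child writes to stderr only; stderr=STDOUT folds it into stdout.
cmd = [sys.executable, '-u', '-c',
       "import sys; print('from stderr', file=sys.stderr)"]

with Popen(cmd, stdout=PIPE, stderr=STDOUT, universal_newlines=True) as p:
    merged = p.stdout.read()
print(merged, end='')
```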



Answer 5:


You could implement the no-buffer behavior at the OS level.

On Linux, you could wrap your existing command line with stdbuf:

stdbuf -i0 -o0 -e0 YOURCOMMAND

Or on Windows, you could wrap your existing command line with winpty:

winpty.exe -Xallow-non-tty -Xplain YOURCOMMAND

I'm not aware of OS-neutral tools for this.



Source: https://stackoverflow.com/questions/46592284/reading-stdout-process-in-real-time
