Question
I'm trying to parse, in real time, the output of a program that block-buffers its output, which means nothing is available until the process ends. I just need to parse it line by line, filtering and handling the data as it arrives, since the program can run for hours.
I've tried capturing the output with subprocess.Popen(), but, as you may guess, used this way Popen keeps buffering until the process ends.
from subprocess import Popen, PIPE
p = Popen("my noisy stuff", shell=True, stdout=PIPE, stderr=PIPE)
for line in p.stdout.readlines():
    pass  # parse the text and extract data here
So I found pexpect, which prints the output in real time because it treats stdout as a file. I could even pull a dirty trick: write the output to a file and parse it outside the function. But OK, that is too dirty, even for me ;)
import pexpect
import sys
pexpect.run("my noisy stuff", logfile=sys.stdout)
But I guess there should be a better, more Pythonic way to do this: just handle stdout the way subprocess.Popen does. How can I do this?
EDIT:
Running J.F.'s proposal:
This is a deliberately wrong audit; it takes about 25 seconds to stop.
from subprocess import Popen, PIPE
command = "bully mon0 -e ESSID -c 8 -b aa:bb:cc:dd:ee:00 -v 2"
p = Popen(command, shell=True, stdout=PIPE, stderr=PIPE)
for line in iter(p.stdout.readline, b''):
    print "inside loop"
    print line
print "outside loop"
p.stdout.close()
p.wait()
#$ sudo python SCRIPT.py
### <= 25 secs later......
# inside loop
#[!] Bully v1.0-21 - WPS vulnerability assessment utility
#inside loop
#[!] Using 'ee:cc:bb:aa:bb:ee' for the source MAC address
#inside loop
#[X] Unable to get a beacon from the AP, possible causes are
#inside loop
#[.] an invalid --bssid or -essid was provided,
#inside loop
#[.] the access point isn't on channel '8',
#inside loop
#[.] you aren't close enough to the access point.
#outside loop
Using this method instead. EDIT: because of long delays and timeouts in the output, I had to adjust the child and add some hacks, so the final code looks like this:
import pexpect
child = pexpect.spawn(command)
child.maxsize = 1  # turns off buffering
child.timeout = 50  # default is 30, not enough for me; crashes were due to this parameter
for line in child:
    print line,
child.close()
This gives the same output, but prints the lines in real time. So... SOLVED. Thanks @J.F. Sebastian
Answer 1:
.readlines() reads all lines. No wonder you don't see any output until the subprocess ends. You could use .readline() instead to read line by line as soon as the subprocess flushes its stdout buffer:
from subprocess import Popen, PIPE
p = Popen("my noisy stuff", stdout=PIPE, bufsize=1)
for line in iter(p.stdout.readline, b''):
    # process line
    pass
p.stdout.close()
p.wait()
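As a self-contained illustration of the readline loop above, here is a minimal Python 3 sketch. The helper name stream_lines and the CHILD command are hypothetical: CHILD is a short inline Python script, run with -u so its stdout is unbuffered, standing in for the noisy program.

```python
import subprocess
import sys


def stream_lines(argv):
    """Yield decoded lines from the child's stdout as they arrive."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        # readline() returns as soon as one complete line is available;
        # the b"" sentinel marks EOF, i.e. the child closed its stdout.
        for raw in iter(proc.stdout.readline, b""):
            yield raw.decode(errors="replace").rstrip("\r\n")
    finally:
        proc.stdout.close()
        proc.wait()


# Hypothetical stand-in for the noisy program: prints three lines slowly.
CHILD = [
    sys.executable, "-u", "-c",
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i)\n"
    "    time.sleep(0.1)\n",
]

if __name__ == "__main__":
    for line in stream_lines(CHILD):
        print("got:", line)
```

This only delivers lines promptly if the child itself flushes them. A child that block-buffers when writing to a pipe will still appear to stall, which is the behavior reported in the question.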
If you already have pexpect, you could use it to work around the block-buffering issue:
import pexpect
child = pexpect.spawn("my noisy stuff", timeout=None)
for line in child:
    # process line
    pass
child.close()
See also the stdbuf- and pty-based solutions in the question I've linked in the comments.
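For reference, a pty-based approach might look roughly like the Python 3 sketch below. It is POSIX-only and is not the code from the linked question; the helper name stream_lines_via_pty and the CHILD command are hypothetical. The idea is to attach the child's stdout to a pseudo-terminal, so that programs which line-buffer on a terminal do so here as well.

```python
import os
import pty
import subprocess
import sys


def stream_lines_via_pty(argv):
    """Run argv with stdout on a pseudo-terminal and yield its lines."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave_fd, close_fds=True)
    os.close(slave_fd)  # the parent keeps only the master end
    pending = b""
    try:
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                # On Linux, reading the master after the child has exited
                # raises EIO instead of returning b"".
                break
            if not chunk:
                break
            pending += chunk
            while b"\n" in pending:
                line, pending = pending.split(b"\n", 1)
                # The terminal layer usually turns "\n" into "\r\n".
                yield line.rstrip(b"\r").decode(errors="replace")
        if pending:
            yield pending.decode(errors="replace")
    finally:
        os.close(master_fd)
        proc.wait()


# Hypothetical stand-in for the noisy program; note there is no -u flag,
# so it relies on stdout being line-buffered when attached to a terminal.
CHILD = [
    sys.executable, "-c",
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i)\n"
    "    time.sleep(0.1)\n",
]

if __name__ == "__main__":
    for line in stream_lines_via_pty(CHILD):
        print("got:", line)
```

The stdbuf route is simpler when it applies: prefixing the command with stdbuf -oL asks for line-buffered stdout, but it only affects programs that rely on the C stdio defaults, so whether it helps for a given tool is something to verify.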
Source: https://stackoverflow.com/questions/20182827/parsing-pexpect-output