Python read stream

Question

I need a very inexpensive way of reading a buffer with no terminating string (a stream) in Python. This is what I have, but it wastes a lot of CPU time because it is constantly "trying and catching." I really need a new approach.

Here is a reduced working version of my code:

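#!/usr/bin/env python
import fcntl, os, sys

if __name__ == "__main__":
    f = open("/dev/urandom", "r")
    fd = f.fileno()
    # Switch the descriptor to non-blocking mode.
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

    ready = False
    line = ""
    while True:
        try:
            char = f.read(1)  # one character at a time
            if char == '\r':
                continue
            elif char == '\n':
                ready = True
            else:
                line += char
        except IOError:
            # No data available yet: retry immediately (this is the busy loop).
            continue
        if ready:
            print line
            # Start collecting the next line.
            ready = False
            line = ""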

Don't run this in a terminal; it is only for illustration. /dev/urandom will break your terminal because it emits a lot of random characters that the terminal emulator interprets no matter what (which can change your current shell's settings, title, etc.). I was actually reading from a GPS connected via USB.

The problem: this uses 100% of the CPU whenever it can. I have also tried this:

#!/usr/bin/env python
import fcntl, os, sys

if __name__ == "__main__":
    f = open("/dev/urandom", "r")
    fd = f.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

    for line in f.readlines():
        print line

However, I get IOError: [Errno 11] Resource temporarily unavailable. I have tried using Popen, among other things, and I am at a loss. Can someone please provide a solution (and please explain everything, as I am not a pro)? Also, I should note that this is for Unix (particularly Linux, but it must be portable across all versions of Linux).
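
Errno 11 is EAGAIN on Linux: a non-blocking descriptor reports it whenever a read finds no data ready, instead of waiting. A minimal sketch of that behaviour, written for Python 3 and using an empty pipe as a stand-in for the device (the helper name `try_read` is made up for illustration):

```python
import errno
import os


def try_read(fd, size=1):
    """Attempt one read on fd; return the bytes, or None if no data is ready."""
    try:
        return os.read(fd, size)
    except BlockingIOError as exc:
        # Python 3 maps EAGAIN/EWOULDBLOCK to BlockingIOError.
        assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return None


read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)   # same effect as setting O_NONBLOCK via fcntl

first = try_read(read_fd)         # pipe is empty, so nothing is ready
os.write(write_fd, b"x")
second = try_read(read_fd)        # one byte is now waiting

os.close(read_fd)
os.close(write_fd)
```

A loop that retries immediately after each such failure never sleeps, which is consistent with the 100% CPU usage described above.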

Answer 1:

You will want to set the buffering mode to the size of the chunk you want to read when opening the file stream. From the Python documentation:

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)

"buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size of a fixed-size chunk buffer."

You also want to use the readable() method in the while loop, to avoid unnecessary resource consumption.
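
To make the quoted buffering policy concrete, here is a small sketch (Python 3) that opens the same file with different `buffering` values and reports what comes back. It uses a temporary file in place of a real device, and `describe_buffering` is an illustrative name, not part of the `io` module.

```python
import io
import os
import tempfile


def describe_buffering(path):
    """Return what io.open produces for several buffering values."""
    result = {}
    # buffering=0 is only allowed in binary mode and returns the raw stream.
    with io.open(path, "rb", buffering=0) as raw:
        result["unbuffered"] = type(raw).__name__
    # An integer > 1 requests a fixed-size chunk buffer of that many bytes.
    with io.open(path, "rb", buffering=4096) as buffered:
        result["chunked"] = type(buffered).__name__
    # buffering=1 selects line buffering and is only usable in text mode.
    with io.open(path, "r", buffering=1) as text:
        result["line_buffered"] = text.line_buffering
    return result


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one\ntwo\n")
    tmp_path = tmp.name

modes = describe_buffering(tmp_path)
os.remove(tmp_path)
```

In this sketch the unbuffered case yields a raw `FileIO` object, while the chunked case yields a `BufferedReader` wrapped around one.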

However, I advise you to use buffered streams such as io.BytesIO or io.BufferedReader.
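
As one possible reading of that advice, the sketch below (Python 3) wraps a raw, unbuffered descriptor in `io.BufferedReader`, so that line reads are served from an in-memory buffer instead of costing one system call each. A pipe stands in for the device, and the 4096-byte buffer size is an arbitrary choice.

```python
import io
import os

read_fd, write_fd = os.pipe()
os.write(write_fd, b"first line\nsecond line\n")
os.close(write_fd)                      # closing the writer marks end of stream

raw = io.FileIO(read_fd, mode="r")      # raw stream: every read is a system call
reader = io.BufferedReader(raw, buffer_size=4096)

lines = []
while True:
    line = reader.readline()            # served from the buffer when possible
    if not line:                        # b"" means end of stream
        break
    lines.append(line.rstrip(b"\n"))

reader.close()                          # also closes the raw stream and the fd
```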

More info in the docs.

Answer 2:

The simple solutions are the best:

with open('/dev/urandom', 'r') as f:
    for line in f:
        print line.encode('hex')  # Don't mess up my terminal

Or, alternatively:

with open('/dev/urandom', 'r') as f:
    for line in iter(f.readline, ''):
        print line.encode('hex')  # Don't mess up my terminal

Notes:

Leave the file descriptor in blocking mode, so the OS can block your process (and save CPU time) when there is no data available.
It is important to use an iterator in the loop. Consider for line in f.readlines(): here f.readlines() reads all of the data, puts it in a list, and returns that list. Since we have infinite data, f.readlines() will never return. In contrast, iterating over f itself is lazy: it fetches only as much data as it needs to satisfy the next loop iteration (plus a little more as a performance buffer).
The first version reads ahead and buffers enough data to print several lines. The second version returns each line immediately. Use the first version if conserving CPU is your primary concern. Use the second if interactive response time is your primary concern.
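
The difference between the lazy forms and `readlines()` can be sketched as follows (Python 3, binary mode). A pipe fed by a background thread plays the role of the endless source; `take_lines` and `feed` are hypothetical helpers. Because the reading end is left in blocking mode, the reader simply sleeps inside `readline()` whenever no data is available.

```python
import itertools
import os
import threading


def take_lines(stream, count):
    """Return the first `count` lines without reading the stream to its end."""
    # iter(callable, sentinel) calls stream.readline until it returns b"" (EOF).
    lazy_lines = iter(stream.readline, b"")
    return list(itertools.islice(lazy_lines, count))


def feed(fd, total):
    """Write `total` numbered lines to fd, then close it."""
    with os.fdopen(fd, "wb") as writer:
        for number in range(total):
            writer.write(b"line %d\n" % number)


read_fd, write_fd = os.pipe()
producer = threading.Thread(target=feed, args=(write_fd, 1000))
producer.start()

with os.fdopen(read_fd, "rb") as stream:
    first_two = take_lines(stream, 2)
    rest = sum(1 for _ in stream)       # drain the remainder so the writer can finish

producer.join()
```

`take_lines` returns as soon as two lines have arrived, whereas a call to `stream.readlines()` would not return until the writer closed its end.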

Demonstration:

$ python x.py  | head -2l
eb99f1b3bf74eead42750c63cb7c16160fa7e21c94b176dc6fd2d6796a1428dc8c5d15f13e3c1d5969cb59317eaba37a97f4719bb3de87919009da013fa06ae738408478bc15c750850744a4edcc27d155749d840680bf3a827aafbe9be84e7c8e2fe5785d2305cbedd76454573ca9261ac9a480f71242baa94e8d4bdf761705a6a0fea1ba2b1502066b2538a62776e9165043e5b7337d45773d009fd06d15ca0d9b51af499c1c9d7684472272a7361751d220848874215bc494456b08910e9815fc533d3545129aad4f3f126dc5341266ca4b85ea949794cacaf16409bcd02263b08613190b3f69caa68a47758345dafb10121cfe6ed6c8098142682aef47d1080bd2e218b571824bf2fa5d0bb5297278be8a9a2f55b554631c99e5f1d9040c5bc2bde9a40c8b6e95fc47be6ea9235243582f2367893d15a1494f732d0346ec6184a366f8035aef9141c638128444b1549a64937697b1a170e648d20f336e352076893fa7265c8fa0f4e2207e87410e53b43a51aa146ac6c2decf274a45a58c4e442aececf28879a3e0b4a1278eac7a4f969b3f74e2f2a2064a55ff112c4c49092366dbaa125703962ec5083d09cdb750c0e1dbe34cadda66709f98ff63faccf0045993137bfaca949686bc395bbafb7cf9b5b3475a0c91bdea8cec4e9ac1a9c96e0b81c1c5f242ae72cdea4c073db0351322f9da31203ea34d1b6f298128435797f4846a53b0733069060680dbc2b44c662c4b685ced5419b65c01df41cc2dd9877dc2a97a965174d508a3c9275d8aee7f2991bbb06ca7e0010b0e5b9468aed12f5d2c9a65091223547b8655211df435ffbf24768d48c7e7cf3cb7225f2c116e94a8602078f2b34dab6852f57708e760f88f4085ec7dade19ed558a539f830adea1b81f46303789224802f1f090ec0ff59e291246f1287672b0035df07c359d2ada48e674622f61c0f456c36d130fb6cf7f529e7c4dfceccc594ba5e812a3250e022eca9576a5a8b31c0be13969841d5a4d52b10a7dc8ddd1cac279500cb66e3b244e7d1e042249fd8adf2a90fa8bee74378d79a3d55c6fcf6cc19aa85ffb078dba23ca88ea6810d4a1c5d98b3b33e68ddd41c881df167c36ab2e1b081849781e08e3a026fbd3755acf9f215e0402cbf1a021300f5c883f86a05d467479172109a8f20f93c8c255915a264463eb113c3e8d07d0cec31aa8c0f978a0e7e65c142e85383befd6679c69edd2c56599f15580bbb356d98cfdf012dbc6d1dd6c0dbcfe6f8235d3d5c015fb94d8cc29afdf5d69e33d0e5078d651782546bc2acccab9f35e595f0951a139526ae5651a3ebbec353e99f9ddd1615ed25529500dabe8bf6f12ee6b21a437caca12a6d9688986d94fb7c103dca1572350900e56276b857630a
9d024ef4454dcd5e35dd605a2d49c26ce44fae87ab33e7a158d328521c7d77969908ec5b67f01bf8e2c330dcb70b5f3def8e6d4b010c6d31e4cbe7478657782f10b6fc2d77e8ff7a2f1e590827827e1037b33b0a
Traceback (most recent call last):
  File "x.py", line 4, in <module>
    print line.encode('hex')  # Don't mess up my terminal
IOError: [Errno 32] Broken pipe
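
The traceback at the end is expected: once `head` has printed its two lines it exits and closes the pipe, so the next write fails with EPIPE. A hedged sketch of one way to stop quietly instead (Python 3, where the error surfaces as `BrokenPipeError`; `write_until_closed` is an illustrative name, and a pipe whose reading end gets closed stands in for `head`):

```python
import os


def write_until_closed(fd, chunks):
    """Write chunks to fd; stop and return the count written once the reader is gone."""
    written = 0
    for chunk in chunks:
        try:
            os.write(fd, chunk)
        except BrokenPipeError:
            # The reading end was closed (as `head` does after its last line).
            break
        written += 1
    return written


read_fd, write_fd = os.pipe()
before = write_until_closed(write_fd, [b"a\n"])         # reader still open
os.close(read_fd)                                       # simulate the consumer exiting
after = write_until_closed(write_fd, [b"b\n", b"c\n"])  # reader is gone
os.close(write_fd)
```

This relies on Python's default of ignoring SIGPIPE, which is what turns the failed write into an exception.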

Answer 3:

I decided to use io. I noticed that this is much more accurate than even a while True: loop. The GPS that I am reading from is supposed to emit data every second, but with the code I posted in my question the interval was really anywhere from 0.95 to 1.05 seconds.

However, when I simply do

#!/usr/bin/env python

import io

if __name__ == "__main__":
    f = io.open("/dev/ttyUSB0")
    while True:
        print f.readline().strip()

It not only blocks while waiting (which saves CPU time and does all sorts of good), but it also apparently keeps the buffer extremely up to date, because it seems to produce results almost exactly one second apart (which is how often my GPS, like most, updates).
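
A Python 3 sketch of the same blocking loop, with one addition that is not part of the answer: `readline()` returns an empty string at end of stream (for example, if the device goes away), so the loop stops there instead of spinning on empty results. `read_sentences` is a hypothetical helper, and a pipe carrying made-up sample lines stands in for `/dev/ttyUSB0`.

```python
import io
import os


def read_sentences(stream):
    """Yield stripped, non-empty lines from a blocking text stream until EOF."""
    while True:
        line = stream.readline()    # blocks here until a full line arrives
        if line == "":              # empty string means the stream has ended
            return
        line = line.strip()
        if line:                    # skip blank lines such as a bare "\r\n"
            yield line


read_fd, write_fd = os.pipe()
os.write(write_fd, b"fix 1\r\n\r\nfix 2\r\n")   # made-up sample data
os.close(write_fd)

with io.open(read_fd, "r", encoding="ascii") as device:
    sentences = list(read_sentences(device))
```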

That class would be a true miracle if it were the only way to do this. In fact, one could just use open(file, "r"), and it works fine (which angers me, because I spent almost an entire day on this).

Source: https://stackoverflow.com/questions/26127889/python-read-stream

Tags

python

filestream