How do I implement a FIFO buffer to which I can efficiently add arbitrarily sized chunks of bytes to the head and from which I can efficiently pop arbitrarily sized chunks of by
... but removing bytes from the beginning is very slow, because a new StringIO object holding a copy of the entire previous buffer minus the first chunk of bytes must be created.
This kind of slowness can be avoided by using bytearray in Python >= 3.4.
See the discussion in this issue; the patch is here.
The key is that removing the head byte(s) from a bytearray with
a[:1] = b'' # O(1) (amortized)
is much faster than
a = a[1:] # O(len(a))
when len(a) is huge (say 10**6).
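To check the difference yourself, a rough timing sketch could look like the one below. The helper names drain_in_place and drain_by_copy are made up for this illustration, and the absolute numbers will depend on your machine and CPython version.

```python
import timeit

def drain_in_place(n, chunk=4, steps=1000):
    """Remove `steps` chunks from the head using slice assignment."""
    a = bytearray(b'a' * n)
    for _ in range(steps):
        a[:chunk] = b''      # shrinks the same object in place
    return len(a)

def drain_by_copy(n, chunk=4, steps=1000):
    """Remove `steps` chunks from the head by re-slicing."""
    a = bytearray(b'a' * n)
    for _ in range(steps):
        a = a[chunk:]        # builds a new bytearray holding the whole tail
    return len(a)

n = 10**6
for fn in (drain_in_place, drain_by_copy):
    t = timeit.timeit(lambda: fn(n), number=1)
    print(fn.__name__, round(t, 4), 'sec')
```

Both helpers leave the same number of bytes behind; only the cost per removal differs.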
The bytearray also gives you a convenient way to preview the whole data set as an array (i.e. itself), in contrast to a deque container, which needs to join its objects into a single chunk first.
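As a small illustration of that contrast (the sample chunks and variable names are arbitrary, chosen only for this sketch):

```python
from collections import deque

chunks = [b'abc', b'defg', b'hi']

# deque of chunks: previewing all buffered bytes requires joining them
dq = deque(chunks)
preview_dq = b''.join(dq)        # allocates a new bytes object

# bytearray: the buffer is already one contiguous array
ba = bytearray()
for c in chunks:
    ba.extend(c)
preview_ba = ba                  # no copy; this is the buffer itself

# A memoryview gives a zero-copy window onto part of the buffer.
# Note: a bytearray cannot be resized while such a view is alive,
# so release the view before putting or getting more data.
view = memoryview(ba)[2:5]
window = bytes(view)             # copy out bytes 2..4 for printing
view.release()

print(preview_dq, window)
```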
Now an efficient FIFO can be implemented as follows:
class byteFIFO:
    """ byte FIFO buffer """
    def __init__(self):
        self._buf = bytearray()

    def put(self, data):
        # Append bytes to the tail of the buffer
        self._buf.extend(data)

    def get(self, size):
        # Copy out the first `size` bytes, then drop them from the head
        data = self._buf[:size]
        # The fast delete syntax
        self._buf[:size] = b''
        return data

    def peek(self, size):
        # Copy of the first `size` bytes; the buffer is left unchanged
        return self._buf[:size]

    def getvalue(self):
        # peek with no copy
        return self._buf

    def __len__(self):
        return len(self._buf)
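A short usage sketch follows. The class is repeated in condensed form only so that the snippet runs on its own, and the sample data is arbitrary.

```python
class byteFIFO:
    """Condensed copy of the class above, repeated so this runs alone."""
    def __init__(self):
        self._buf = bytearray()
    def put(self, data):
        self._buf.extend(data)
    def get(self, size):
        data = self._buf[:size]
        self._buf[:size] = b''
        return data
    def peek(self, size):
        return self._buf[:size]
    def __len__(self):
        return len(self._buf)

fifo = byteFIFO()
fifo.put(b'hello')
fifo.put(b' world')

head = fifo.peek(5)      # copy of the first 5 bytes; buffer is unchanged
first = fifo.get(6)      # removes and returns the first 6 bytes
rest = fifo.get(100)     # asking for more than is buffered returns what is left

print(head, first, rest, len(fifo))
```

Note that get and peek return bytearray objects, which compare equal to bytes with the same content; wrap the result in bytes(...) if you need an immutable value.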
Benchmark
import time
bfifo = byteFIFO()
bfifo.put(b'a'*1000000) # a very long array
t0 = time.time()
for k in range(1000000):
    d = bfifo.get(4) # "pop" from head
    bfifo.put(d) # "push" in tail
print('t = ', time.time()-t0) # t = 0.897 on my machine
The circular/ring buffer implementation in Cameron's answer needs 2.378 sec, and his/her original implementation needs 1.108 sec.