Question
I'm developing a client which will receive the [EEG] data over TCP and write it to a ring buffer. I thought it would be very convenient to have the buffer as a ctypes or numpy array, because it's possible to create a numpy 'view' onto any location of such a buffer and read/write/process the data without any copying. Or is it a bad idea in general?
However, I don't see how to implement a circular buffer of a fixed size this way. Suppose I have created a buffer object which is contiguous in memory. What is the best way to write the data when the end of the buffer is reached?
One possible way is to start overwriting the (already old) bytes from the beginning when the write pointer reaches the end of the buffer array. Near the boundaries, however, a numpy view of some chunk (for processing) can't be created in this case (or can it?), because part of it may still be located at the end of the buffer array while the rest is already at its beginning. I've read it's impossible to create such circular slices. How can I solve this?
UPD: Thanks everybody for the answers. In case somebody also faces the same problem, here's the final code I've got.
Answer 1:
If you need a window of N bytes, make your buffer 2*N bytes and write all input to two locations: i % N and i % N + N, where i is a byte counter. That way you always have N consecutive bytes in the buffer.
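data = 'Data to buffer'
N = 4
buf = 2*N*['\00']
for i, c in enumerate(data):
    j = i % N
    # write every character twice, N slots apart
    buf[j] = c
    buf[j+N] = c
    if i >= N-1:
        # the last N characters are always contiguous in this slice
        print(''.join(buf[j+1:j+N+1]))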
prints (some windows begin or end with a space)
Data
ata 
ta t
a to
 to 
to b
o bu
 buf
buff
uffe
ffer
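The same double-write idea carries over to a NumPy array, which is what the question asks about. The following is only a minimal sketch under assumptions: the class name `MirroredRing` and its methods `push` and `window` are hypothetical names invented for this illustration, and samples are pushed one at a time for clarity rather than speed.

```python
import numpy as np

class MirroredRing:
    """Hypothetical helper: keeps the last n samples contiguous
    by writing every sample to two slots, n apart."""

    def __init__(self, n, dtype=float):
        self.n = n
        self.buf = np.zeros(2 * n, dtype=dtype)
        self.count = 0  # total number of samples written so far

    def push(self, sample):
        j = self.count % self.n
        self.buf[j] = sample           # first copy
        self.buf[j + self.n] = sample  # mirrored copy
        self.count += 1

    def window(self):
        """Return a view (no copy) of the last n samples, oldest first."""
        j = self.count % self.n
        return self.buf[j:j + self.n]

ring = MirroredRing(4, dtype=int)
for x in range(1, 7):
    ring.push(x)
w = ring.window()
print(w)                              # [3 4 5 6]
print(np.shares_memory(w, ring.buf))  # True: a plain slice is a view
```

The price is twice the memory and two writes per sample. Until n samples have been pushed, the window is padded at the front with the initial zeros.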
Answer 2:
"One possible way is to start overwriting the (already old) bytes from the beginning when the write pointer reaches the end of the buffer array."
That's the only option in a fixed-size ring buffer.
"I've read it's impossible to create such circular slices."
Which is why I wouldn't do this with a NumPy view. You can create a class wrapper around an ndarray instead, holding the buffer/array, the capacity and a pointer (index) to the insertion point. If you want to get the contents as a NumPy array, you'll have to make a copy like so:
import numpy as np
buf = np.array([1,2,3,4])
indices = [3,0,1,2]
contents = buf[indices]  # indexing with a list of indices returns a copy, not a view
You can still set elements' values in place if you implement __setitem__ and __setslice__ (the latter applies only to Python 2; in Python 3, __setitem__ receives slice objects directly).
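A wrapper of the kind described might look roughly like this. It is a sketch, not the answerer's code: the class name `RingArray`, the attribute `head`, and the methods `push` and `contents` are hypothetical names chosen for illustration, and `__setitem__` is assumed to take logical positions counted from the oldest sample.

```python
import numpy as np

class RingArray:
    """Hypothetical wrapper: fixed-capacity ring buffer backed by an ndarray."""

    def __init__(self, capacity, dtype=float):
        self.capacity = capacity
        self.buf = np.zeros(capacity, dtype=dtype)
        self.head = 0  # index where the next sample will be written

    def push(self, sample):
        self.buf[self.head] = sample
        self.head = (self.head + 1) % self.capacity  # wrap around at the end

    def contents(self):
        """Return the samples oldest first; index-array lookup makes a copy."""
        indices = (self.head + np.arange(self.capacity)) % self.capacity
        return self.buf[indices]

    def __setitem__(self, key, value):
        """Write in place, addressing samples by logical (oldest-first) position."""
        positions = np.arange(self.capacity)[key]  # works for ints and slices
        self.buf[(self.head + positions) % self.capacity] = value

ring = RingArray(4, dtype=int)
for x in range(1, 7):
    ring.push(x)
snapshot = ring.contents()
print(snapshot)         # [3 4 5 6]
ring[0:2] = 0           # overwrite the two oldest samples in place
print(ring.contents())  # [0 0 5 6]
print(snapshot)         # [3 4 5 6], unchanged because it is a copy
```

Reads cost a copy of the whole buffer, but writes go straight into the underlying array, so memory use stays at exactly `capacity` elements.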
Answer 3:
I think you need to take a step back from C-style thinking here. Updating a ring buffer for every single insertion is never going to be efficient. A ring buffer is fundamentally different from the contiguous memory block interface that numpy arrays demand, and that includes the FFT you mention you want to do.
A natural solution is to sacrifice a little bit of memory for the sake of performance. For instance, if the number of elements you need to hold in your buffer is N, allocate an array of N+1024 (or some sensible number). Then you only need to move N elements around every 1024 insertions, and you always have a contiguous view of N elements to act upon directly available.
EDIT: here is a code snippet that implements the above, and should give good performance. Note though, that you would be well advised to append in chunks, rather than per element. Otherwise, the performance advantages of using numpy are quickly nullified, regardless of how you implement your ringbuffer.
import numpy as np

class RingBuffer(object):
    def __init__(self, size, padding=None):
        self.size = size
        self.padding = size if padding is None else padding
        self.buffer = np.zeros(self.size+self.padding)
        self.counter = 0

    def append(self, data):
        """this is an O(n) operation"""
        # keep at most `padding` of the newest elements
        data = data[-self.padding:]
        n = len(data)
        if self.remaining < n: self.compact()
        self.buffer[self.counter+self.size:][:n] = data
        self.counter += n

    @property
    def remaining(self):
        return self.padding-self.counter

    @property
    def view(self):
        """this is always an O(1) operation"""
        return self.buffer[self.counter:][:self.size]

    def compact(self):
        """
        note: only when this function is called, is an O(size) performance hit incurred,
        and this cost is amortized over the whole padding space
        """
        print('compacting')
        self.buffer[:self.size] = self.view
        self.counter = 0

rb = RingBuffer(10)
for i in range(4):
    rb.append([1,2,3])
    print(rb.view)

rb.append(np.arange(15))
print(rb.view)  # test overflow
Answer 4:
A variant of @Janne Karila's answer, for C but not numpy:
If the ring buffer is very wide, like N x 1G, then instead of doubling the whole thing,
double up an array of 2*N pointers to its rows.
E.g. for N=3, initialize
bufp = { buf[0], buf[1], buf[2], buf[0], buf[1], buf[2] };
Then you write data only once, and anyfunc( bufp[j:j+3] ) sees the rows in buf in time order.
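The answer targets C, but the pointer-doubling trick can be imitated in Python with a list of references to row arrays. This is only an illustrative analogue under assumptions: the sizes `N` and `WIDTH`, the names `rows`, `ordered` and `oldest_first`, and the fill pattern are all invented for the example. The result is a list of row arrays, not a single 2-D ndarray; stacking the rows into one array would require a copy.

```python
import numpy as np

N, WIDTH = 3, 5                             # illustrative sizes, not from the answer
rows = [np.zeros(WIDTH) for _ in range(N)]  # the actual row storage
bufp = rows + rows                          # 2*N references; no sample data is duplicated

count = 5
for t in range(count):
    rows[t % N][:] = t                      # each row is written only once per step

j = count % N                               # slot that holds the oldest row
ordered = bufp[j:j + N]                     # N row references in time order
oldest_first = [int(r[0]) for r in ordered]
print(oldest_first)                         # [2, 3, 4]
```

Only the list of references is doubled, so the extra memory is 2*N references regardless of how wide each row is.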
Source: https://stackoverflow.com/questions/8908998/ring-buffer-with-numpy-ctypes