I have a question regarding NumPy's memory views:
Suppose we have two arrays with memory:
import numpy as np
import gc
x = np.arange(4*3).reshape(4,3).astype(float)
y = (np.arange(5) - 5).astype(float)
y_ref = y
We use these (x, y) in a framework, so we cannot just redefine them, as the user may have stored their own references to them (as in y_ref). Now we want to combine their memory in one view, so that the single view, say p, shares its memory with both arrays.
I did it in the following way, but do not know if this causes a memory leak:
p = np.empty(x.size+y.size, dtype=float) # create new memory block with right size
c = 0 # current point in memory
# x
p[c:c+x.size].flat = x.flat # set the memory for combined array p
x.data = p[c:c+x.size].data # now set the buffer of x to be the right length buffer of p
c += x.size
# y
p[c:c+y.size].flat = y.flat # set the memory for combined array p
y.data = p[c:c+y.size].data # and set the buffer of y to be the right length buffer of p
Thus, we can now operate on the single view p or on either of the arrays, without having to redefine every single reference to them:
x[3] = 10
print(p[3*3:4*3])
# [ 10. 10. 10.]
Even y_ref has got the update:
print(y[0]) # -5
y_ref[0] = 100
print(p[x.size]) # 100
Is this the correct way of setting the memory of an array to be a view into another array?
Is there an obvious way of unifying the memory of arrays, which I am blatantly missing?
I am not sure what will happen with the old data buffers of x and y, as they are out of scope now. Will they get deallocated?
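For comparison, here is a minimal sketch of the conventional way to get several arrays sharing one block: allocate the combined buffer first and hand out views into it. It assumes the framework can create the views before anyone takes a reference, which is not the situation described above, so it does not rebind existing references such as y_ref. The names x_view and y_view are made up for illustration.

```python
import numpy as np

x = np.arange(4 * 3).reshape(4, 3).astype(float)
y = (np.arange(5) - 5).astype(float)

# One contiguous block large enough for both arrays.
p = np.empty(x.size + y.size, dtype=float)

# Views into p: reshaping a contiguous 1-D slice returns a view, not a copy.
x_view = p[:x.size].reshape(x.shape)
y_view = p[x.size:x.size + y.size].reshape(y.shape)

# Copy the existing values into the shared block.
x_view[...] = x
y_view[...] = y

# Writes through a view are visible in p (and the other way round).
x_view[3] = 10
y_view[0] = 100
print(p[3 * 3:4 * 3])
print(p[x.size])
print(np.shares_memory(p, x_view), np.shares_memory(p, y_view))
```

No buffer is swapped here, so nothing is freed behind a live reference. The cost is that x and y themselves still own their original memory.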
Update (thanks @Jaime):
p.size can get very large (into the billions) on the datasets I am applying this to (microbiology). Also, this scheme is used in a framework with potentially deep structures, so updating all local versions can get expensive. All parameters need to be updated in an optimization loop, so it is crucial to have everything in memory.
Actually, your approach is what I started with. I moved away from it because traversing the hierarchy in Python to update all local copies was inefficient.
According to the source code, the old data buffer will be freed.
But if the old buffer is still referenced by another array, this will cause a problem:
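import numpy as np
a = np.zeros(10)
b = np.zeros(10)
c = a[:]    # c is a view on a's original buffer
a.data = b  # a's old buffer is freed here, but c still points at it
print(c)    # reads freed memory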
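As a hedged illustration of why this goes wrong, the sketch below inspects the relationship between an array and its view without swapping any buffer. It only uses the documented `base` attribute, `flags.owndata`, and `np.shares_memory`. The point is that a view keeps its owner object alive through `base` but holds its own pointer into the owner's buffer, so releasing that buffer from under the owner leaves the view dangling.

```python
import numpy as np

a = np.zeros(10)
b = np.zeros(10)
c = a[:]  # a view: c borrows a's buffer instead of copying it

print(c.base is a)             # True: c records a as the owner of its memory
print(a.flags.owndata)         # True: a owns (and would free) its buffer
print(c.flags.owndata)         # False: c owns nothing
print(np.shares_memory(a, c))  # True
print(np.shares_memory(a, b))  # False: b is an unrelated allocation
```

A check along these lines cannot find every view held elsewhere in a program, since an array does not track the views created from it.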
Source: https://stackoverflow.com/questions/23650796/numpy-set-array-memory