可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I wrote some code in Python which works fine but is very slow; I think due to the for loops. I hope one can speed up the following operations using numpy commands. Let me define the goal.

Let's assume I have a 2D numpy array all_CMs of dimensions row x col. For instance consider a 6x11 array (see drawing below).

I want to calculate the mean for all rows, i.e. sumCM_tilde)
Now, for each row I want to calculate the mean of some selected values, namely all values below a certain threshold by computing their sum and dividing it by the number of all columns (N). If the value is above this defined threshold, the CM_tilde value (mean of the entire row) is added. This value is called CM
Afterwards, the CM value is subtracted from each element in the row

In addition to this I want to have a numpy array or list where all those CM values are listed.

The figure:

The following code is working but very slow (especially if the arrays getting large)

CM_tilde = np.mean(data, axis=1) N = data.shape[1] data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] )) all_CMs = np.zeros(( data.shape[0], data.shape[2])) for frame in range(data.shape[2]):     for row in range(data.shape[0]):         CM=0         for col in range(data.shape[1]):             if data[row, col, frame] < (CM_tilde[row, frame]+threshold):                CM += data[row, col, frame]             else:                CM += CM_tilde[row, frame]         CM = CM/N         all_CMs[row, frame] = CM         # calculate CM corrected value         for col in range(data.shape[1]):             data_cm[row, col, frame] = data[row, col, frame] - CM     print "frame: ", frame return data_cm, all_CMs

Any ideas?

回答1:

It's quite easy to vectorize what you're doing:

import numpy as np  #generate dummy data nrows=6 ncols=11 nframes=3 threshold=0.3 data=np.random.rand(nrows,ncols,nframes)  CM_tilde = np.mean(data, axis=1) N = data.shape[1]  all_CMs2 = np.mean(np.where(data < (CM_tilde[:,None,:]+threshold),data,CM_tilde[:,None,:]),axis=1) data_cm2 = data - all_CMs2[:,None,:]

Comparing this with your originals:

In [684]: (data_cm==data_cm2).all() Out[684]: True  In [685]: (all_CMs==all_CMs2).all() Out[685]: True

The logic is that we work with arrays of size [nrows,ncols,nframes] simultaneously. The main trick is to make use of python's broadcasting, by turning CM_tilde of size [nrows,nframes] into CM_tilde[:,None,:] of size [nrows,1,nframes]. Python will then use the same values for each column, since that is a singleton dimension of this modified CM_tilde.

By using np.where we choose (based on the threshold) whether we want to get the corresponding value of data, or, again, the broadcast value of CM_tilde. A new use of np.mean allows us to compute all_CMs2.

In the final step we made use of broadcasting by directly subtracting this new all_CMs2 from the corresponding elements of data.

It might help in vectorizing code this way by looking at the implicit indices of your temporary variables. What I mean is that your temporary variable CM lives inside a loop over [nrows,nframes], and its value is reset with each iteration. This means that CM is in effect a quantity CM[row,frame] (later explicitly assigned to the 2d array all_CMs), and from here it's easy to see that you can construct it by summing up an appropriate CMtmp[row,col,frames] quantity along its column dimension. If it helps, you can name the np.where(...) part as CMtmp for this purpose, and then compute np.mean(CMtmp,axis=1) from that. Same result, obviously, but probably more transparent.

回答2:

Here's my vectorization of your function. I worked from the inside out, and commented out earlier versions as I went along. So the first loop that I vectorized has ### comment marks.

It isn't as clean and well reasoned as @Andras's answer, but hopefully it is instructional, giving an idea of how you can address this issue incrementally.

def foo2(data, threshold):     CM_tilde = np.mean(data, axis=1)     N = data.shape[1]     #data_cm = np.zeros(( data.shape[0], data.shape[1], data.shape[2] ))     ##all_CMs = np.zeros(( data.shape[0], data.shape[2]))     bmask = data < (CM_tilde[:,None,:] + threshold)     CM = np.zeros_like(data)     CM[:] = CM_tilde[:,None,:]     CM[bmask] = data[bmask]     CM = CM.sum(axis=1)     CM = CM/N     all_CMs = CM.copy()     """     for frame in range(data.shape[2]):         for row in range(data.shape[0]):             ###print(frame, row)             ###mask = data[row, :, frame] < (CM_tilde[row, frame]+threshold)             ###print(mask)             ##mask = bmask[row,:,frame]             ##CM = data[row, mask, frame].sum()             ##CM += (CM_tilde[row, frame]*(~mask)).sum()              ##CM = CM/N             ##all_CMs[row, frame] = CM             ## calculate CM corrected value             #for col in range(data.shape[1]):             #    data_cm[row, col, frame] = data[row, col, frame] - CM[row,frame]         print "frame: ", frame     """     data_cm = data - CM[:,None,:]     return data_cm, all_CMs

Output matches for this small test case, which more than anything helped me get the dimensions right.

threshold = .1 data = np.arange(4*3*2,dtype=float).reshape(4,3,2)

文章来源: numpy vectorization instead of for loops

标签