Question
I have created a 3D median filter which does work and is the following:
import numpy as np

def Median_Filter_3D(image, kernel):
    window = np.zeros(shape=(kernel, kernel, kernel), dtype=np.uint8)
    n = (kernel - 1) // 2  # integer margin; deals with the image border
    imgout = np.empty_like(image)
    w, h, l = image.shape  # shape is an attribute, not a method
    # Start loop over each pixel
    for x in range(w - n * 2):
        for y in range(h - n * 2):
            for z in range(l - n * 2):
                window[:, :, :] = image[x:x+kernel, y:y+kernel, z:z+kernel]
                med = np.median(window)
                imgout[x+n, y+n, z+n] = med
    return imgout
So at every pixel, it creates a window of size kernel x kernel x kernel, finds the median value of the pixels in the window, and replaces the value of that pixel with the new median value.
My problem is, it's very slow, and I have thousands of big images to process. There must be a faster way to iterate through all these pixels and still get the same result.
Thanks in advance!!
Answer 1:
First, looping over a 3D matrix in Python is a very, very bad idea. In order to loop over a large 3D matrix you are better off going down to Cython or C/C++/Fortran and creating a Python extension. However, for this particular case, scipy already contains an implementation of the median filter for n-dimensional arrays:
>>> from scipy.ndimage import median_filter
>>> median_filter(my_large_3d_array, size=kernel)
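For reference, here is a minimal sketch of how that call replaces the hand-written loop (the array and kernel size below are just placeholders):

import numpy as np
from scipy.ndimage import median_filter

# A stand-in 3D stack; any uint8 volume works the same way.
stack = np.random.randint(0, 256, size=(64, 64, 64), dtype=np.uint8)

# size=3 gives a 3x3x3 window, i.e. kernel=3 in the original function.
filtered = median_filter(stack, size=3)

The interior voxels should match the loop version, but the borders differ: scipy pads the input (mode='reflect' by default), whereas the original function leaves the border voxels of imgout uninitialized.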
In short, there is no faster way of iterating through voxels in pure Python (numpy iterators might help a bit, but won't increase the performance considerably). If you need to perform more complicated 3D operations in Python, you should consider writing the loopy part in Cython or, alternatively, using a chunking library such as Dask, which implements parallel operations over chunks of arrays.
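To illustrate the Dask route, here is a rough sketch (assuming dask.array is installed; the chunk shape and kernel size here are arbitrary choices):

import numpy as np
import dask.array as da
from scipy.ndimage import median_filter

volume = np.random.randint(0, 256, size=(256, 256, 256), dtype=np.uint8)

# Split the volume into chunks that can be filtered in parallel.
darr = da.from_array(volume, chunks=(128, 128, 128))

# map_overlap shares a halo of `depth` voxels between neighbouring chunks,
# so windows crossing chunk boundaries see the same neighbourhood as in the
# unchunked array; depth must be at least kernel // 2.
filtered = darr.map_overlap(median_filter, depth=1, boundary="reflect", size=3).compute()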
The problem with Python is that for loops are extremely slow, especially when they are nested and run over large arrays. Thus, there is no standard pythonic method for obtaining efficient iteration over arrays. Usually, the way of getting speed-ups is through vectorized operations and numpy tricks, but those are very problem-specific and there is no generic trick; you will learn a lot of numpy tricks here on SO.
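One such trick that fits this exact problem: recent numpy versions (1.20+) provide np.lib.stride_tricks.sliding_window_view, which expresses the whole filter without an explicit Python loop. A sketch (note that np.median over the window axes allocates temporaries, so this trades memory for speed, and it assumes kernel is odd and greater than 1):

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_3d_vectorized(image, kernel):
    # A view of every kernel x kernel x kernel window; nothing is copied yet.
    windows = sliding_window_view(image, (kernel, kernel, kernel))
    n = (kernel - 1) // 2  # assumes kernel is odd and > 1
    out = np.empty_like(image)
    # One median per window, reduced over the three trailing window axes.
    out[n:-n, n:-n, n:-n] = np.median(windows, axis=(-3, -2, -1))
    return out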
As a generic approach, if you really need to iterate over arrays, you can write your code in Cython. Cython is a C-like extension for Python: you write code in Python syntax, but you specify variable types (like in C, with int or float). That code is then compiled automatically to C and can be called from Python. A quick example:
Example Python loopy function:
import numpy as np

def iter_A(A):
    B = np.empty(A.shape, dtype=np.float64)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            B[i, j] = A[i, j] * 2
    return B
I know that the above code is somewhat redundant and could be written as B = A * 2, but its purpose is just to illustrate that Python loops are extremely slow.
Cython version of the function:
import numpy as np
cimport numpy as np

def iter_A_cy(double[:, ::1] A):
    cdef Py_ssize_t H = A.shape[0], W = A.shape[1]
    cdef double[:, ::1] B = np.empty((H, W), dtype=np.float64)
    cdef Py_ssize_t i, j
    for i in range(H):
        for j in range(W):
            B[i, j] = A[i, j] * 2
    return np.asarray(B)
Test speeds of both implementations:
>>> import numpy as np
>>> A = np.random.randn(1000, 1000)
>>> %timeit iter_A(A)
1 loop, best of 3: 399 ms per loop
>>> %timeit iter_A_cy(A)
100 loops, best of 3: 2.11 ms per loop
NOTE: you cannot run the Cython function as-is. You need to put it in a separate file and compile it first (or use the %%cython magic in an IPython/Jupyter notebook).
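For completeness, a minimal sketch of the compile step (the file names iter_a_cy.pyx and setup.py are placeholders):

# setup.py -- build with: python setup.py build_ext --inplace
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

setup(
    ext_modules=cythonize("iter_a_cy.pyx"),
    include_dirs=[np.get_include()],  # needed because the module cimports numpy
)

After building, the function can be imported from plain Python with from iter_a_cy import iter_A_cy.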
It shows that the raw Python version took about 400 ms to iterate over the whole array, while the Cython version needed only about 2 ms (roughly a 200x speedup).
Source: https://stackoverflow.com/questions/36353262/i-need-a-fast-way-to-loop-through-pixels-of-an-image-stack-in-python