问题
I have a 2D numpy array of ints 0 and greater, where the values represent region labels. For example,
array([[9, 9, 9, 0, 0, 0, 0, 1, 1, 1],
[9, 9, 9, 9, 0, 7, 1, 1, 1, 1],
[9, 9, 9, 9, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 2, 1, 1],
[4, 4, 4, 4, 0, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
I would like the indices equal to 0 (i.e. zero-regions) to take on the value most-common in their neighborhood. The operation would essentially close the zero-regions. I've tried multiple variations of dilation, erosion, grey-closing, and other morphology operations, but I cannot completely eliminate the zero-regions (without awkwardly blending the other regions). A decent approach could be to define a kernel that convolves only over the zeros, and sets the value with the most common label in the filter area. I'm unsure how to implement this though.
回答1:
One vectorized approach is proposed here. Steps are :
Get kernel sized 2D sliding windows, leading to 4D array. We can use
skimage's view_as_windows
to get those as view and thus avoid creating any extra memory for this.Select the windows which are centered at zeros by indexing into the 4D array. This forces a copy. But assuming number of zeros is a relatively smaller number than the total number of elements in input array, this should be okay.
For each of those selected windows, offset each window with a proper offset with the idea of using
np.bincount
to perform counting. Thus, usebincount
and get the max count excluding the zeros. The argmax for the max count should be our guy!
Here's the implementation covering those steps -
from skimage.util import view_as_windows as viewW
def fill_zero_regions(a, kernel_size=3):
hk = kernel_size//2 # half_kernel_size
a4D = viewW(a, (kernel_size,kernel_size))
sliced_a = a[hk:-hk,hk:-hk]
zeros_mask = sliced_a==0
zero_neighs = a4D[zeros_mask].reshape(-1,kernel_size**2)
n = len(zero_neighs) # num_zeros
scale = zero_neighs.max()+1
zno = zero_neighs + scale*np.arange(n)[:,None] # zero_neighs_offsetted
count = np.bincount(zno.ravel(), minlength=n*scale).reshape(n,-1)
modevals = count[:,1:].argmax(1)+1
sliced_a[zeros_mask] = modevals
return a
Sample run -
In [23]: a
Out[23]:
array([[9, 9, 9, 0, 0, 0, 0, 1, 1, 1],
[9, 9, 9, 9, 0, 7, 1, 1, 1, 1],
[9, 9, 9, 9, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 2, 1, 1],
[4, 4, 4, 4, 0, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
In [24]: fill_zero_regions(a)
Out[24]:
array([[9, 9, 9, 0, 0, 0, 0, 1, 1, 1],
[9, 9, 9, 9, 9, 7, 1, 1, 1, 1],
[9, 9, 9, 9, 2, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 2, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 2, 2, 2, 2, 1, 1],
[4, 4, 4, 4, 2, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 4, 2, 2, 2, 1, 0],
[4, 6, 6, 4, 4, 5, 5, 5, 5, 0],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
As seen, we are not solving for the boundary cases. If needed to do, use a zero-padded array as the input array, something like this : np.pad(a, (k//2,k//2), 'constant')
, with k
as the kernel size (=3
for the sample).
回答2:
A possible solution based on the convolve idea
from scipy import stats
ar = #Your np array
blank = np.zeros(ar.shape)
#Size to search in for mode values
window_size = 3
for x,y in np.array(np.where(ar == 0)).T:
window = ar[max(x-window_size,0):x+window_size,max(0,y-window_size):y+window_size]
oneD = window.flatten()
#fill blank array with modal value
blank[x,y] = stats.mode(oneD[oneD != 0])[0]
#fill in the zeros
print ar + blank
Im not sure if avoiding a loop is possible here
回答3:
Here's a working solution using Numba, which I haven't profiled but should be pretty fast:
import numba
@numba.njit
def nn(arr):
res = arr.copy()
zeros = np.where(arr == 0)
for n in range(len(zeros[0])):
i = zeros[0][n]
j = zeros[1][n]
left = max(i-1, 0)
right = min(i+2, arr.shape[1])
top = max(j-1, 0)
bottom = min(j+2, arr.shape[0])
area = arr[left:right,top:bottom].ravel()
counts = np.bincount(area[area != 0])
res[i,j] = np.argmax(counts)
return res
It produces:
array([[9, 9, 9, 9, 7, 1, 1, 1, 1, 1],
[9, 9, 9, 9, 9, 7, 1, 1, 1, 1],
[9, 9, 9, 9, 2, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 2, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 2, 2, 2, 2, 1, 1],
[4, 4, 4, 4, 2, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 4, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 4, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
The kernel size is 3x3 here, as defined by subtracting 1 and adding 2 to i
and j
(adding 2 because Python slices take one-past-the-end, e.g. [0:3] gives you 3 elements). Boundary conditions are handled by min
and max
.
Credit for the bincount idea: https://stackoverflow.com/a/6252400/4323
来源:https://stackoverflow.com/questions/46126409/numpy-filter-to-smooth-out-zero-regions