Efficiently calculating boundary-adapted neighbourhood average


Question


I have an image with values ranging from 0 to 1. What I'd like to do is simple averaging.
More specifically, for a cell at the border of the image I'd like to compute the average over only that part of the neighbourhood/kernel that lies within the extent of the image. In effect this boils down to adapting the denominator of the 'mean formula', i.e. the number of pixels you divide the sum by.
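
In formula form (my own notation, with Ω the extent of the image and N(p) the kernel neighbourhood of pixel p):

    avg(p) = ( Σ_{q ∈ N(p) ∩ Ω} I(q) ) / |N(p) ∩ Ω|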

I managed to do this as shown below with scipy.ndimage.generic_filter, but this is far from time-efficient.

def fnc(buffer, count):
    # n = number of cells that lie within the image (padded cells hold 2.0)
    n = float(numpy.sum(buffer < 2.0))
    # correct the sum by removing the contribution of the (count - n) padded cells
    total = numpy.sum(buffer) - ((count - n) * 2.0)
    return total / n

avg = scipy.ndimage.generic_filter(image, fnc, footprint = kernel, \
                                   mode = 'constant', cval = 2.0,   \
                                   extra_keywords = {'count': countkernel})

Details

  • kernel = square array containing a circle represented by ones (a construction sketch follows this list)
  • Padding is done with 2's rather than zeroes, since otherwise I could not distinguish zeroes of the padded area from zeroes of the actual raster
  • countkernel = number of ones in the kernel
  • n = number of cells that lie within the image, obtained by excluding the padded cells identified by the value 2
  • The sum is corrected by subtracting (number of padded cells * 2.0) from the original neighbourhood total sum
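
A minimal sketch (my own illustration, not part of the original post) of how such a kernel and countkernel could be built; the radius of 3 is an arbitrary example:

    import numpy as np

    radius = 3  # example radius, purely illustrative
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    # square array with a circle of ones inside
    kernel = ((xx ** 2 + yy ** 2) <= radius ** 2).astype(float)
    # number of ones in the kernel
    countkernel = int(kernel.sum())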

Update(s)

1) Padding with NaNs increases the calculation time by about 30%:

    def fnc(buffer):
        return numpy.nansum(buffer) / numpy.sum(~numpy.isnan(buffer))

    avg = scipy.ndimage.generic_filter(image, fnc, footprint = kernel, \
                                       mode = 'constant', cval = float(numpy.nan))

2) Applying the solution proposed by Yves Daoust (accepted answer) definitely reduces the processing time to a minimum:

    def fnc(buffer):
        return numpy.sum(buffer)

    # neighbourhood sum of the zero-padded image
    sumbigimage = scipy.ndimage.generic_filter(image, fnc, \
                                               footprint = kernel, \
                                               mode = 'constant', \
                                               cval = 0.0)
    # neighbourhood count of in-image cells (mask = 1 inside the image, 0 outside)
    summask     = scipy.ndimage.generic_filter(mask, fnc, \
                                               footprint = kernel, \
                                               mode = 'constant', \
                                               cval = 0.0)
    avg = sumbigimage / summask

3) Building on Yves' tip to use an additional binary image, which is in effect applying a mask, I stumbled upon the principle of masked arrays. Only one array then has to be processed, because a masked array 'blends' the image and mask arrays together.
A small detail about the mask array: instead of filling the inner part (extent of the original image) with 1's and the outer part (border) with 0's as in the previous update, you must do the reverse: in a masked array a 1 means 'invalid' and a 0 means 'valid'.
This code is even 50% faster than the code supplied in update 2):

    maskedimg = numpy.ma.masked_array(imgarray, mask = maskarray)

    def fnc(buffer):
        return numpy.mean(buffer)

    avg = scipy.ndimage.generic_filter(maskedimg, fnc, footprint = kernel, \
                                       mode = 'constant', cval = 0.0)

--> I must correct myself here!
I must have made a mistake during validation: after some more calculation runs it appears that the scipy.ndimage filters cannot handle masked arrays, in the sense that the mask is not taken into account during the filter operation.
Some other people mentioned this too, like here and here.


The power of an image...

  • grey: extent of image to be processed
  • white: padded area (in my case filled with 2.0's)
  • red shades: extent of kernel
    • dark red: effective neighbourhood
    • light red: part of neighbourhood to be ignored


How can this rather pragmatic piece of code be changed to improve the performance of the calculation?

Many thanks in advance!


Answer 1:


Unsure if this will help, as I am not proficient in scipy: use an auxiliary image of 1's in the gray area and 0's in the white area (0's too in the source image). Then apply the filter to both images with a simple sum.

There is some hope of a speedup if scipy provides a specialized version of the filter with a built-in function for summing.

With that done, you will need to divide the two filtered images pixel by pixel.
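
A self-contained sketch of this approach; the example image, the circular kernel and the choice of scipy.ndimage.convolve as the built-in summing filter are my assumptions, not part of the original answer:

    import numpy as np
    from scipy import ndimage

    # example data: an image with values in [0, 1] and a circular kernel of ones
    image = np.random.rand(100, 100)
    yy, xx = np.mgrid[-3:4, -3:4]
    kernel = ((xx ** 2 + yy ** 2) <= 9).astype(float)

    # auxiliary image: 1's inside the image extent; the constant zero padding
    # supplies the 0's outside, so border cells add nothing to either sum
    ones = np.ones_like(image)

    sum_image    = ndimage.convolve(image, kernel, mode='constant', cval=0.0)
    count_inside = ndimage.convolve(ones, kernel, mode='constant', cval=0.0)

    # pixel-by-pixel division gives the boundary-adapted neighbourhood average
    avg = sum_image / count_inside

Because convolve runs in compiled code rather than calling back into Python for every pixel, this avoids the main cost of generic_filter.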




Answer 2:


I'm not sure how efficient this is, but I'm using a simpler formulation with nan's that handles both borders and masks.

No mask case:

avg = scipy.ndimage.generic_filter(image, np.nanmean, mode='constant', cval=np.nan, footprint=kernel)

Mask case:

masked_image = np.where(mask, image, np.nan)
avg = scipy.ndimage.generic_filter(masked_image, np.nanmean, mode='constant', cval=np.nan, footprint=kernel)

You can use any of the numpy nan functions.
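
For completeness, a small self-contained example of the mask case (the 5x5 box footprint and the random mask are placeholders):

    import numpy as np
    from scipy import ndimage

    image = np.random.rand(50, 50)
    mask = np.random.rand(50, 50) > 0.1        # True where a pixel is valid
    footprint = np.ones((5, 5), dtype=bool)

    # masked and padded cells become NaN and are ignored by np.nanmean
    masked_image = np.where(mask, image, np.nan)
    avg = ndimage.generic_filter(masked_image, np.nanmean,
                                 mode='constant', cval=np.nan,
                                 footprint=footprint)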



Source: https://stackoverflow.com/questions/10683596/efficiently-calculating-boundary-adapted-neighbourhood-average
