Finding matching submatrices inside a matrix

慢半拍i 2021-02-14 20:19

I have a 100x200 2D array expressed as a numpy array consisting of black (0) and white (255) cells. It is a bitmap file. I then have 2D shapes (it's easiest to think of them as

2 answers
  • 2021-02-14 21:06

    Here is a method you may be able to use, or adapt, depending upon the details of your requirements. It uses ndimage.label and ndimage.find_objects:

    1. Label the image using ndimage.label. This finds all blobs in the array and labels them with integers.
    2. Get the slices of these blobs using ndimage.find_objects.
    3. Use set intersection to check whether the found blobs correspond to your wanted blobs.

    Code for 1. and 2.:

    import numpy as np
    from PIL import Image
    from scipy import ndimage
    import matplotlib.pyplot as plt
    
    # scipy.misc.imread was removed in SciPy 1.2, so load the file with Pillow.
    # convert('L') flattens to greyscale.
    im = np.array(Image.open('letters.png').convert('L'))
    # ndimage.label treats every nonzero pixel as foreground.
    objects, number_of_objects = ndimage.label(im)
    letters = ndimage.find_objects(objects)
    
    # to save the images for illustrative purposes only:
    plt.imsave('ob.png', objects)
    for i, j in enumerate(letters):
        plt.imsave('ob' + str(i) + '.png', objects[j])
    

    example input:

    [image: example input]

    labelled:

    [image: labelled blobs]

    isolated blobs to test against:

    [images: six isolated blobs]
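Step 3 has no code in the answer above. The following is a minimal sketch of one way to read "set intersection", assuming the wanted shapes must match pixel for pixel. Each blob is reduced to a hashable fingerprint and the found fingerprints are intersected with the wanted ones. The toy image, the `shape_key` helper and the shape names are all hypothetical.

```python
import numpy as np
from scipy import ndimage


def shape_key(mask):
    """Hashable fingerprint of a boolean mask: its shape plus its raw bytes."""
    mask = np.ascontiguousarray(mask, dtype=bool)
    return (mask.shape, mask.tobytes())


# Hypothetical toy image holding two blobs on a black background.
image = np.zeros((6, 8), dtype=np.uint8)
image[1:3, 1:3] = 255   # a 2x2 square
image[1:5, 5] = 255     # a 4x1 vertical bar

labels, count = ndimage.label(image)
slices = ndimage.find_objects(labels)

# Fingerprint each blob. Comparing against its own label (i + 1) drops pixels
# of any other blob that happens to fall inside the same bounding box.
found = {shape_key(labels[sl] == i + 1): sl for i, sl in enumerate(slices)}

# Hypothetical wanted shapes, cropped to their bounding boxes.
wanted = {
    shape_key(np.ones((2, 2), dtype=bool)): "square",
    shape_key(np.ones((4, 1), dtype=bool)): "bar",
    shape_key(np.ones((3, 3), dtype=bool)): "big square",
}

# Step 3: the set intersection of found and wanted fingerprints.
common = set(found) & set(wanted)
matches = {wanted[key]: found[key] for key in common}
print(matches)
```

Because `found` is keyed by fingerprint, two blobs with the same shape would overwrite each other. Storing a list of slices per key would keep every occurrence.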

  • 2021-02-14 21:16

    You can use correlate. You'll need to set your black values to -1 and your white values to 1 (or vice-versa) so that you know the value of the peak of the correlation, and that it only occurs with the correct letter.

    The following code does what I think you want.

    import numpy
    from scipy import signal
    
    # Set up the inputs
    a = numpy.random.randn(100, 200)
    a[a<0] = 0
    a[a>0] = 255
    
    b = numpy.random.randn(20, 20)
    b[b<0] = 0
    b[b>0] = 255
    
    # put b somewhere in a
    a[37:37+b.shape[0], 84:84+b.shape[1]] = b
    
    # Now the actual solution...
    
    # Set the black values to -1
    a[a==0] = -1
    b[b==0] = -1
    
    # and the white values to 1
    a[a==255] = 1
    b[b==255] = 1
    
    max_peak = numpy.prod(b.shape)
    
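    # c will contain max_peak where the overlap is perfect.
    # method='direct' keeps the sums exact, so the == test below is safe;
    # the default method='auto' may pick an FFT and introduce rounding error.
    c = signal.correlate(a, b, 'valid', method='direct')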
    
    overlaps = numpy.where(c == max_peak)
    
    print(overlaps)
    

    This outputs (array([37]), array([84])), the locations of the offsets set in the code.
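The reason for recoding the pixels as -1 and 1 can be checked on a tiny example. This is a sketch with a hypothetical 2x2 template and an all-white 3x3 image, not part of the original answer. It compares the correlation scores under a 0/1 encoding and under a -1/+1 encoding.

```python
import numpy as np
from scipy import signal

# Hypothetical 2x2 template (1 = white, 0 = black) and an all-white 3x3 image.
# The image does not contain the template anywhere.
template01 = np.array([[1, 0],
                       [0, 1]])
image01 = np.ones((3, 3), dtype=int)

# 0/1 encoding: the score only counts white-on-white hits, so an all-white
# patch scores 2, exactly what a true match of this template would score.
score01 = signal.correlate(image01, template01, mode='valid', method='direct')

# -1/+1 encoding: every mismatched pixel subtracts 1, so only an exact match
# can reach template.size (4 here).
template_pm = 2 * template01 - 1
image_pm = 2 * image01 - 1
score_pm = signal.correlate(image_pm, template_pm, mode='valid', method='direct')

print(score01)   # every entry is 2
print(score_pm)  # every entry is 0, well below template_pm.size
```

With 0/1 values the peak cannot tell a real match from a patch that is simply white wherever the template is white. With -1/+1 values the peak equals the number of template pixels only when every pixel agrees.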

    If your letter size multiplied by your big array size is larger than roughly N log(N), where N is the corresponding size of the big array in which you're searching (for each dimension), you will probably get a speed-up from an FFT-based algorithm such as scipy.signal.fftconvolve. Bear in mind that a convolution is a correlation with one input reversed, so you need to flip each axis of one of the datasets (flipud and fliplr). The FFT result also carries floating-point rounding error, so compare it to max_peak with a tolerance (for example numpy.isclose) rather than with ==. The main change is the assignment to c:

    c = signal.fftconvolve(a, numpy.fliplr(numpy.flipud(b)), 'valid')
    

    Comparing the timings on the sizes above:

    In [5]: timeit c = signal.fftconvolve(a, numpy.fliplr(numpy.flipud(b)), 'valid')
    100 loops, best of 3: 6.78 ms per loop
    
    In [6]: timeit c = signal.correlate(a, b, 'valid')
    10 loops, best of 3: 151 ms per loop
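As a sanity check on the FFT variant, the sketch below rebuilds the inputs already in -1/+1 form, confirms that the flipped convolution agrees with the direct correlation, and locates the peak with a tolerance. The fixed seed and the `rng`, `c_fft`, `c_direct` and `hits` names are assumptions added for illustration.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable

# -1/+1 image and template, with the template pasted at a known offset.
a = np.where(rng.standard_normal((100, 200)) < 0, -1.0, 1.0)
b = np.where(rng.standard_normal((20, 20)) < 0, -1.0, 1.0)
a[37:37 + b.shape[0], 84:84 + b.shape[1]] = b

max_peak = b.size

# Convolving with the template reversed along both axes equals correlating
# with the template itself. b[::-1, ::-1] is the same as fliplr(flipud(b)).
c_fft = signal.fftconvolve(a, b[::-1, ::-1], mode='valid')
c_direct = signal.correlate(a, b, mode='valid', method='direct')

# The two results agree only up to floating-point rounding.
print(np.allclose(c_fft, c_direct))

# Locate the peak with a tolerance instead of ==.
hits = np.argwhere(np.isclose(c_fft, max_peak))
print(hits)
```

The next-best possible score is `max_peak - 2` (a single mismatched pixel), so the default tolerance of `numpy.isclose` is far tighter than it needs to be to separate a perfect match from a near miss.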
    