Different 2D convolution results between keras and scipy

栀梦 2021-01-22 21:04

I found some results difficult to understand when trying to debug my neural network. I tried to do some computations offline using scipy (1.3.0), and I am not havin

2 Answers
  • 2021-01-22 21:22

    I don't know for certain without reading the source code for these two libraries, but there is more than one straightforward way to write a convolution algorithm, and evidently these two libraries implement it in different ways.

    One way is to "paint" the kernel onto the output, for each pixel of the image:

    from itertools import product
    
    def convolve_paint(img, ker):
        # img and ker are lists of rows; the result is the "full" convolution.
        img_w, img_h = len(img[0]), len(img)
        ker_w, ker_h = len(ker[0]), len(ker)
        out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
        out = [[0] * out_w for _ in range(out_h)]
        # Each input pixel stamps a scaled copy of the kernel onto the output.
        for x, y in product(range(img_w), range(img_h)):
            for dx, dy in product(range(ker_w), range(ker_h)):
                out[y + dy][x + dx] += img[y][x] * ker[dy][dx]
        return out
    

    Another way is to "sum" the contributing amounts at each pixel in the output:

    def convolve_sum(img, ker):
        img_w, img_h = len(img[0]), len(img)
        ker_w, ker_h = len(ker[0]), len(ker)
        out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
        out = [[0] * out_w for _ in range(out_h)]
        # Each output pixel gathers every (input pixel, kernel entry) pair that lands on it.
        for x, y in product(range(out_w), range(out_h)):
            for dx, dy in product(range(ker_w), range(ker_h)):
                # Skip kernel entries that would read outside the image.
                if 0 <= y - dy < img_h and 0 <= x - dx < img_w:
                    out[y][x] += img[y - dy][x - dx] * ker[dy][dx]
        return out
    

    These two functions produce the same output. However, notice that the second one uses y-dy and x-dx instead of y+dy and x+dx. If the second algorithm is written with + instead of -, as might seem natural, it computes a cross-correlation rather than a convolution: the result looks as if the kernel had been rotated by 180 degrees, which matches what you've observed.
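
    To make the rotation concrete, here is a minimal sketch in the same pure-Python style. convolve_full repeats the "sum" logic with y-dy and x-dx, while correlate_full is a hypothetical "+" variant. Its indices are shifted by the kernel size minus one so that both outputs cover the same full extent; that shift is an assumption added for this comparison and is not something either library is known to do. rot180 is also a helper introduced only for illustration.

```python
from itertools import product

def rot180(ker):
    """Rotate a 2D list by 180 degrees: reverse the rows, then each row."""
    return [row[::-1] for row in ker[::-1]]

def convolve_full(img, ker):
    """Full 2D convolution using the y-dy / x-dx indexing."""
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0] * out_w for _ in range(out_h)]
    for x, y in product(range(out_w), range(out_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            if 0 <= y - dy < img_h and 0 <= x - dx < img_w:
                out[y][x] += img[y - dy][x - dx] * ker[dy][dx]
    return out

def correlate_full(img, ker):
    """The '+' variant (cross-correlation), shifted to the same full extent.

    The offset (ker size - 1) is an assumption made here so the two
    outputs line up element by element.
    """
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0] * out_w for _ in range(out_h)]
    for x, y in product(range(out_w), range(out_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            sy, sx = y + dy - (ker_h - 1), x + dx - (ker_w - 1)
            if 0 <= sy < img_h and 0 <= sx < img_w:
                out[y][x] += img[sy][sx] * ker[dy][dx]
    return out

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ker = [[1, 0], [0, 2]]  # deliberately asymmetric under a 180-degree rotation

# Correlating with ker matches convolving with the rotated kernel...
print(correlate_full(img, ker) == convolve_full(img, rot180(ker)))  # True
# ...but not convolving with ker itself.
print(correlate_full(img, ker) == convolve_full(img, ker))          # False
```

    With a kernel that is symmetric under a 180-degree rotation the two operations coincide, which is why the discrepancy only shows up for asymmetric kernels.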

    It's unlikely that either library uses such a simple algorithm to do convolution. For larger images and kernels it's more efficient to use a Fourier transform, applying the convolution theorem. But the difference between the two libraries is likely to be caused by something similar to this.
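
    As a rough illustration of the convolution theorem, the sketch below zero-pads both arrays to the full output size, multiplies their spectra, and transforms back. It assumes NumPy is available; convolve_fft and convolve_direct are hypothetical names, and nothing here is meant to describe how either library actually implements convolution.

```python
import numpy as np

def convolve_fft(img, ker):
    """Full 2D convolution via the convolution theorem (real-valued inputs)."""
    img = np.asarray(img, dtype=float)
    ker = np.asarray(ker, dtype=float)
    # Pad to the full output size so the circular convolution that the
    # FFT computes does not wrap around.
    out_shape = (img.shape[0] + ker.shape[0] - 1,
                 img.shape[1] + ker.shape[1] - 1)
    spectrum = np.fft.rfft2(img, s=out_shape) * np.fft.rfft2(ker, s=out_shape)
    return np.fft.irfft2(spectrum, s=out_shape)

def convolve_direct(img, ker):
    """Reference: the 'paint' method, adding one shifted image copy per kernel entry."""
    img = np.asarray(img, dtype=float)
    ker = np.asarray(ker, dtype=float)
    img_h, img_w = img.shape
    ker_h, ker_w = ker.shape
    out = np.zeros((img_h + ker_h - 1, img_w + ker_w - 1))
    for dy in range(ker_h):
        for dx in range(ker_w):
            out[dy:dy + img_h, dx:dx + img_w] += ker[dy, dx] * img
    return out

img = np.arange(12, dtype=float).reshape(3, 4)
ker = np.array([[1.0, 2.0], [3.0, 4.0]])
# The two results agree up to floating-point rounding.
print(np.allclose(convolve_fft(img, ker), convolve_direct(img, ker)))
```

    Because the FFT route works in floating point, its output matches the direct sum only up to rounding, so compare with a tolerance rather than exact equality.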

  • 2021-01-22 21:25

    What is usually called convolution in neural networks (and image processing) is not exactly the mathematical convolution that convolve2d implements. It is the closely related cross-correlation, which scipy implements as correlate2d:

    from scipy.signal import correlate2d

    res_scipy = correlate2d(image, kernel.T, mode='same')
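
    The relationship between the two scipy functions can be checked on a small case. This is a sketch with made-up image and kernel arrays (the arrays from the question are not shown here, so the kernel.T in the line above is left out); it assumes NumPy and SciPy are installed and uses a kernel with odd side lengths so that mode='same' centres both operations identically.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

# Illustrative data only, not the arrays from the question.
image = np.array([[1, 2, 0, 1],
                  [0, 3, 1, 2],
                  [4, 0, 2, 1],
                  [1, 1, 0, 3]])
kernel = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

conv = convolve2d(image, kernel, mode='same')
corr = correlate2d(image, kernel, mode='same')
# Rotating the kernel by 180 degrees turns correlation into convolution.
corr_flipped = correlate2d(image, kernel[::-1, ::-1], mode='same')

print(np.array_equal(conv, corr))          # False: the kernel is asymmetric
print(np.array_equal(conv, corr_flipped))  # True
```

    So if a network layer computes a cross-correlation, reproducing it with scipy means either calling correlate2d directly or passing a 180-degree-rotated kernel to convolve2d.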
    