I found some results difficult to understand when trying to debug my neural network. I tried to do some computations offline using scipy (1.3.0), and I am not having the same results.
I don't know for certain without reading the source code for these two libraries, but there is more than one straightforward way to write a convolution algorithm, and evidently these two libraries implement it in different ways.
One way is to "paint" the kernel onto the output, for each pixel of the image:
from itertools import product

def convolve_paint(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    # the "full" output is larger than the image by the kernel size minus one
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(img_w), range(img_h)):
        # add a copy of the kernel, scaled by this image pixel, onto the output
        for dx, dy in product(range(ker_w), range(ker_h)):
            out[y+dy][x+dx] += img[y][x] * ker[dy][dx]
    return out
Another way is to "sum" the contributing amounts at each pixel in the output:
def convolve_sum(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(out_w), range(out_h)):
        # sum every (image pixel, kernel weight) pair that lands on this
        # output pixel, skipping positions that fall outside the image
        for dx, dy in product(range(ker_w), range(ker_h)):
            if 0 <= y-dy < img_h and 0 <= x-dx < img_w:
                out[y][x] += img[y-dy][x-dx] * ker[dy][dx]
    return out
These two functions produce the same output. However, notice that the second one uses y-dy and x-dx instead of y+dy and x+dx. If the second algorithm is written with + instead of -, as might seem natural, then the result is as if the kernel had been rotated by 180 degrees, which is what you've observed.
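As a quick sanity check (not part of the original code; it assumes numpy and scipy are installed, and uses a small made-up image and kernel), both toy functions agree with each other and with scipy's convolve2d in 'full' mode:

import numpy as np
from scipy.signal import convolve2d

# small made-up inputs, just to exercise the two functions above
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ker = [[1, 0],
       [0, -1]]

paint = convolve_paint(img, ker)
summed = convolve_sum(img, ker)
full = convolve2d(np.array(img), np.array(ker), mode='full')

print(paint == summed)                        # True
print(np.array_equal(np.array(paint), full))  # True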
It's unlikely that either library uses such a simple algorithm to do convolution. For larger images and kernels it's more efficient to use a Fourier transform, applying the convolution theorem. But the difference between the two libraries is likely to be caused by something similar to this.
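To illustrate what that means (this is only a sketch of the convolution theorem, not how either library is actually implemented), the full convolution can be computed by zero-padding both arrays to the output size, multiplying their 2-D Fourier transforms, and transforming back:

import numpy as np

def convolve_fft(img, ker):
    img = np.asarray(img, dtype=float)
    ker = np.asarray(ker, dtype=float)
    # shape of the "full" convolution output
    out_shape = (img.shape[0] + ker.shape[0] - 1,
                 img.shape[1] + ker.shape[1] - 1)
    # fft2 zero-pads each input to out_shape, so the circular convolution
    # computed in the frequency domain equals the linear convolution
    spectrum = np.fft.fft2(img, s=out_shape) * np.fft.fft2(ker, s=out_shape)
    return np.fft.ifft2(spectrum).real

scipy packages the same idea as scipy.signal.fftconvolve; its output matches convolve_paint and convolve_sum up to floating-point rounding.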
What is usually called convolution in neural networks (and image processing) is not exactly the mathematical concept of convolution, which is what convolve2d implements, but the closely related operation of correlation, which is implemented by correlate2d:
res_scipy = correlate2d(image, kernel.T, mode='same')
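The relationship between the two operations is exactly the 180-degree rotation described above. A minimal check (using a made-up image and 3x3 kernel standing in for your actual arrays):

import numpy as np
from scipy.signal import convolve2d, correlate2d

image = np.arange(25, dtype=float).reshape(5, 5)   # placeholder image
kernel = np.array([[1., 0., -1.],
                   [2., 0., -2.],
                   [1., 0., -1.]])                  # placeholder kernel

# correlating with the kernel == convolving with it rotated by 180 degrees
rotated = kernel[::-1, ::-1]
print(np.allclose(correlate2d(image, kernel, mode='same'),
                  convolve2d(image, rotated, mode='same')))   # True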