I found some results difficult to understand when trying to debug my neural network. I tried to reproduce some of its computations offline using scipy (1.3.0), and I am not getting the results I expect.
I don't know for certain without reading the source code for these two libraries, but there is more than one straightforward way to write a convolution algorithm, and evidently these two libraries implement it in different ways.
One way is to "paint" the kernel onto the output, for each pixel of the image:
from itertools import product
def convolve_paint(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(img_w), range(img_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            out[y+dy][x+dx] += img[y][x] * ker[dy][dx]
    return out
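A quick way to see the "painting" (the function is repeated here so the snippet runs on its own): convolving a single-pixel impulse image with a kernel simply stamps a copy of the kernel into the output.

```python
from itertools import product

def convolve_paint(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(img_w), range(img_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            out[y+dy][x+dx] += img[y][x] * ker[dy][dx]
    return out

# A one-pixel "impulse" image: the output is just a copy of the kernel.
impulse = [[1]]
ker = [[1, 2],
       [3, 4]]
print(convolve_paint(impulse, ker))  # [[1, 2], [3, 4]]
```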
Another way is to "sum" the contributing amounts at each pixel in the output:
def convolve_sum(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(out_w), range(out_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            if 0 <= y-dy < img_h and 0 <= x-dx < img_w:
                out[y][x] += img[y-dy][x-dx] * ker[dy][dx]
    return out
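As a quick sanity check that the two routines agree (convolve_sum repeated so the snippet runs standalone), here is a small 2×2 example whose "full" 3×3 output you can also verify by hand; convolve_paint produces the same result.

```python
from itertools import product

def convolve_sum(img, ker):
    img_w, img_h = len(img[0]), len(img)
    ker_w, ker_h = len(ker[0]), len(ker)
    out_w, out_h = img_w + ker_w - 1, img_h + ker_h - 1
    out = [[0]*out_w for i in range(out_h)]
    for x, y in product(range(out_w), range(out_h)):
        for dx, dy in product(range(ker_w), range(ker_h)):
            if 0 <= y-dy < img_h and 0 <= x-dx < img_w:
                out[y][x] += img[y-dy][x-dx] * ker[dy][dx]
    return out

img = [[1, 2],
       [3, 4]]
ker = [[0, 1],
       [1, 0]]
print(convolve_sum(img, ker))  # [[0, 1, 2], [1, 5, 4], [3, 4, 0]]
```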
These two functions produce the same output. However, notice that the second one has y-dy and x-dx instead of y+dy and x+dx. If the second algorithm is written with + instead of -, as might seem natural, then the result is as if the kernel had been rotated by 180 degrees, which is exactly what you've observed.
It's unlikely that either library uses such a simple algorithm to do convolution. For larger images and kernels it's more efficient to use a Fourier transform, applying the convolution theorem. But the difference between the two libraries is likely caused by a sign convention like the one shown here: one computes true convolution, the other cross-correlation.
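For completeness, here is a minimal sketch of the FFT route, assuming NumPy is available (convolve_fft and its padding choice are my own illustration, not either library's actual implementation). Padding both FFTs out to the full output size gives linear rather than circular convolution.

```python
import numpy as np

def convolve_fft(img, ker):
    # Convolution theorem: multiply the zero-padded 2-D FFTs, then invert.
    # Padding to (H + kh - 1, W + kw - 1) yields the "full" linear convolution.
    img = np.asarray(img, dtype=float)
    ker = np.asarray(ker, dtype=float)
    sh = (img.shape[0] + ker.shape[0] - 1,
          img.shape[1] + ker.shape[1] - 1)
    return np.real(np.fft.ifft2(np.fft.fft2(img, sh) * np.fft.fft2(ker, sh)))

img = [[1, 2], [3, 4]]
ker = [[0, 1], [1, 0]]
# Agrees with the direct algorithms up to floating-point error:
# approximately [[0, 1, 2], [1, 5, 4], [3, 4, 0]]
print(np.round(convolve_fft(img, ker), 6))
```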