I know that, in the 1D case, the convolution between two vectors, a and b, can be computed as conv(a, b), but also as the product between T_a and b, where T_a is the corresponding Toeplitz matrix for a. Is it possible to extend this idea to 2D?
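For reference, a minimal sketch of the 1D trick (the zero-padding convention used to build T_a below is just one way to set it up):

import numpy as np
from scipy.linalg import toeplitz

a = np.array([1., 2., 3.])
b = np.array([4., 5.])

# T_a: first column is a padded with zeros, first row is (a[0], 0, ...),
# giving a (len(a)+len(b)-1) x len(b) matrix whose product with b is conv(a, b)
T_a = toeplitz(np.r_[a, np.zeros(len(b) - 1)],
               np.r_[a[0], np.zeros(len(b) - 1)])
assert np.allclose(T_a @ b, np.convolve(a, b))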
Let I be the input signal and F be the filter or kernel.
If I is m1 x n1 and F is m2 x n2, the size of the output will be (m1 + m2 - 1) x (n1 + n2 - 1).
Zero-pad the filter to make it the same size as the output, and build a small Toeplitz matrix from each row of the zero-padded filter.
Now all these small Toeplitz matrices should be arranged in a big doubly blocked Toeplitz matrix.
Multiplying this doubly blocked Toeplitz matrix by the vectorized input gives the convolution result.
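To make the recipe concrete, here is a minimal sketch of that construction (the function name conv2d_toeplitz, the row-major flattening, and the use of scipy.linalg.toeplitz to build the small blocks are choices of this sketch, not prescribed by the steps above):

import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import convolve2d

def conv2d_toeplitz(F, input_shape):
    # Builds M such that M @ I.flatten() equals
    # convolve2d(I, F, mode='full').flatten() (row-major flattening).
    m1, n1 = input_shape
    m2, n2 = F.shape
    out_h, out_w = m1 + m2 - 1, n1 + n2 - 1

    # One small Toeplitz block per filter row: block r maps a row of I
    # to its 1D full convolution with row r of F.
    blocks = [toeplitz(np.r_[F[r], np.zeros(out_w - n2)],
                       np.r_[F[r, 0], np.zeros(n1 - 1)])
              for r in range(m2)]

    # Doubly blocked arrangement: block (p, a) of M is blocks[p - a],
    # and a zero block wherever p - a falls outside the filter rows.
    zero = np.zeros((out_w, n1))
    return np.block([[blocks[p - a] if 0 <= p - a < m2 else zero
                      for a in range(m1)] for p in range(out_h)])

# quick check against scipy
I = np.arange(6.0).reshape(2, 3)
F = np.array([[1.0, -1.0], [2.0, 0.5]])
M = conv2d_toeplitz(F, I.shape)
assert np.allclose(M @ I.flatten(),
                   convolve2d(I, F, mode='full').flatten())

Each block row of M holds the same set of small Toeplitz blocks shifted one block to the right, which is exactly the doubly blocked structure described above.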
For more details and Python code, take a look at my GitHub repository:
Step by step explanation of 2D convolution implemented as matrix multiplication using Toeplitz matrices in Python
If you unravel k into an m**2 vector and unroll X, you would then get an m**2 vector k and a ((n-m)**2, m**2) matrix for unrolled_X, where unrolled_X could be obtained by the following Python code:
from numpy import zeros

def unroll_matrix(X, m):
    # Each row of the result holds one m x m window of X, flattened, so that
    # unrolled_X @ k.flatten() computes all the window products at once.
    flat_X = X.flatten()
    n = X.shape[0]
    unrolled_X = zeros(((n - m) ** 2, m ** 2))
    skipped = 0
    for i in range(n ** 2):
        # keep positions whose top-left corner (i // n, i % n) is in [0, n - m)
        if (i % n) < n - m and (i // n) < n - m:
            for j in range(m):
                for l in range(m):
                    unrolled_X[i - skipped, j * m + l] = flat_X[i + j * n + l]
        else:
            skipped += 1
    return unrolled_X
Unrolling X rather than k gives a more compact representation (smaller matrices), but you need to unroll each X separately. Depending on what you want to do, you may prefer to unroll k instead.
Here, unrolled_X is not sparse, whereas unrolled_K would be sparse, but of size ((n-m+1)^2, n^2), as @Salvador Dali mentioned.
Unrolling k could be done like this:
from scipy.sparse import lil_matrix
from numpy import zeros

def unroll_kernel(kernel, n, sparse=True):
    # Each row holds one shifted copy of the kernel, so that
    # unrolled_K @ x.flatten() computes all the window products at once.
    m = kernel.shape[0]
    if sparse:
        unrolled_K = lil_matrix(((n - m) ** 2, n ** 2))
    else:
        unrolled_K = zeros(((n - m) ** 2, n ** 2))
    skipped = 0
    for i in range(n ** 2):
        # keep positions whose top-left corner (i // n, i % n) is in [0, n - m)
        if (i % n) < n - m and (i // n) < n - m:
            for j in range(m):
                for l in range(m):
                    unrolled_K[i - skipped, i + j * n + l] = kernel[j, l]
        else:
            skipped += 1
    return unrolled_K
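As a quick sanity check (with arbitrary test values of my own, and the two functions defined above in scope), both unrollings compute the same (n - m)**2 window products, just with the roles of kernel and input swapped:

import numpy as np

n, m = 5, 3
X = np.arange(n * n, dtype=float).reshape(n, n)
kernel = np.arange(m * m, dtype=float).reshape(m, m)

# unrolled_X is dense and small; unrolled_K is sparse and wide
out_X = unroll_matrix(X, m) @ kernel.flatten()
out_K = unroll_kernel(kernel, n) @ X.flatten()
assert np.allclose(out_X, out_K)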
The code shown above doesn't produce the unrolled matrix of the right dimensions. The dimensions should be ((n - k + 1) * (m - k + 1), k * k), where k is the filter dimension and n and m are the number of rows and columns of the input matrix.
from numpy import zeros

def unfold_matrix(X, k):
    n, m = X.shape[0:2]
    xx = zeros(((n - k + 1) * (m - k + 1), k ** 2))
    row_num = 0
    def make_row(x):
        return x.flatten()
    for i in range(n - k + 1):
        for j in range(m - k + 1):
            # collect a block of k x k elements and convert it to a row
            xx[row_num, :] = make_row(X[i:i + k, j:j + k])
            row_num = row_num + 1
    return xx
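A minimal check of this version (the test values are mine; scipy.signal.correlate2d is the reference since no kernel flip is involved here):

import numpy as np
from scipy.signal import correlate2d

X = np.arange(16.0).reshape(4, 4)
k = np.array([[1.0, 0.0], [0.0, -1.0]])

# each row of unfold_matrix(X, 2) is one 2x2 window of X, so the product
# with the flattened kernel is the valid cross-correlation
out = unfold_matrix(X, 2) @ k.flatten()
assert np.allclose(out, correlate2d(X, k, mode='valid').flatten())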
For more details, see my blog post:
http://www.telesens.co/2018/04/09/initializing-weights-for-the-convolutional-and-fully-connected-layers/
Yes, it is possible, and you should also use a doubly blocked circulant matrix (which is a special case of a Toeplitz matrix). I will give you an example with a small kernel and input, but it is possible to construct a Toeplitz matrix for any kernel. So you have a 2D input x and a 2D kernel k, and you want to calculate the convolution x * k. Also let's assume that k is already flipped. Let's also assume that x is of size n×n and k is m×m.
So you unroll k into a sparse matrix of size (n-m+1)^2 × n^2, and unroll x into a long vector of size n^2 × 1. You compute a multiplication of this sparse matrix with the vector and reshape the resulting vector (which will have size (n-m+1)^2 × 1) into an (n-m+1) × (n-m+1) square matrix.
I am pretty sure this is hard to understand just from reading, so here is an example for a 2×2 kernel and a 3×3 input:

    x = [x11 x12 x13]        k = [k11 k12]
        [x21 x22 x23]            [k21 k22]
        [x31 x32 x33]

Here is the constructed matrix with a vector:

    [k11 k12  0  k21 k22  0   0   0   0 ]   [x11]
    [ 0  k11 k12  0  k21 k22  0   0   0 ]   [x12]
    [ 0   0   0  k11 k12  0  k21 k22  0 ] * [x13]
    [ 0   0   0   0  k11 k12  0  k21 k22]   [x21]
                                            [x22]
                                            [x23]
                                            [x31]
                                            [x32]
                                            [x33]

which is equal to

    [k11*x11 + k12*x12 + k21*x21 + k22*x22]
    [k11*x12 + k12*x13 + k21*x22 + k22*x23]
    [k11*x21 + k12*x22 + k21*x31 + k22*x32]
    [k11*x22 + k12*x23 + k21*x32 + k22*x33]
And this is the same result you would have got by doing a sliding window of k over x.
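To check the example numerically, here is a sketch with concrete values of my own choosing (correlate2d is the reference because k is assumed to be pre-flipped, so the sliding window involves no further flip):

import numpy as np
from scipy.signal import correlate2d

x = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
k = np.array([[10., 20.], [30., 40.]])   # assumed already flipped

# the doubly blocked matrix from the example, written out explicitly
K = np.array([[10, 20,  0, 30, 40,  0,  0,  0,  0],
              [ 0, 10, 20,  0, 30, 40,  0,  0,  0],
              [ 0,  0,  0, 10, 20,  0, 30, 40,  0],
              [ 0,  0,  0,  0, 10, 20,  0, 30, 40]])

result = (K @ x.flatten()).reshape(2, 2)   # (n-m+1) x (n-m+1) output
assert np.allclose(result, correlate2d(x, k, mode='valid'))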