2-D convolution as a matrix-matrix multiplication

你的背包 2020-11-29 17:07

I know that, in the 1D case, the convolution between two vectors, a and b, can be computed as conv(a, b), but also as the product between T_a and b, where T_a is the corresponding Toeplitz matrix for a. Is it possible to extend this idea to 2D?
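
A minimal sketch of that 1-D identity, using scipy.linalg.toeplitz (the variable names here are just illustrative):

    import numpy as np
    from scipy.linalg import toeplitz

    a = np.array([1., 2., 3.])
    b = np.array([4., 5.])

    # T_a: first column is a padded with len(b)-1 zeros, first row is
    # [a[0], 0, ...]; shape is (len(a) + len(b) - 1, len(b)).
    T_a = toeplitz(np.r_[a, np.zeros(len(b) - 1)],
                   np.r_[a[0], np.zeros(len(b) - 1)])

    print(np.allclose(T_a @ b, np.convolve(a, b)))  # True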

4 Answers
  • 2020-11-29 17:50

    1- Define Input and Filter

    Let I be the input signal and F be the filter or kernel.

    2- Calculate the final output size

    If I is m1 x n1 and F is m2 x n2, the size of the (full convolution) output will be:

        (m1 + m2 - 1) x (n1 + n2 - 1)

    3- Zero-pad the filter matrix

    Zero pad the filter to make it the same size as the output.

    4- Create Toeplitz matrix for each row of the zero-padded filter

    5- Create a doubly blocked Toeplitz matrix

    Now all these small Toeplitz matrices should be arranged in a big doubly blocked Toeplitz matrix.

    6- Convert the input matrix to a column vector

    7- Multiply the doubly blocked Toeplitz matrix with the vectorized input signal

    This multiplication gives the convolution result.

    8- Last step: reshape the result to a matrix form

    For more details and Python code, take a look at my GitHub repository:

    Step by step explanation of 2D convolution implemented as matrix multiplication using Toeplitz matrices in Python
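
    In the same spirit, here is a compact sketch of steps 2-8 using scipy.linalg.toeplitz. The function name conv2d_via_toeplitz and the exact block layout are my own; the zero-padding of step 3 is absorbed into the first column of each block:

    import numpy as np
    from scipy.linalg import toeplitz


    def conv2d_via_toeplitz(I, F):
        m1, n1 = I.shape
        m2, n2 = F.shape
        out_r, out_c = m1 + m2 - 1, n1 + n2 - 1   # step 2

        # Step 4: one Toeplitz block per filter row. Block H[k] performs
        # the 1-D full convolution of F[k, :] with one row of the input.
        def block(f):
            col = np.r_[f, np.zeros(n1 - 1)]      # f padded to out_c
            row = np.r_[f[0], np.zeros(n1 - 1)]
            return toeplitz(col, row)             # shape (out_c, n1)

        H = [block(F[k]) for k in range(m2)]

        # Step 5: doubly blocked Toeplitz matrix; block (r, p) is H[r - p].
        W = np.zeros((out_r * out_c, m1 * n1))
        for r in range(out_r):
            for p in range(m1):
                if 0 <= r - p < m2:
                    W[r * out_c:(r + 1) * out_c, p * n1:(p + 1) * n1] = H[r - p]

        # Steps 6-8: vectorize the input, multiply, reshape the result.
        return (W @ I.reshape(-1)).reshape(out_r, out_c)

    You can check the result against scipy.signal.convolve2d(I, F) in full mode.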

  • 2020-11-29 17:53

    If you unravel k to an m**2 vector and unroll X, you would then get:

    • an m**2 vector k
    • a ((n-m)**2, m**2) matrix for unrolled_X

    where unrolled_X could be obtained by the following Python code:

    from numpy import zeros


    def unroll_matrix(X, m):
        flat_X = X.flatten()
        n = X.shape[0]
        unrolled_X = zeros(((n - m) ** 2, m ** 2))
        skipped = 0
        for i in range(n ** 2):
            # keep only the positions whose m x m block fits inside X
            if (i % n) < n - m and ((i // n) % n) < n - m:
                for j in range(m):
                    for l in range(m):
                        unrolled_X[i - skipped, j * m + l] = flat_X[i + j * n + l]
            else:
                skipped += 1
        return unrolled_X
    

    Unrolling X rather than k gives a more compact representation (smaller matrices) for each X, but you have to unroll every X. Depending on what you want to do, you may prefer unrolling k instead.

    Here, unrolled_X is not sparse, whereas unrolled_K would be sparse, but of size ((n-m+1)^2, n^2), as @Salvador Dali mentioned.

    Unrolling k could be done like this:

    from scipy.sparse import lil_matrix
    from numpy import zeros


    def unroll_kernel(kernel, n, sparse=True):
        m = kernel.shape[0]
        if sparse:
            unrolled_K = lil_matrix(((n - m) ** 2, n ** 2))
        else:
            unrolled_K = zeros(((n - m) ** 2, n ** 2))

        skipped = 0
        for i in range(n ** 2):
            # keep only the positions whose m x m block fits inside the input
            if (i % n) < n - m and ((i // n) % n) < n - m:
                for j in range(m):
                    for l in range(m):
                        unrolled_K[i - skipped, i + j * n + l] = kernel[j, l]
            else:
                skipped += 1
        return unrolled_K
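
    As a quick (hypothetical) sanity check, both unrollings give the same result on the positions they cover:

        import numpy as np

        X = np.random.rand(5, 5)                 # n = 5
        k = np.random.rand(3, 3)                 # m = 3
        y1 = unroll_matrix(X, 3) @ k.flatten()   # unroll X, flatten k
        y2 = unroll_kernel(k, 5) @ X.flatten()   # unroll k, flatten X
        print(np.allclose(y1, y2))               # True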
    
  • 2020-11-29 17:58

    The code shown above doesn't produce the unrolled matrix of the right dimensions. The dimensions should be ((n - k + 1) * (m - k + 1), k * k), where k is the filter dimension, n is the number of rows of the input matrix, and m is the number of columns.

    from numpy import zeros


    def unfold_matrix(X, k):
        n, m = X.shape[0:2]
        xx = zeros(((n - k + 1) * (m - k + 1), k**2))
        row_num = 0

        def make_row(x):
            return x.flatten()

        for i in range(n - k + 1):
            for j in range(m - k + 1):
                # collect a k*k block of elements and convert it to a row
                xx[row_num, :] = make_row(X[i:i + k, j:j + k])
                row_num = row_num + 1

        return xx
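
    A small (hypothetical) usage sketch; reshaping the product recovers the valid cross-correlation of X with the filter:

        import numpy as np

        X = np.random.rand(6, 5)
        f = np.random.rand(3, 3)
        out = (unfold_matrix(X, 3) @ f.flatten()).reshape(6 - 3 + 1, 5 - 3 + 1)
        # matches scipy.signal.correlate2d(X, f, mode='valid')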
    

    For more details, see my blog post:

    http://www.telesens.co/2018/04/09/initializing-weights-for-the-convolutional-and-fully-connected-layers/

  • 2020-11-29 18:05

    Yes, it is possible, and you should also use a doubly block circulant matrix (which is a special case of a Toeplitz matrix). I will give you an example with a small kernel and input, but it is possible to construct a Toeplitz matrix for any kernel. So you have a 2d input x and a 2d kernel k, and you want to calculate the convolution x * k. Let's assume that k is already flipped, that x is of size n×n, and that k is m×m.

    So you unroll k into a sparse matrix of size (n-m+1)^2 × n^2, and unroll x into a long vector of size n^2 × 1. You compute the multiplication of this sparse matrix with the vector and convert the resulting vector (which will have size (n-m+1)^2 × 1) into an (n-m+1) × (n-m+1) square matrix.

    I am pretty sure this is hard to understand just from reading, so here is an example for a 2×2 kernel and a 3×3 input.

    With the (already flipped) kernel and input

        k = |k11 k12|        x = |x11 x12 x13|
            |k21 k22|            |x21 x22 x23|
                                 |x31 x32 x33|

    the constructed matrix multiplied with the vectorized x is:

        |k11 k12  0  k21 k22  0   0   0   0 |   |x11|
        | 0  k11 k12  0  k21 k22  0   0   0 |   |x12|
        | 0   0   0  k11 k12  0  k21 k22  0 | · |x13|
        | 0   0   0   0  k11 k12  0  k21 k22|   |...|
                                                |x33|

    which is equal to the 4×1 vector (y11, y12, y21, y22)^T, reshaped into the 2×2 output.

    And this is the same result you would have got by doing a sliding window of k over x.
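
    A minimal numeric sketch of this construction (the values and names are just illustrative, and k is taken as already flipped):

        import numpy as np
        from scipy.signal import correlate2d

        x = np.arange(1., 10.).reshape(3, 3)
        k = np.array([[1., 2.], [3., 4.]])

        # Build the 4 x 9 unrolled kernel matrix row by row: row (r, s)
        # is a 3 x 3 zero matrix with k placed at offset (r, s), flattened.
        W = np.zeros((4, 9))
        for r in range(2):
            for s in range(2):
                row = np.zeros((3, 3))
                row[r:r + 2, s:s + 2] = k
                W[2 * r + s] = row.flatten()

        y = (W @ x.flatten()).reshape(2, 2)
        print(np.allclose(y, correlate2d(x, k, mode='valid')))  # True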
