The negative effects that adding padding brings in a CNN.

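As shown in the figure above:

The first row is a plain 3x3 convolution with stride 1 and padding 0.

The second row is a plain 3x3 convolution with stride 1 and padding 1.

The third row is a dilated 3x3 convolution with dilation rate 3, stride 1, and padding 3.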
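To make the three configurations concrete, here is a minimal sketch that computes the output size along one dimension with the standard convolution size formula. The helper name `conv_output_size` is made up for illustration, and the 8x8 input is the size the text uses further down.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one dimension of a convolution."""
    # A dilated kernel covers (kernel - 1) * dilation + 1 input positions.
    effective_kernel = (kernel - 1) * dilation + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


configs = {
    "plain 3x3, padding 0": dict(padding=0),
    "plain 3x3, padding 1": dict(padding=1),
    "dilated 3x3, rate 3, padding 3": dict(padding=3, dilation=3),
}
for name, kwargs in configs.items():
    out = conv_output_size(8, 3, **kwargs)
    print(f"{name}: 8x8 -> {out}x{out}")
```

Only the first configuration shrinks the feature map; the other two keep it at 8x8, which is the reason padding is added in the first place.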

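The right-pointing arrows in the figure show how convolution is implemented at the low level in Caffe and Darknet, both of which are written in C or C++. I am not sure whether PyTorch and TensorFlow implement convolution the same way, but at present there are essentially three ways to implement it efficiently: im2col (shown in the figure above), Winograd, and FFT.

im2col is still the most common. Winograd convolution, which has been applied to CNNs in recent years, reduces the amount of computation through a special algebraic transformation, and it is currently used in NCNN, Tencent's deep learning framework for mobile devices. As for FFT, I have not yet seen it used in deep learning.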

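As for why im2col is used: it goes back to Jia Yangqing, who introduced it while writing Caffe. In an image, the pixels inside one patch are not stored at contiguous addresses in memory, so bringing a single patch into the cache may take several memory accesses, which is very inefficient. im2col was designed to copy the pixels needed for each computation into contiguous addresses ahead of time.

As a result, for a single image, im2col consumes a second block of memory on top of the memory already occupied by the original image.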
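The idea can be sketched in pure Python. This is an illustrative toy, not the Caffe or Darknet code: the function names are made up, the image is a nested list, and each patch is stored as a row rather than a column. It shows the two extra buffers the text talks about, the zero-padded copy and the unrolled patches.

```python
def im2col(image, kernel=3, stride=1, padding=0, dilation=1):
    """Unroll every receptive field of a 2D list into one contiguous row."""
    h, w = len(image), len(image[0])
    # Zero-padded copy of the input: extra memory that holds no real pixels.
    padded = [[0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for r in range(h):
        for c in range(w):
            padded[r + padding][c + padding] = image[r][c]

    span = (kernel - 1) * dilation + 1
    out_h = (h + 2 * padding - span) // stride + 1
    out_w = (w + 2 * padding - span) // stride + 1

    # One row per output pixel, holding the kernel*kernel inputs it needs.
    patches = []
    for i in range(out_h):
        for j in range(out_w):
            patches.append([
                padded[i * stride + ki * dilation][j * stride + kj * dilation]
                for ki in range(kernel)
                for kj in range(kernel)
            ])
    return patches, (out_h, out_w)


def conv2d_via_im2col(image, weights, **kwargs):
    """Convolution (cross-correlation) as a dot product per unrolled patch."""
    flat_w = [w for row in weights for w in row]
    patches, (out_h, out_w) = im2col(image, kernel=len(weights), **kwargs)
    flat_out = [sum(a * b for a, b in zip(p, flat_w)) for p in patches]
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]


image = [[r * 8 + c for c in range(8)] for r in range(8)]
box = [[1] * 3 for _ in range(3)]  # 3x3 kernel of ones

out = conv2d_via_im2col(image, box)
print(len(out), len(out[0]))  # 6 6
print(out[0][0])              # 81, the sum of the top-left 3x3 patch
```

Once the patches are laid out contiguously, the whole convolution reduces to a matrix product, which is why real implementations hand the unrolled buffer to an optimized GEMM routine.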

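As the figure above shows, for an 8x8 image (the computation counts 9 multiplications per output pixel):

Without padding, the computation is 9x36=324, the memory footprint is 8x8=64, and the effective memory ratio is 64/64=1.

With padding=1, the computation is 9x64=576, the memory footprint is 10x10=100, and the effective memory ratio is 64/100=0.64.

With dilation_rate=3 and padding=3, the computation is 9x64=576, the memory footprint is 14x14=196, and the effective memory ratio is 64/196≈0.33.

So in the figure above, adding padding=1 alone wastes 1-0.64=0.36 of the memory, and with dilation_rate=3 the waste is about 1-0.33=0.67.

Suppose we allocate 1 GB for storing the image and another 1 GB for the image after im2col. Once we use dilation_rate=3, the effective memory, meaning the memory occupied by real pixels, is only about 2x0.33, roughly 0.65 GB.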
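The arithmetic above can be checked with a short sketch. The helper `conv_cost` is a made-up name, and it follows the text's own accounting: computation is kernel area times the number of output pixels, and memory is the size of the padded input.

```python
def conv_cost(size, kernel=3, stride=1, padding=0, dilation=1):
    """Return (multiplications, padded cells, fraction of cells that are real pixels)."""
    span = (kernel - 1) * dilation + 1
    padded = size + 2 * padding
    out = (padded - span) // stride + 1
    multiplications = kernel * kernel * out * out
    useful = (size * size) / (padded * padded)
    return multiplications, padded * padded, useful


cases = [
    ("padding 0", dict(padding=0)),
    ("padding 1", dict(padding=1)),
    ("dilation 3, padding 3", dict(padding=3, dilation=3)),
]
for label, kwargs in cases:
    mults, cells, useful = conv_cost(8, **kwargs)
    print(f"{label}: computation={mults}, memory={cells}, useful={useful:.2f}")
```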

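The example above is for an image or feature map of size 8x8. Now suppose the image or feature map is 100x100.

With dilation_rate=3, the effective memory ratio is (100x100)/(106x106)≈0.89, which is still quite respectable.

Feature maps as small as 8x8 usually appear in the deep layers of a network, while 100x100 feature maps usually appear in the shallow layers. Choosing dilation sensibly for each layer therefore lets us use memory more efficiently.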
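A small sweep shows how quickly the overhead fades as the feature map grows. This is a sketch under the same assumption as the text (a 3x3 kernel with padding equal to the dilation rate, so the output keeps the input size); `useful_fraction` is a name made up for illustration.

```python
def useful_fraction(size, dilation, kernel=3):
    """Share of the padded feature map that holds real pixels."""
    padding = dilation * (kernel - 1) // 2  # padding that preserves the size
    padded = size + 2 * padding
    return (size * size) / (padded * padded)


for size in (8, 16, 32, 64, 100):
    print(f"{size}x{size}, dilation 3: useful fraction = {useful_fraction(size, 3):.2f}")
```

The fraction rises steadily with the feature map size, so large dilation rates cost relatively little memory on large feature maps and a great deal on small ones.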

Implement "same" padding for convolution operations

mimics TensorFlow SAME padding (I'm writing it down into the functional interface, so that nn.Conv2d can just call into F.conv2d_same_padding):

import torch.nn.functional as F


def _pair(x):
    # Accept either an int or a (rows, cols) pair, as F.conv2d does.
    return (x, x) if isinstance(x, int) else tuple(x)


def _same_padding(size, kernel, stride, dilation):
    # Total padding along one dimension under TensorFlow's SAME rule.
    out = (size + stride - 1) // stride
    return max(0, (out - 1) * stride + (kernel - 1) * dilation + 1 - size)


def conv2d_same_padding(input, weight, bias=None, stride=1, dilation=1, groups=1):
    stride, dilation = _pair(stride), _pair(dilation)
    padding_rows = _same_padding(input.size(2), weight.size(2), stride[0], dilation[0])
    padding_cols = _same_padding(input.size(3), weight.size(3), stride[1], dilation[1])
    rows_odd = padding_rows % 2 != 0
    cols_odd = padding_cols % 2 != 0

    # F.conv2d pads symmetrically, so when the total is odd the extra
    # row/column is added at the bottom/right first.
    if rows_odd or cols_odd:
        input = F.pad(input, [0, int(cols_odd), 0, int(rows_odd)])

    return F.conv2d(input, weight, bias, stride,
                    padding=(padding_rows // 2, padding_cols // 2),
                    dilation=dilation, groups=groups)

It was mostly copy-pasted from TensorFlow code in here and here.

“As you can see, there is a lot of hidden things going on there, and that's why it might not be worth it adding a padding='same'. And I think not replicating the SAME behavior in TensorFlow is not ideal either.”
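The hidden behavior mentioned in the quote is easiest to see in one dimension. The following pure-Python sketch applies the same padding formula as the code above and splits the total into a leading and a trailing part; the helper name `same_padding_1d` is made up, and putting the extra cell at the end is an assumption based on the code above.

```python
def same_padding_1d(size, kernel, stride=1, dilation=1):
    """Return (output size, padding before, padding after) for SAME padding."""
    out = (size + stride - 1) // stride  # ceil(size / stride)
    total = max(0, (out - 1) * stride + (kernel - 1) * dilation + 1 - size)
    before = total // 2
    after = total - before  # when total is odd, the extra cell goes at the end
    return out, before, after


for size, kernel, stride in [(8, 3, 1), (8, 3, 2), (7, 3, 2), (8, 2, 1)]:
    out, before, after = same_padding_1d(size, kernel, stride)
    print(f"size={size} kernel={kernel} stride={stride}: "
          f"out={out}, pad_before={before}, pad_after={after}")
```

With stride 2 on an input of size 8, or with an even kernel, the padding is asymmetric and depends on the input size, which a fixed symmetric `padding` argument cannot express.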

This article is based on:

Francisco Massa: Implement "same" padding for convolution operations?

Thank you!

Reposted from: https://www.cnblogs.com/wang2825/articles/8947634.html

Source: oschina

Link: https://my.oschina.net/u/4415723/blog/4752346
