I am learning about convolutional neural networks with TensorFlow.
I have some doubts regarding tf.nn.conv2d. One of its parameters is filter:
The filter argument to the tf.nn.conv2d function, as you quoted, is a 4D tensor of dimensions [filter_height, filter_width, in_channels, out_channels]. This tensor represents a stack of out_channels filters, each of size filter_height x filter_width and spanning all in_channels channels of the input image.
The parameters filter_height, filter_width and out_channels are chosen by you, whereas in_channels is determined by the input you pass to tf.nn.conv2d: it must match the number of channels in that input.
In other words, a filter tensor with dimensions [2, 2, 1, 5] represents 5 different 2 x 2 filters to be applied over a 1-channel input, but you could just as well change it to [2, 2, 1, 7], or whatever else gives you better results.
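As a rough sketch of how the filter shape drives the output shape, the snippet below applies a [2, 2, 1, 5] filter tensor to a single-channel input. The 4 x 4 image size, the stride of 1 and the "VALID" padding are my own assumptions for illustration, and it assumes a TensorFlow version that provides tf.random.normal.

```python
import tensorflow as tf

# Assumed input: a batch of one 4 x 4 image with 1 channel,
# laid out as [batch, height, width, in_channels].
image = tf.zeros([1, 4, 4, 1])

# Five 2 x 2 filters over one input channel:
# [filter_height, filter_width, in_channels, out_channels].
filters = tf.random.normal([2, 2, 1, 5])

# Stride 1 and no padding are assumptions made for this sketch.
output = tf.nn.conv2d(image, filters, strides=[1, 1, 1, 1], padding="VALID")

# The last dimension of the output equals out_channels.
output_shape = output.shape.as_list()
print(output_shape)  # [1, 3, 3, 5]
```

Changing the filter shape to [2, 2, 1, 7] would leave the spatial size alone and give 7 output channels instead of 5.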
To further illustrate, in the following gif you have a [3, 3, 1, 1] filter tensor convolving over a [1, 5, 5, 1] image. This means you have only 1 filter being convolved over the image.
GIF source
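If the gif does not load, here is a plain-Python sketch of what a single 3 x 3 filter does as it slides over a 5 x 5 image. It is not how TensorFlow implements the operation; slide_filter is a hypothetical helper, the pixel values and the all-ones kernel are made up, and it assumes stride 1 with no padding. The size-1 batch and channel dimensions are dropped to keep it readable.

```python
def slide_filter(image, kernel):
    """Slide one 2-D kernel over a 2-D image (stride 1, no padding)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the current window element-wise by the kernel and sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# 5 x 5 image holding 0..24, standing in for the [1, 5, 5, 1] tensor.
image = [[5 * r + c for c in range(5)] for r in range(5)]
# One 3 x 3 kernel of ones, standing in for the [3, 3, 1, 1] filter.
kernel = [[1] * 3 for _ in range(3)]

result = slide_filter(image, kernel)
print(result)  # [[54, 63, 72], [99, 108, 117], [144, 153, 162]]
```

With one filter the result is a single 3 x 3 map; with out_channels filters you would get that many maps stacked along the last dimension.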