What will be the output size if the input to a convolution layer of a neural network is an image of size 128x128x3 and 40 filters of size 5x5 are applied to it?
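Formula: n[i] = (n[i-1] − f[i] + 2p[i]) / s[i] + 1
where,
n[i-1] = 128 (input width or height)
f[i] = 5 (filter size)
p[i] = 0 (padding, assumed to be none)
s[i] = 1 (stride, assumed to be 1)
so,
n[i] = (128 − 5 + 0)/1 + 1 = 124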
So the size of the output layer is 124x124x40, where 40 is the number of filters.
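The calculation above can be sketched in a few lines of Python. This is only a minimal sketch: the function names `conv_output_size` and `conv_output_shape` are made up for illustration, and the floor division is an assumption for cases where the stride does not divide evenly (the formula above does not say how to round, but rounding down is the usual convention).

```python
def conv_output_size(n_in: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output size along one spatial dimension: (n_in - f + 2p) / s + 1.

    Floor division is an assumption for strides that do not divide evenly.
    """
    if s < 1:
        raise ValueError("stride must be at least 1")
    if n_in + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    return (n_in - f + 2 * p) // s + 1


def conv_output_shape(height: int, width: int, num_filters: int,
                      f: int, p: int = 0, s: int = 1) -> tuple:
    """(height, width, channels) of the output; channels = number of filters."""
    return (
        conv_output_size(height, f, p, s),
        conv_output_size(width, f, p, s),
        num_filters,
    )


# The case from the question: 128x128x3 input, 40 filters of size 5x5.
shape = conv_output_shape(128, 128, num_filters=40, f=5)
print(shape)  # (124, 124, 40)
```

Note that the input depth (3) does not appear anywhere: each filter covers all input channels, so only the number of filters decides the output depth.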
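With width = 124, height = 124, no. of filters = 40, stride = 1 and padding = 0, the output holds 124*124*40 = 615040 values. The input depth of 3 does not multiply the output size, because each 5x5 filter spans all 3 input channels and produces a single output channel.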
You can use this formula: [(W−K+2P)/S]+1, where W is the input size, K is the filter (kernel) size, P is the padding and S is the stride.
So, plugging the values into the formula for one dimension:
Output_Size = (128-5+0)/1+1 = 124
Output_Shape = (124,124,40)
NOTE: Stride defaults to 1 if not provided, and the 40 in (124, 124, 40) is the number of filters provided by the user.
Let me start simple: since you have square matrices for both input and filter, let me work out one dimension. Then you can apply the same to the other dimension(s). Imagine you are building fences between trees: if there are N trees, you have to build N-1 fences. Now apply that analogy to convolution layers.
Your output size will be: input size - filter size + 1
That is because, after its first position, the filter can only take (input size - filter size) more steps before it reaches the edge, just like the fences I mentioned.
Let's calculate your output with that idea: 128 - 5 + 1 = 124. The same goes for the other dimension. So now you have a 124 x 124 image.
That is for one filter.
If you apply this 40 times you will have another dimension: 124 x 124 x 40
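If you want to convince yourself of the counting argument, here is a small sketch that lists every position where the filter fits, rather than using the formula. It assumes stride 1 and no padding, and `valid_positions` is just a name I made up for this example.

```python
def valid_positions(n: int, f: int) -> list:
    """Start indices where a window of length f fits entirely inside n cells."""
    return [start for start in range(n) if start + f <= n]


positions = valid_positions(128, 5)

# The filter starts at index 0 and can take 123 more steps, so 124 positions.
print(len(positions))               # 124
print(positions[0], positions[-1])  # 0 123

# Same count along both dimensions, and one output channel per filter.
num_filters = 40
output_shape = (len(positions), len(positions), num_filters)
print(output_shape)                 # (124, 124, 40)
```

The number of positions matches input size - filter size + 1, which is the same thing the general formula gives when padding is 0 and stride is 1.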
Here is a great guide if you want to know more about advanced convolution arithmetic: https://arxiv.org/pdf/1603.07285.pdf
Best resource:
Dimension calculation
Parameters in CNN