I know what stride means when it is a single integer (the step by which the filter is moved across the image). But what about (1, 1), or a stride with even more dimensions?
The stride defines how the filter is moved along the input image (tensor). A tuple such as (1, 1) simply gives one step per spatial axis. Nothing stops you from striding along different axes differently, e.g., stride=[1, 2] means move 1px at a time along axis 0 and 2px at a time along axis 1. This particular combination isn't common, but it is possible.
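To make the per-axis stride concrete, here is a minimal pure-Python sketch of a strided 2D filter with no padding. The function name correlate2d_strided and the toy 5x6 input are made up for illustration; real libraries do the same thing far more efficiently.

```python
def correlate2d_strided(image, kernel, stride):
    """Slide `kernel` over `image` (lists of lists) with a per-axis stride, no padding."""
    stride_rows, stride_cols = stride
    k_rows, k_cols = len(kernel), len(kernel[0])
    # Number of filter positions that fit along each axis.
    out_rows = (len(image) - k_rows) // stride_rows + 1
    out_cols = (len(image[0]) - k_cols) // stride_cols + 1
    output = []
    for i in range(out_rows):
        top = i * stride_rows  # step along axis 0
        row = []
        for j in range(out_cols):
            left = j * stride_cols  # step along axis 1
            total = sum(
                image[top + a][left + b] * kernel[a][b]
                for a in range(k_rows)
                for b in range(k_cols)
            )
            row.append(total)
        output.append(row)
    return output


image = [[r * 6 + c for c in range(6)] for r in range(5)]  # 5x6 input
kernel = [[1] * 3 for _ in range(3)]  # 3x3 box filter (sums each window)

result = correlate2d_strided(image, kernel, stride=(1, 2))
print(len(result), len(result[0]))  # 3 2
```

With stride=(1, 2) the filter visits every row position but only every second column position, so the 5x6 input shrinks to 3x2 rather than the 3x4 you would get with stride=(1, 1).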
The TensorFlow API goes even further and allows custom striding for all axes of the 4D input tensor (see tf.nn.conv2d). Using this API, it's not uncommon to set strides=[1, 2, 2, 1], which makes perfect sense: it should process each image (the first 1) and each input channel (the last 1), but apply 2x2 striding to the spatial dimensions. As far as the convolution is concerned, the operation is applicable for any strides array; however, not all values are equally useful.
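The effect of strides=[1, 2, 2, 1] on the output shape can be sketched as plain arithmetic. This is not TensorFlow code: conv2d_output_shape is a hypothetical helper, and it assumes a (batch, height, width, channels) input layout, no padding, and the common case where the batch and channel strides are both 1.

```python
def conv2d_output_shape(input_shape, filter_shape, strides):
    """Output shape of an unpadded convolution over a 4D input.

    input_shape:  (batch, height, width, in_channels)       # assumed layout
    filter_shape: (filter_height, filter_width, in_channels, out_channels)
    strides:      one step per input axis, e.g. [1, 2, 2, 1]
    """
    batch, height, width, in_channels = input_shape
    f_height, f_width, f_in, f_out = filter_shape
    if in_channels != f_in:
        raise ValueError("filter in_channels must match the input")
    s_batch, s_height, s_width, s_channel = strides
    # Simplification of this sketch: only the spatial strides may differ from 1.
    if s_batch != 1 or s_channel != 1:
        raise ValueError("this sketch only handles strides[0] == strides[3] == 1")
    out_height = (height - f_height) // s_height + 1
    out_width = (width - f_width) // s_width + 1
    # Every image is kept; each filter produces one output channel.
    return (batch, out_height, out_width, f_out)


shape = conv2d_output_shape((8, 28, 28, 3), (3, 3, 3, 16), [1, 2, 2, 1])
print(shape)  # (8, 13, 13, 16)
```

The batch size is unchanged because the first stride is 1, while the 2x2 spatial striding roughly halves the height and width.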
I highly recommend the CS231n tutorial for more detail on this.