CNN Structure
The CNN was built with reference to the figure below:
The figure is taken from the paper:
T. Klamt and S. Behnke, “Towards Learning Abstract Representations for Locomotion Planning in High-dimensional State Spaces,” 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 2019, pp. 922-928.
doi: 10.1109/ICRA.2019.8794144
TensorFlow code:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
# Build the network as a plain stack of layers.
model = models.Sequential()
# Earlier variants of the first layer, kept for reference:
# model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
# model.add(layers.Conv2D(input_shape=(x_train.shape[1], x_train.shape[2], x_train.shape[3]),
#                         filters=32, kernel_size=(3, 3), strides=(1, 1), padding='valid',
#                         activation='relu'))
# (1) CONV layer: f*f = 3*3, padding = 1 ('same'), stride = 1
model.add(layers.Conv2D(input_shape=(72, 72, 1),
                        filters=3, kernel_size=(3, 3), strides=1, padding='same',
                        activation='relu'))
# (2) CONV layer: f*f = 7*7, padding = 3 = (f-1)/2 ('same'), stride = 1
model.add(layers.Conv2D(filters=5, kernel_size=(7, 7), strides=(1, 1), padding='same',
                        activation='relu'))
# (3) CONV layer: f*f = 14*14, padding = 0 ('valid'), stride = 1
model.add(layers.Conv2D(filters=28, kernel_size=(14, 14), strides=(1, 1), padding='valid',
                        activation='relu'))
# (4) CONV layer: f*f = 4*4, padding = 0, stride = 1, followed by max pooling with f = 2, s = 2
model.add(layers.Conv2D(filters=31, kernel_size=(4, 4), strides=(1, 1), padding='valid',
                        activation='relu'))
model.add(layers.MaxPool2D((2, 2)))  # strides default to the pool size
# (5) CONV layer: f*f = 3*3, padding = 0, stride = 1, followed by max pooling with f = 2, s = 2
model.add(layers.Conv2D(filters=34, kernel_size=(3, 3), strides=(1, 1), padding='valid',
                        activation='relu'))
model.add(layers.MaxPool2D((2, 2)))
# (6) CONV layer: f*f = 3*3, padding = 0, stride = 1
model.add(layers.Conv2D(filters=36, kernel_size=(3, 3), strides=(1, 1), padding='valid',
                        activation='relu'))
# (7) CONV layer: f*f = 3*3, padding = 0, stride = 1
model.add(layers.Conv2D(filters=38, kernel_size=(3, 3), strides=(1, 1), padding='valid',
                        activation='relu'))
# (8) CONV layer: f*f = 3*3, padding = 0, stride = 1
model.add(layers.Conv2D(filters=40, kernel_size=(3, 3), strides=(1, 1), padding='valid',
                        activation='relu'))
# Dense layers
# Dense layers take 1D vectors as input, while the output of the last
# convolutional layer is a 3D tensor of shape (7, 7, 40).
# Flatten (unroll) it to a vector of length 1960 first, then add the Dense layers on top.
# The final Dense layer has 20 outputs and a softmax activation.
model.add(layers.Flatten())
model.add(layers.Dense(500, activation='relu'))    # fully connected hidden layer with 500 units
model.add(layers.Dense(20, activation='softmax'))  # 20: the number of outputs

model.summary()
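The output sizes that `model.summary()` reports can be checked by hand. The sketch below is a plain-Python trace and not part of the original script; the helper names `conv_out` and `pool_out` are hypothetical. It assumes stride 1 for every convolution, that `'same'` padding keeps the spatial size at stride 1, that `'valid'` padding gives n - f + 1, and that 2*2 max pooling with stride 2 halves the size (rounding down).

```python
# Hand trace of the spatial size through the network above.
# conv_out and pool_out are illustrative helpers, not Keras functions.

def conv_out(n, f, padding):
    """Output size of a stride-1 convolution on an n*n input with an f*f kernel."""
    if padding == 'same':
        return n            # zero padding keeps the size at stride 1
    if padding == 'valid':
        return n - f + 1    # no padding: the kernel must fit inside the input
    raise ValueError("padding must be 'same' or 'valid'")


def pool_out(n, pool=2):
    """Output size of max pooling whose window and stride both equal `pool`."""
    return n // pool


# (kind, kernel or pool size, padding, filters), in the order the layers are added.
layers_spec = [
    ('conv', 3, 'same', 3),
    ('conv', 7, 'same', 5),
    ('conv', 14, 'valid', 28),
    ('conv', 4, 'valid', 31),
    ('pool', 2, None, None),
    ('conv', 3, 'valid', 34),
    ('pool', 2, None, None),
    ('conv', 3, 'valid', 36),
    ('conv', 3, 'valid', 38),
    ('conv', 3, 'valid', 40),
]

size, channels = 72, 1   # input_shape=(72, 72, 1)
sizes = []
for kind, f, padding, filters in layers_spec:
    if kind == 'conv':
        size, channels = conv_out(size, f, padding), filters
    else:
        size = pool_out(size, f)   # pooling leaves the channel count unchanged
    sizes.append(size)
    print(kind, (size, size, channels))

flattened = size * size * channels
print('flattened length:', flattened)   # 7 * 7 * 40 = 1960
```

The last convolution leaves a 7*7*40 tensor, so `Flatten` feeds a vector of length 1960 into the first Dense layer.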
For the argument format, see the file tensorflow_core/python/keras/layers/convolutional.py, which documents Conv2D as follows:
@keras_export('keras.layers.Conv2D', 'keras.layers.Convolution2D')
class Conv2D(Conv):
  """2D convolution layer (e.g. spatial convolution over images).

  This layer creates a convolution kernel that is convolved
  with the layer input to produce a tensor of
  outputs. If `use_bias` is True,
  a bias vector is created and added to the outputs. Finally, if
  `activation` is not `None`, it is applied to the outputs as well.

  When using this layer as the first layer in a model,
  provide the keyword argument `input_shape`
  (tuple of integers, does not include the sample axis),
  e.g. `input_shape=(128, 128, 3)` for 128x128 RGB pictures
  in `data_format="channels_last"`.

  Arguments:
    filters: Integer, the dimensionality of the output space
      (i.e. the number of output filters in the convolution).
    kernel_size: An integer or tuple/list of 2 integers, specifying the
      height and width of the 2D convolution window.
      Can be a single integer to specify the same value for
      all spatial dimensions.
    strides: An integer or tuple/list of 2 integers,
      specifying the strides of the convolution along the height and width.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Specifying any stride value != 1 is incompatible with specifying
      any `dilation_rate` value != 1.
    padding: one of `"valid"` or `"same"` (case-insensitive).
    data_format: A string,
      one of `channels_last` (default) or `channels_first`.
      The ordering of the dimensions in the inputs.
      `channels_last` corresponds to inputs with shape
      `(batch, height, width, channels)` while `channels_first`
      corresponds to inputs with shape
      `(batch, channels, height, width)`.
      It defaults to the `image_data_format` value found in your
      Keras config file at `~/.keras/keras.json`.
      If you never set it, then it will be "channels_last".
    dilation_rate: an integer or tuple/list of 2 integers, specifying
      the dilation rate to use for dilated convolution.
      Can be a single integer to specify the same value for
      all spatial dimensions.
      Currently, specifying any `dilation_rate` value != 1 is
      incompatible with specifying any stride value != 1.
    activation: Activation function to use.
      If you don't specify anything, no activation is applied
      (ie. "linear" activation: `a(x) = x`).
    use_bias: Boolean, whether the layer uses a bias vector.
    kernel_initializer: Initializer for the `kernel` weights matrix.
    bias_initializer: Initializer for the bias vector.
    kernel_regularizer: Regularizer function applied to
      the `kernel` weights matrix.
    bias_regularizer: Regularizer function applied to the bias vector.
    activity_regularizer: Regularizer function applied to
      the output of the layer (its "activation")..
    kernel_constraint: Constraint function applied to the kernel matrix.
    bias_constraint: Constraint function applied to the bias vector.

  Input shape:
    4D tensor with shape:
    `(samples, channels, rows, cols)` if data_format='channels_first'
    or 4D tensor with shape:
    `(samples, rows, cols, channels)` if data_format='channels_last'.

  Output shape:
    4D tensor with shape:
    `(samples, filters, new_rows, new_cols)` if data_format='channels_first'
    or 4D tensor with shape:
    `(samples, new_rows, new_cols, filters)` if data_format='channels_last'.
    `rows` and `cols` values might have changed due to padding.
  """

  def __init__(self,
               filters,
               kernel_size,
               strides=(1, 1),
               padding='valid',
               data_format=None,
               dilation_rate=(1, 1),
               activation=None,
               use_bias=True,
               kernel_initializer='glorot_uniform',
               bias_initializer='zeros',
               kernel_regularizer=None,
               bias_regularizer=None,
               activity_regularizer=None,
               kernel_constraint=None,
               bias_constraint=None,
               **kwargs):
    super(Conv2D, self).__init__(
        rank=2,
        filters=filters,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_format=data_format,
        dilation_rate=dilation_rate,
        activation=activations.get(activation),
        use_bias=use_bias,
        kernel_initializer=initializers.get(kernel_initializer),
        bias_initializer=initializers.get(bias_initializer),
        kernel_regularizer=regularizers.get(kernel_regularizer),
        bias_regularizer=regularizers.get(bias_regularizer),
        activity_regularizer=regularizers.get(activity_regularizer),
        kernel_constraint=constraints.get(kernel_constraint),
        bias_constraint=constraints.get(bias_constraint),
        **kwargs)
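As a rough illustration of the "Input shape" and "Output shape" rules in the docstring, the sketch below computes the shape a Conv2D layer would produce, in plain Python. `conv2d_output_shape`, `_pair` and `_output_length` are hypothetical helpers written for this post, not part of the TensorFlow API. The arithmetic assumes the usual rules: a dilated kernel covers `(k - 1) * d + 1` input positions, `'same'` padding yields `ceil(n / s)`, and `'valid'` padding yields `ceil((n - k_eff + 1) / s)`. The stride/dilation check simply mirrors the restriction stated in the docstring.

```python
def _pair(value):
    """Accept an int or a sequence of 2 ints, as kernel_size and strides do."""
    if isinstance(value, int):
        return (value, value)
    value = tuple(value)
    if len(value) != 2:
        raise ValueError('expected an int or a tuple/list of 2 ints')
    return value


def _output_length(n, k, stride, padding, dilation):
    """Length of one spatial axis after the convolution."""
    k_eff = (k - 1) * dilation + 1          # extent of the dilated kernel
    padding = padding.lower()               # the docstring says case-insensitive
    if padding == 'same':
        length = n
    elif padding == 'valid':
        length = n - k_eff + 1
    else:
        raise ValueError("padding must be 'valid' or 'same'")
    return (length + stride - 1) // stride  # ceiling division by the stride


def conv2d_output_shape(input_shape, filters, kernel_size, strides=(1, 1),
                        padding='valid', data_format='channels_last',
                        dilation_rate=(1, 1)):
    """input_shape includes the sample axis, e.g. (None, 72, 72, 1)."""
    kernel_size = _pair(kernel_size)
    strides = _pair(strides)
    dilation_rate = _pair(dilation_rate)
    if any(s != 1 for s in strides) and any(d != 1 for d in dilation_rate):
        raise ValueError('stride != 1 is incompatible with dilation_rate != 1')

    if data_format == 'channels_last':
        samples, rows, cols, _ = input_shape
    elif data_format == 'channels_first':
        samples, _, rows, cols = input_shape
    else:
        raise ValueError("data_format must be 'channels_last' or 'channels_first'")

    new_rows = _output_length(rows, kernel_size[0], strides[0], padding, dilation_rate[0])
    new_cols = _output_length(cols, kernel_size[1], strides[1], padding, dilation_rate[1])

    if data_format == 'channels_last':
        return (samples, new_rows, new_cols, filters)
    return (samples, filters, new_rows, new_cols)


# Layers (1) and (3) of the model above:
print(conv2d_output_shape((None, 72, 72, 1), 3, 3, padding='same'))   # (None, 72, 72, 3)
print(conv2d_output_shape((None, 72, 72, 5), 28, (14, 14)))           # (None, 59, 59, 28)
# The same 14*14 convolution with the channel axis first:
print(conv2d_output_shape((None, 5, 72, 72), 28, 14,
                          data_format='channels_first'))              # (None, 28, 59, 59)
```

Passing `kernel_size=14` and `kernel_size=(14, 14)` gives the same result, which is the "single integer" shorthand the docstring describes.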
Output
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 72, 72, 3)         30
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 5)         740
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 59, 59, 28)        27468
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 56, 56, 31)        13919
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 28, 28, 31)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 26, 26, 34)        9520
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 34)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 11, 11, 36)        11052
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 9, 9, 38)          12350
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 7, 7, 40)          13720
_________________________________________________________________
flatten (Flatten)            (None, 1960)              0
_________________________________________________________________
dense (Dense)                (None, 500)               980500
_________________________________________________________________
dense_1 (Dense)              (None, 20)                10020
=================================================================
Total params: 1,079,319
Trainable params: 1,079,319
Non-trainable params: 0
_________________________________________________________________
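The Param # column can be reproduced by hand. The sketch below is plain Python with hypothetical helper names (`conv_params`, `dense_params`); it assumes every layer uses a bias (the Keras default `use_bias=True`), so a Conv2D layer has `(f * f * in_channels + 1) * filters` parameters and a Dense layer has `(inputs + 1) * units`.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square kernel*kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (inputs + 1) * units


# (kernel size, filters) for the eight convolutions, in order.
conv_specs = [(3, 3), (7, 5), (14, 28), (4, 31), (3, 34), (3, 36), (3, 38), (3, 40)]

in_channels = 1          # the input is a single-channel 72*72 image
counts = []
for kernel, filters in conv_specs:
    counts.append(conv_params(kernel, in_channels, filters))
    in_channels = filters    # the next layer sees this layer's filters as channels

counts.append(dense_params(7 * 7 * 40, 500))   # Flatten yields 1960 inputs
counts.append(dense_params(500, 20))

total = sum(counts)
print(counts)   # [30, 740, 27468, 13919, 9520, 11052, 12350, 13720, 980500, 10020]
print(total)    # 1079319
```

Pooling and Flatten layers have no trainable parameters, which is why they show 0 in the summary; the first Dense layer alone accounts for 980,500 of the 1,079,319 parameters.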
Source: CSDN
Author: 包子味月饼66
Link: https://blog.csdn.net/weixin_41135864/article/details/104253368