Keras Sequential model input layer

执笔经年 2020-11-27 07:25

When creating a Sequential model in Keras, I understand you provide the input shape in the first layer. Does this input shape then make an implicit input layer?

2 Answers
  • 2020-11-27 08:06

    It depends on your perspective :-)

    Rewriting your code in line with more recent Keras tutorial examples, you would probably use:

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))
    

    ...which makes it much more explicit that you only have 2 Keras layers. And this is exactly what you do have (in Keras, at least) because the "input layer" is not really a (Keras) layer at all: it's only a place to store a tensor, so it may as well be a tensor itself.

    Each Keras layer is a transformation that outputs a tensor, possibly of a different size/shape to the input. So while there are 3 identifiable tensors here (input, outputs of the two layers), there are only 2 transformations involved corresponding to the 2 Keras layers.
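    You can check this count directly (a minimal sketch, assuming TensorFlow's bundled Keras): `model.layers` lists only the transformations, so the input tensor never appears in it.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# Only the two Dense transformations are Keras layers;
# the input is just a tensor and is not listed here.
n_layers = len(model.layers)  # 2
```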

    On the other hand, graphically, you might represent this network with 3 (graphical) layers of nodes, and two sets of lines connecting the layers of nodes. Graphically, it's a 3-layer network. But "layers" in this graphical notation are bunches of circles that sit on a page doing nothing, whereas layers in Keras transform tensors and do actual work for you. Personally, I would get used to the Keras perspective :-)

    Note finally that for fun and/or simplicity, I substituted input_dim=784 for input_shape=(784,) to avoid the syntax that Python uses to both confuse newcomers and create a 1-D tuple: (<value>,).
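    The tuple pitfall mentioned above is plain Python, and easy to see in isolation:

```python
# A trailing comma, not the parentheses, is what makes a 1-D tuple.
shape_tuple = (784,)   # a tuple of length 1 -- what input_shape expects
just_an_int = (784)    # parentheses alone do nothing: this is the int 784

print(type(shape_tuple))  # <class 'tuple'>
print(type(just_an_int))  # <class 'int'>
```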

  • 2020-11-27 08:14

    Well, yes, it is indeed an implicit input layer, i.e. your model is an example of a "good old" neural net with three layers - input, hidden, and output. This is more explicitly visible in the Keras Functional API (check the example in the docs), in which your model would be written as:

    from keras.models import Model
    from keras.layers import Input, Dense

    inputs = Input(shape=(784,))                 # input layer
    x = Dense(32, activation='relu')(inputs)     # hidden layer
    outputs = Dense(10, activation='softmax')(x) # output layer

    model = Model(inputs, outputs)
    

    Actually, this implicit input layer is the reason why you have to include an input_shape argument only in the first (explicit) layer of the model in the Sequential API - in subsequent layers, the input shape is inferred from the output of the previous ones (see the comments in the source code of core.py).
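    You can watch this inference happen (a small sketch, assuming TensorFlow's bundled Keras) by inspecting the weight matrices Keras builds: each kernel's first dimension is the layer's input size, whether given explicitly or inferred.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))  # input size given explicitly
model.add(Dense(10, activation='softmax'))              # input size inferred: 32

# Kernel (weight matrix) shapes reflect the explicit and inferred input sizes.
first_kernel = model.layers[0].kernel    # shape (784, 32)
second_kernel = model.layers[1].kernel   # shape (32, 10)
```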

    You may also find the documentation on tf.contrib.keras.layers.Input enlightening.
