Reshaping Keras layers

灰色年华 2021-02-08 02:43

I have a 416x416 input image. How can I create an output of shape 4 x 10, where 4 is the number of columns and 10 is the number of rows?

My label data is a 2D array with 4 columns and

2 answers
  • 2021-02-08 03:00

    First flatten the (None, 13, 13, 1024) layer:

    model.add(Flatten())

    This turns the 3D feature map into a 1-dimensional tensor of 13*13*1024 = 173056 values per sample.

    Then add a dense layer with 4*10 = 40 outputs:

    model.add(Dense(4*10))

    Finally, reshape to the shape you need (10 rows, 4 columns). Note that Reshape takes the target shape as a single tuple:

    model.add(Reshape((10, 4)))

    This will work, but it will completely destroy the spatial structure of your data.
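
    To make the shape arithmetic above concrete, here is a small plain-Python sketch (no Keras needed). It assumes the last convolutional layer really outputs (13, 13, 1024) per sample; the variable names are only illustrative. It also counts the weights the Dense layer would need, which shows how large this layer becomes:

```python
# Shape bookkeeping for Flatten -> Dense -> Reshape, done by hand.
# Assumption: the last conv layer outputs (13, 13, 1024) per sample.
from math import prod

feature_map_shape = (13, 13, 1024)
boxes, coords = 10, 4  # 10 rows, 4 columns

flat_size = prod(feature_map_shape)  # units after Flatten
dense_units = boxes * coords         # units after Dense
# A Dense layer has one weight per (input, output) pair plus one bias per output.
dense_params = flat_size * dense_units + dense_units

print(flat_size)     # 173056
print(dense_units)   # 40
print(dense_params)  # 6922280
```

    So the single Dense layer alone would hold roughly 6.9 million parameters.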

  • 2021-02-08 03:07

    I believe the easiest way to make the shape of your predictions match the desired output is the solution proposed by @Darlyn. Assuming the network you have so far (which outputs tensors of shape (13, 13, 1024)) was declared like this:

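    from keras.layers import Input, Conv2D

    x = Input(shape=(416, 416, 3))
    y = Conv2D(32, 3, activation='relu')(x)  # kernel size 3 is an assumption
    ...
    y = Conv2D(1024, 3, activation='relu')(y)  # kernel size 3 is an assumption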
    

    You just need to add a regression layer that will try to predict the boxes, and then reshape these to (10, 4):

    import numpy as np
    from keras.models import Model
    from keras.layers import Flatten, Dense, Reshape

    samples = 1
    boxes = 10

    # y is the (13, 13, 1024) conv output defined above
    y = Flatten(name='flatten')(y)
    y = Dense(boxes * 4, activation='relu')(y)  # 4 values per box
    y = Reshape((boxes, 4), name='predictions')(y)
    model = Model(inputs=x, outputs=y)

    # random data standing in for real images
    x_train = np.random.randn(samples, 416, 416, 3)

    p = model.predict(x_train)
    print(p.shape)
    

    (1, 10, 4)
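
    It may help to see how the 40 Dense outputs end up in the (10, 4) array. Reshape fills the target shape in row-major order, so each consecutive group of 4 values becomes one row, i.e. one box. A minimal plain-Python sketch of that layout (`to_rows` and `flat` are hypothetical names used only for illustration, not part of Keras):

```python
# Pure-Python mimic of reshaping 40 values to (10, 4) in row-major order.
def to_rows(flat, rows, cols):
    """Split a flat list into `rows` lists of `cols` items, row by row."""
    if len(flat) != rows * cols:
        raise ValueError("size mismatch: cannot reshape")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

flat = list(range(40))  # stand-in for the 40 Dense outputs
pred = to_rows(flat, rows=10, cols=4)

print(len(pred), len(pred[0]))  # 10 4
print(pred[0])                  # [0, 1, 2, 3]
print(pred[9])                  # [36, 37, 38, 39]
```

    The labels therefore need the same layout: one array of shape (10, 4) per image, with the 4 columns in the same order in every row.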

    This works, but I'm not entirely sure that directly regressing these values will produce good results. Object-detection models usually rely on attention, region proposals, or saliency to determine the position of objects. There are a couple of Keras object-detection implementations you could try:

    keras-rcnn

    classes = ["dog", "cat", "hooman"]
    
    backbone = keras_rcnn.models.backbone.VGG16
    model = keras_rcnn.models.RCNN((416, 416, 3), classes, backbone)
    boxes, predictions = model.predict(x)  # x: a batch of images, shape (N, 416, 416, 3)
    

    keras-retinanet

    from keras_retinanet.models.resnet import resnet_retinanet
    
    x = Input(shape=(416, 416, 3))
    model = resnet_retinanet(len(classes), inputs=x)
    _, _, boxes, _ = model.predict_on_batch(inputs)  # inputs: a batch of images, shape (N, 416, 416, 3)
    