Reshaping Keras layers

灰色年华 2021-02-08 02:43

I have a 416x416 input image. How can I create an output of shape 4 x 10, where 4 is the number of columns and 10 is the number of rows?

My label data is 2D array with 4 columns and

2 Answers
  •  误落风尘
    2021-02-08 03:07

    I believe the easiest way to make the shape of your predictions match the desired output is the solution proposed by @Darlyn. Assuming the network you have so far (which outputs tensors of shape (13, 13, 1024)) was declared like this:

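    from keras.layers import Input, Conv2D

    x = Input(shape=(416, 416, 3))
    # Conv2D requires a kernel size; 3 is an assumed value
    y = Conv2D(32, 3, activation='relu')(x)
    ...
    y = Conv2D(1024, 3, activation='relu')(y)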
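
    The (13, 13, 1024) shape is consistent with a backbone that halves the 416x416 input five times (416 / 32 = 13). The elided layers are not shown, so the five stride-2 steps with 'same' padding below are an assumption, and `downsampled_size` is a hypothetical helper, not part of Keras. A rough sketch of the arithmetic:

```python
import math


def downsampled_size(size, num_stride2_layers):
    """Hypothetical helper: spatial size after repeated stride-2, 'same'-padded layers."""
    for _ in range(num_stride2_layers):
        # With 'same' padding and stride 2, the output size is ceil(input / 2).
        size = math.ceil(size / 2)
    return size


# Assumption: five downsampling steps, i.e. an overall stride of 32.
side = downsampled_size(416, 5)
feature_shape = (side, side, 1024)

# Number of values in one feature map once it is flattened to a vector.
flattened_units = side * side * 1024

print(feature_shape, flattened_units)  # (13, 13, 1024) 173056
```

    The last number is the length of the vector that a Flatten layer would produce from that feature map.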
    

    You just need to add a regression layer that will try to predict the boxes, and then reshape these to (10, 4):

    import numpy as np
    from keras.layers import Flatten, Dense, Reshape
    from keras.models import Model
    
    samples = 1
    boxes = 10
    
    # Continue from the tensors x (input) and y (last conv output) defined above
    y = Flatten(name='flatten')(y)
    y = Dense(boxes * 4, activation='relu')(y)
    y = Reshape((boxes, 4), name='predictions')(y)
    model = Model(inputs=x, outputs=y)
    
    # A random batch, used only to check the output shape
    x_train = np.random.randn(samples, 416, 416, 3)
    
    p = model.predict(x_train)
    print(p.shape)
    

    (1, 10, 4)
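
    To make the row and column layout concrete, here is a small NumPy sketch of what reshaping 40 values to (10, 4) does, and of how an (n, 4) label array could be brought to the same shape. It assumes the reshape fills rows in order, as NumPy's default does. `pad_boxes` is a hypothetical helper, and zero-padding images with fewer than 10 boxes is an assumption, not something the question specifies:

```python
import numpy as np

boxes = 10  # same value as in the snippet above

# Flat values 0..3 become row 0, values 4..7 become row 1, and so on.
flat = np.arange(boxes * 4, dtype=np.float32)
grid = flat.reshape(boxes, 4)
print(grid.shape)  # (10, 4)
print(grid[1])     # [4. 5. 6. 7.]


def pad_boxes(labels, max_boxes=boxes):
    """Hypothetical helper: zero-pad (or truncate) an (n, 4) array to (max_boxes, 4)."""
    labels = np.asarray(labels, dtype=np.float32).reshape(-1, 4)[:max_boxes]
    padded = np.zeros((max_boxes, 4), dtype=np.float32)
    padded[:len(labels)] = labels
    return padded


# Two made-up boxes for one image; the remaining eight rows stay zero.
y_true = pad_boxes([[10, 20, 50, 80], [100, 120, 200, 260]])
print(y_true.shape)  # (10, 4)
```

    Each row of the (10, 4) output then corresponds to one box and each column to one of its four values, which matches a label array with 4 columns.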

    This works, but I'm not entirely sure that directly regressing these values will produce good results. Object-detection models usually rely on attention, region proposals, or saliency to determine the position of objects. There are a couple of Keras object-detection implementations you could try:

    keras-rcnn

    classes = ["dog", "cat", "hooman"]
    
    backbone = keras_rcnn.models.backbone.VGG16
    model = keras_rcnn.models.RCNN((416, 416, 3), classes, backbone)
    boxes, predictions = model.predict(x)
    

    keras-retinanet

    from keras_retinanet.models.resnet import resnet_retinanet
    
    x = Input(shape=(416, 416, 3))
    model = resnet_retinanet(len(classes), inputs=x)
    _, _, boxes, _ = model.predict_on_batch(inputs)
    
