Reshaping Keras layers


灰色年华 2021-02-08 02:43

I have a 416x416 input image. How can I create an output of shape 4 x 10, where 4 is the number of columns and 10 is the number of rows?

My label data is a 2D array with 4 columns and

2 answers

误落风尘 (original poster)

2021-02-08 03:07

I believe the easiest way to make the shape of your predictions match the desired output is the solution proposed by @Darlyn. Assume the network you have so far, which outputs tensors of shape (13, 13, 1024), was declared like this:

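from keras.layers import Input, Conv2D
from keras.models import Model

x = Input(shape=(416, 416, 3))
# Conv2D requires a kernel size; (3, 3) is an assumed placeholder here.
y = Conv2D(32, (3, 3), activation='relu')(x)
...
y = Conv2D(1024, (3, 3), activation='relu')(y)
model = Model(inputs=x, outputs=y)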

You just need to add a regression layer that will try to predict the boxes, and then reshape these to (10, 4):

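import numpy as np
from keras.layers import Flatten, Dense, Reshape
from keras.models import Model

samples = 1
boxes = 10

# model.output is the single output tensor; model.outputs is a list of tensors.
y = Flatten(name='flatten')(model.output)
# One unit per box coordinate: 10 boxes * 4 values = 40 outputs.
y = Dense(boxes * 4, activation='relu')(y)
y = Reshape((boxes, 4), name='predictions')(y)
model = Model(inputs=model.inputs, outputs=y)

# Random data standing in for a batch of real images.
x_train = np.random.randn(samples, 416, 416, 3)

p = model.predict(x_train)
print(p.shape)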

(1, 10, 4)
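
To make the shape bookkeeping of the regression head above explicit, it can be sketched in plain Python without Keras. This is only an illustration: the helper names `flatten_shape` and `dense_params` are hypothetical, and the (13, 13, 1024) feature-map shape is the one assumed earlier in this answer.

```python
from math import prod

# Shapes taken from the answer; the batch dimension is omitted.
feature_map = (13, 13, 1024)   # assumed output of the last Conv2D
boxes = 10                     # rows in the label array
coords = 4                     # columns in the label array

def flatten_shape(shape):
    """Shape after Flatten: all non-batch dimensions collapsed into one."""
    return (prod(shape),)

def dense_params(in_units, out_units):
    """Trainable parameters in a Dense layer: weights plus biases."""
    return in_units * out_units + out_units

flat = flatten_shape(feature_map)
head_units = boxes * coords

# Reshape only works when the element count is preserved.
assert head_units == prod((boxes, coords))

print(flat)                               # (173056,)
print(dense_params(flat[0], head_units))  # 6922280
print((boxes, coords))                    # (10, 4)
```

Under these assumptions the Dense layer alone holds about 6.9 million parameters, so the regression head is not a small addition to the network.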

This works, but I'm not entirely sure that directly regressing these values will produce good results. I usually see object-detection models using attention, region proposals, or saliency to determine the position of objects. There are a couple of Keras object-detection implementations you could try:

keras-rcnn

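import keras_rcnn.models

classes = ["dog", "cat", "hooman"]

backbone = keras_rcnn.models.backbone.VGG16
model = keras_rcnn.models.RCNN((416, 416, 3), classes, backbone)
# Predict on image data (x_train from above), not on the symbolic Input tensor.
boxes, predictions = model.predict(x_train)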

keras-retinanet

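from keras.layers import Input
from keras_retinanet.models.resnet import resnet_retinanet

x = Input(shape=(416, 416, 3))
model = resnet_retinanet(len(classes), inputs=x)
# x_train is the image batch defined above; the original passed an undefined name.
_, _, boxes, _ = model.predict_on_batch(x_train)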
