I have an input image of 416x416. How can I create an output of 4 x 10, where 4 is the number of columns and 10 the number of rows?
My label data is a 2D array with 4 columns and 10 rows.
First flatten the (None, 13, 13, 1024) layer:

model.add(Flatten())

This gives a 1-dimensional tensor of 13 * 13 * 1024 = 173056 values.

Then add a dense layer:

model.add(Dense(4 * 10))

This outputs 40 values, i.e. it transforms your 3D feature map into a 1D vector.

Then simply reshape it to the form you need:

model.add(Reshape((10, 4)))
This will work, but it will absolutely destroy the spatial nature of your data.
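For reference, here's a minimal sketch of that approach end to end. The InputLayer is just a stand-in for the output of your convolutional backbone (the layer producing the (13, 13, 1024) feature map); substitute your own layers there.

from keras.models import Sequential
from keras.layers import InputLayer, Flatten, Dense, Reshape

model = Sequential()
# Stand-in for your convolutional backbone, which ends in a (13, 13, 1024) feature map
model.add(InputLayer(input_shape=(13, 13, 1024)))
model.add(Flatten())          # 13 * 13 * 1024 = 173056 values
model.add(Dense(10 * 4))      # 40 values
model.add(Reshape((10, 4)))   # 10 rows x 4 columns

model.summary()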
I believe the easiest way to make your predictions conform to the desired output shape is the solution proposed by @Darlyn. Assuming the network you have so far (which outputs tensors of shape (13, 13, 1024)) was declared like this:
from keras.layers import Input, Conv2D

x = Input(shape=(416, 416, 3))
y = Conv2D(32, (3, 3), activation='relu')(x)
...
y = Conv2D(1024, (3, 3), activation='relu')(y)
You just need to add a regression layer that tries to predict the boxes, and then reshape its output to (10, 4):
import numpy as np

from keras.layers import Flatten, Dense, Reshape
from keras.models import Model

samples = 1
boxes = 10

y = Flatten(name='flatten')(y)
y = Dense(boxes * 4, activation='relu')(y)
y = Reshape((boxes, 4), name='predictions')(y)
model = Model(inputs=x, outputs=y)
x_train = np.random.randn(samples, 416, 416, 3)
p = model.predict(x_train)
print(p.shape)
# (1, 10, 4)
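If your labels are already arranged as an array of shape (samples, 10, 4), you can train this model directly. A rough sketch, where the mse loss and the random y_train are just placeholders for your actual choices and data:

# y_train stands in for your real label array of shape (samples, 10, 4)
y_train = np.random.randn(samples, 10, 4)

model.compile(optimizer='adam', loss='mse')
model.fit(x_train, y_train, epochs=1, batch_size=1)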
This works, but I'm not entirely sure that directly regressing these values will produce good results. I usually see object-detection models using attention, region proposals or saliency maps to determine the position of objects. There are a couple of object-detection Keras implementations you could try:
import keras_rcnn.models.backbone

classes = ["dog", "cat", "hooman"]
backbone = keras_rcnn.models.backbone.VGG16
model = keras_rcnn.models.RCNN((416, 416, 3), classes, backbone)
boxes, predictions = model.predict(x_train)  # x_train: a batch of input images
from keras_retinanet.models.resnet import resnet_retinanet

x = Input(shape=(416, 416, 3))
model = resnet_retinanet(len(classes), inputs=x)
_, _, boxes, _ = model.predict_on_batch(x_train)