Question
I used the VGG 16-layer Caffe model for image captioning, and I have several captions per image. Now I want to generate a sentence from those captions (words).
I read in a paper on LSTMs that I should remove the SoftMax layer from the training network and feed the 4096-dimensional feature vector from the fc7 layer directly into the LSTM.
I am new to LSTMs and RNNs.
Where should I begin? Is there a tutorial showing how to generate a sentence by sequence labeling?
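For concreteness, extracting the fc7 feature with pycaffe might look roughly like the sketch below (the prototxt/caffemodel paths and the image file are placeholders; the mean values are the commonly used VGG BGR means):

```python
import numpy as np
import caffe

# Placeholder paths -- substitute your own deploy prototxt and weights.
MODEL_DEF = 'VGG_ILSVRC_16_layers_deploy.prototxt'
MODEL_WEIGHTS = 'VGG_ILSVRC_16_layers.caffemodel'

caffe.set_mode_cpu()
net = caffe.Net(MODEL_DEF, MODEL_WEIGHTS, caffe.TEST)

# Standard pycaffe preprocessing: channels first, mean subtraction, RGB -> BGR.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([103.939, 116.779, 123.68]))  # approximate VGG BGR mean
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))

image = caffe.io.load_image('example.jpg')  # placeholder image path
net.blobs['data'].data[...] = transformer.preprocess('data', image)

net.forward()

# fc7 activations: a 4096-dimensional vector per image, taken before the SoftMax.
fc7_features = net.blobs['fc7'].data.copy()
print(fc7_features.shape)  # e.g. (1, 4096)
```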
Answer 1:
AFAIK the master branch of BVLC/caffe does not yet support a recurrent layer architecture.
You should pull the recurrent branch from jeffdonahue/caffe; it supports RNN and LSTM layers.
It also contains a detailed example of generating image captions with a model trained on MS COCO data.
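For orientation, the decoding mechanics used in such a captioning model look roughly like the following toy numpy sketch. It is illustration only: the weights are random and the vocabulary is a dummy list, whereas a real model (e.g. one trained with the MS COCO example in that branch) supplies learned parameters.

```python
import numpy as np

np.random.seed(0)
feat_dim, embed_dim, hidden_dim, vocab_size = 4096, 256, 512, 1000
vocab = ['<eos>'] + ['word%d' % i for i in range(1, vocab_size)]  # dummy vocabulary

# Randomly initialised parameters; in practice these are learned during training.
W_img = np.random.randn(embed_dim, feat_dim) * 0.01       # image feature -> input embedding
W_emb = np.random.randn(vocab_size, embed_dim) * 0.01      # word embeddings
W_x = np.random.randn(4 * hidden_dim, embed_dim) * 0.01    # input-to-gates weights
W_h = np.random.randn(4 * hidden_dim, hidden_dim) * 0.01   # hidden-to-gates weights
b = np.zeros(4 * hidden_dim)
W_out = np.random.randn(vocab_size, hidden_dim) * 0.01     # hidden state -> vocabulary scores

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    """One LSTM step: input, forget, output gates and candidate cell update."""
    gates = W_x.dot(x) + W_h.dot(h) + b
    i, f, o, g = np.split(gates, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def generate_caption(fc7, max_len=20):
    """Greedy decoding: feed the image feature first, then feed back each generated word."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    x = W_img.dot(fc7)        # condition the LSTM on the fc7 feature at the first timestep
    words = []
    for _ in range(max_len):
        h, c = lstm_step(x, h, c)
        word_id = int(np.argmax(W_out.dot(h)))  # greedy choice of the next word
        if vocab[word_id] == '<eos>':
            break
        words.append(vocab[word_id])
        x = W_emb[word_id]    # the chosen word becomes the next input
    return ' '.join(words)

print(generate_caption(np.random.randn(feat_dim)))
```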
Source: https://stackoverflow.com/questions/34494796/how-to-generate-a-sentence-from-feature-vector-or-words