Question
I don't understand how word vectors are involved in the training process with gensim's doc2vec in DBOW mode (dm=0). I know that word-vector training is disabled by default with dbow_words=0. But what happens when we set dbow_words to 1?
In my understanding of DBOW, the context words are predicted directly from the paragraph vectors. So the only parameters of the model are the N p-dimensional paragraph vectors plus the parameters of the classifier.
But multiple sources hint that it is possible in DBOW mode to co-train word and doc vectors. For instance:
- section 5 of An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation
- this SO answer: How to use Gensim doc2vec with pre-trained word vectors?
So, how is this done? Any clarification would be much appreciated!
Note: for DM, the paragraph vectors are averaged/concatenated with the word vectors to predict the target words. In that case, it's clear that word vectors are trained simultaneously with document vectors, and there are N*p + M*q + classifier parameters (where M is the vocabulary size and q is the word-vector dimension).
Answer 1:
If you set dbow_words=1, then skip-gram word-vector training is added to the training loop, interleaved with the normal PV-DBOW training.
So, for a given target word in a text, first the candidate doc-vector is used (alone) to try to predict that word, with backpropagation adjustments then made to the model and the doc-vector. Then, the surrounding words are each used, one at a time in skip-gram fashion, to try to predict that same target word, with the follow-up adjustments made after each prediction.
Then, the next target word in the text gets the same PV-DBOW plus skip-gram treatment, and so on, and so on.
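The interleaving described above can be sketched in plain Python. This is only an illustration, not gensim's actual code: the function name dbow_words_training_pairs and the tuple layout are made up for this example, and it assumes a fixed symmetric window with no subsampling, negative sampling, or actual gradient updates. It just lists, in order, which input would be used to predict which target word.

```python
from typing import Iterator, List, Tuple


def dbow_words_training_pairs(
    doc_tag: str, words: List[str], window: int
) -> Iterator[Tuple[str, str, str]]:
    """Yield (kind, input, target) triples in the interleaved order.

    Hypothetical helper for illustration; a fixed window is an assumption.
    """
    for pos, target in enumerate(words):
        # PV-DBOW step: the doc-vector alone predicts the target word.
        yield ("doc", doc_tag, target)
        # Skip-gram step: each neighbouring word predicts the same target.
        start = max(0, pos - window)
        end = min(len(words), pos + window + 1)
        for ctx_pos in range(start, end):
            if ctx_pos != pos:
                yield ("word", words[ctx_pos], target)


pairs = list(
    dbow_words_training_pairs("DOC_0", ["the", "cat", "sat", "down"], window=1)
)
for kind, source, target in pairs:
    print(f"{kind:4s} {source:6s} -> {target}")
```

Each "doc" line corresponds to one plain PV-DBOW example, and the "word" lines that follow it are the extra skip-gram examples that dbow_words=1 adds for the same target word.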
As some logical consequences of this:
- training takes longer than plain PV-DBOW, by about a factor equal to the window parameter
- word-vectors overall wind up getting more total training attention than doc-vectors, again by a factor equal to the window parameter
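A rough count makes the "factor of about window" estimate more concrete. The sketch below assumes that, for each target word, the effective window is drawn uniformly from 1..window, as the original word2vec code does; whether a given gensim version does exactly this is an assumption here. The helper count_updates is hypothetical and only counts training examples, it does not train anything.

```python
import random
from typing import Tuple


def count_updates(num_words: int, window: int, seed: int = 0) -> Tuple[int, int]:
    """Count doc-vector and word-vector training examples for one document.

    Assumes one doc-vector example per target word, plus one word-vector
    example per context word inside a randomly shrunk window.
    """
    rng = random.Random(seed)
    doc_updates = 0
    word_updates = 0
    for pos in range(num_words):
        doc_updates += 1  # PV-DBOW: doc-vector predicts the target
        effective = rng.randint(1, window)  # assumed uniform in 1..window
        start = max(0, pos - effective)
        end = min(num_words, pos + effective + 1)
        word_updates += (end - start) - 1  # skip-gram: neighbours predict it
    return doc_updates, word_updates


doc_updates, word_updates = count_updates(num_words=10_000, window=5)
ratio = word_updates / doc_updates
print(f"doc examples: {doc_updates}, word examples: {word_updates}, ratio: {ratio:.2f}")
```

Under these assumptions the average effective window is (window + 1) / 2 on each side, so each target word gets roughly window + 1 skip-gram examples per doc-vector example, which is in line with the "about a factor equal to window" estimate.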
Source: https://stackoverflow.com/questions/55592142/how-are-word-vectors-co-trained-with-paragraph-vectors-in-doc2vec-dbow