How are word vectors co-trained with paragraph vectors in doc2vec DBOW?

Submitted by 。_饼干妹妹 on 2019-12-13 19:29:02

Question


I don't understand how word vectors are involved at all in the training process with gensim's doc2vec in DBOW mode (dm=0). I know that it's disabled by default with dbow_words=0. But what happens when we set dbow_words to 1?

In my understanding of DBOW, the context words are predicted directly from the paragraph vectors. So the only parameters of the model are the N p-dimensional paragraph vectors plus the parameters of the classifier.

But multiple sources hint that it is possible in DBOW mode to co-train word and doc vectors. For instance:

  • section 5 of An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation
  • this SO answer: How to use Gensim doc2vec with pre-trained word vectors?

So, how is this done? Any clarification would be much appreciated!

Note: in DM mode, the paragraph vector is averaged/concatenated with the context word vectors to predict the target word. In that case, it's clear that word vectors are trained simultaneously with document vectors, and there are N*p + M*q + classifier parameters (where M is the vocab size and q the word-vector dimension).
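As a rough illustration of that DM combination step (a toy NumPy sketch, not gensim's internals; it uses averaging, which requires q == p, and plain softmax rather than gensim's sampled output layer):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, p = 3, 5, 8                     # docs, vocab size, shared vector dim (q == p here)
doc_vecs = rng.normal(size=(N, p))    # the N*p paragraph-vector parameters
word_vecs = rng.normal(size=(M, p))   # the M*q word-vector parameters
out_w = rng.normal(size=(M, p))       # the classifier (output-layer) parameters

context_ids = [0, 2, 4]               # indices of the words around the target
# Average the doc vector with the context word vectors to form the input
h = np.mean(np.vstack([doc_vecs[1], word_vecs[context_ids]]), axis=0)
scores = out_w @ h                    # one score per vocabulary word
probs = np.exp(scores) / np.exp(scores).sum()   # softmax over the vocabulary
```

A gradient step on the prediction error would then flow back into `out_w`, the context word vectors, and the doc vector alike, which is why all three parameter sets train together in DM mode.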


Answer 1:


If you set dbow_words=1, then skip-gram word-vector training is added to the training loop, interleaved with the normal PV-DBOW training.

So, for a given target word in a text, first the candidate doc-vector is used (alone) to try to predict that word, with backpropagation adjustments then made to the model and doc-vector. Then, each of the surrounding words is used, one at a time in skip-gram fashion, to try to predict that same target word, with follow-up adjustments made.

Then, the next target word in the text gets the same PV-DBOW plus skip-gram treatment, and so on, and so on.

As some logical consequences of this:

  • training takes longer than plain PV-DBOW - by about a factor equal to the window parameter

  • word-vectors overall wind up getting more total training attention than doc-vectors, again by a factor equal to the window parameter
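The interleaving described above can be sketched in a few lines of NumPy (a toy model, not gensim's actual implementation; it uses a single shared output layer and one-negative-sample logistic updates for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
w2i = {w: i for i, w in enumerate(vocab)}
dim, window = 8, 2

doc_vec = rng.normal(scale=0.1, size=dim)               # one paragraph vector
word_in = rng.normal(scale=0.1, size=(len(vocab), dim)) # skip-gram input vectors
out_w = np.zeros((len(vocab), dim))                     # shared output layer

def sgd_pair(vec, target, lr=0.025):
    """One positive + one random-negative logistic update; returns the gradient for vec."""
    grad = np.zeros_like(vec)
    for idx, label in ((target, 1.0), (rng.integers(len(vocab)), 0.0)):
        p = 1.0 / (1.0 + np.exp(-vec @ out_w[idx]))
        g = lr * (label - p)
        grad += g * out_w[idx]
        out_w[idx] += g * vec
    return grad

tokens = [w2i[w] for w in ["the", "cat", "sat", "on", "the", "mat"]]
for pos, target in enumerate(tokens):
    # 1) PV-DBOW step: the doc vector alone predicts the target word
    doc_vec += sgd_pair(doc_vec, target)
    # 2) dbow_words=1 step: each in-window word predicts the same target (skip-gram)
    for j in range(max(0, pos - window), min(len(tokens), pos + window + 1)):
        if j != pos:
            word_in[tokens[j]] += sgd_pair(word_in[tokens[j]], target)
```

Note how the inner loop runs up to 2*window word-vector updates for every single doc-vector update, which is exactly why training slows down and word-vectors receive proportionally more attention.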



Source: https://stackoverflow.com/questions/55592142/how-are-word-vectors-co-trained-with-paragraph-vectors-in-doc2vec-dbow
