Difference between max length of word ngrams and size of context window

若如初见. Submitted on 2020-06-13 08:47:45

Question


In the description of the fastText library for Python (https://github.com/facebookresearch/fastText/tree/master/python), the function for training a supervised model takes several arguments, among them:

  • ws: size of the context window
  • wordNgrams: max length of word ngram.

If I understand it correctly, both of them take the surrounding words of a given word into account, but what is the clear difference between them?


Answer 1:


First, we use the train_unsupervised API to create a word-representation model. There are two techniques we can use: skipgram and cbow. On the other hand, we use the train_supervised API to create a text classification model. You are asking about the train_supervised API, so I will stick to it.

The way text classification works in fastText is to first represent each word using skipgram (by default), and then use these word vectors learned from the skipgram model to classify your input text. The two parameters you asked about (ws and wordNgrams) are related to the skipgram/cbow model.
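As a minimal sketch of where these two parameters are passed in, here is a call to fasttext.train_supervised; the file name "reviews.train" is hypothetical, and it is assumed to contain one labelled example per line in the usual __label__ format:

```python
import fasttext

# Hypothetical training file: each line looks like
# "__label__positive this movie was great ..."
model = fasttext.train_supervised(
    input="reviews.train",  # hypothetical path to labelled training data
    ws=5,                   # size of the context window for the word model
    wordNgrams=2,           # use unigrams and bigrams as features
)

# Classify a new piece of text
print(model.predict("the quick brown fox jumps over the lazy dog"))
```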

The following image contains a simplified illustration of how the input text is used to train a skipgram model. Here, we set the ws parameter to 2 and wordNgrams to 1.

As we can see, we have only one text in our training data: The quick brown fox jumps over the lazy dog. We set the context window to two, which means that we create a window whose center is the center word and whose previous/next two words are the target words. Then we move this window one word at a time. The bigger the window size, the more training samples you have for your model, but also the more the model overfits given a small data sample.
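To make the sliding window concrete, here is a plain-Python illustration (not fastText internals) that enumerates the (center word, context word) pairs produced by a window of size ws=2 over that sentence:

```python
# Enumerate (center, context) pairs for a context window of size ws=2.
sentence = "The quick brown fox jumps over the lazy dog".split()
ws = 2

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - ws), min(len(sentence), i + ws + 1)):
        if j != i:  # skip the center word itself
            pairs.append((center, sentence[j]))

print(pairs[:5])
# [('The', 'quick'), ('The', 'brown'),
#  ('quick', 'The'), ('quick', 'brown'), ('quick', 'fox')]
```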

That covers the first argument, ws. As for the second argument, wordNgrams: if we set wordNgrams to 2, the model also considers two-word pairs (bigrams), as in the following image. (The ws in that image is 1 for simplicity.)
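For illustration only, the extra features that wordNgrams=2 adds are the adjacent word pairs (bigrams) of the input text, alongside the unigrams:

```python
# Bigrams produced from the example sentence when wordNgrams=2.
tokens = "The quick brown fox jumps over the lazy dog".split()

bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
print(bigrams)
# ['The quick', 'quick brown', 'brown fox', 'fox jumps',
#  'jumps over', 'over the', 'the lazy', 'lazy dog']
```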

Ref

  • Check this link, which contains the source code for the train_supervised method.

  • There is a major difference between skipgram and cbow that can be summarized in the following image:
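As a hedged sketch of how the two unsupervised techniques are selected (the file "corpus.txt" is a hypothetical plain-text corpus): skipgram predicts the surrounding words from the center word, while cbow predicts the center word from its surrounding context.

```python
import fasttext

# Train word representations with each technique on a hypothetical corpus.
skipgram_model = fasttext.train_unsupervised("corpus.txt", model="skipgram")
cbow_model = fasttext.train_unsupervised("corpus.txt", model="cbow")

# Inspect the learned vectors and neighborhoods.
print(skipgram_model.get_word_vector("fox")[:5])
print(cbow_model.get_nearest_neighbors("fox"))
```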



Source: https://stackoverflow.com/questions/57507056/difference-between-max-length-of-word-ngrams-and-size-of-context-window
