Difference between max length of word ngrams and size of context window

若如初见. Submitted on 2020-06-13 08:47:45

Question


In the description of the fastText library for Python (https://github.com/facebookresearch/fastText/tree/master/python), the function for training a supervised model takes several arguments, among them:

  • ws: size of the context window
  • wordNgrams: max length of word ngram.

If I understand it correctly, both of them take the surrounding words of a given word into account, but what is the clear difference between them?


Answer 1:


First, we use the train_unsupervised API to create a word-representation model. There are two techniques we can use: skipgram and cbow. On the other hand, we use the train_supervised API to create a text classification model. You are asking about the train_supervised API, so I will stick to it.

The way text classification works in fastText is to first represent each word using skipgram (by default), and then use these word vectors learned from the skipgram model to classify your input text. The two parameters you asked about (ws and wordNgrams) are related to the skipgram/cbow model.
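As a minimal sketch of where these two parameters are passed in, here is a call to fasttext.train_supervised; the file name "reviews.train" is hypothetical, and it is assumed to contain one labelled example per line in the usual __label__ format:

```python
import fasttext

# Hypothetical training file: each line looks like
# "__label__positive this movie was great ..."
model = fasttext.train_supervised(
    input="reviews.train",  # hypothetical path to labelled training data
    ws=5,                   # size of the context window for the word model
    wordNgrams=2,           # use unigrams and bigrams as features
)

# Classify a new piece of text
print(model.predict("the quick brown fox jumps over the lazy dog"))
```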

The following image contains a simplified illustration of how the input text is used to train a skipgram model. Here, we set the ws parameter to 2 and wordNgrams to 1.

As we can see, we have only one text in our training data: The quick brown fox jumps over the lazy dog. We set the context window to two, which means that we create a window whose center is the center word and whose previous/next two words are the target words. Then we move this window one word at a time. The bigger the window size, the more training samples you have for your model, but also the more the model overfits given a small data sample.
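To make the sliding window concrete, here is a plain-Python illustration (not fastText internals) that enumerates the (center word, context word) pairs produced by a window of size ws=2 over that sentence:

```python
# Enumerate (center, context) pairs for a context window of size ws=2.
sentence = "The quick brown fox jumps over the lazy dog".split()
ws = 2

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - ws), min(len(sentence), i + ws + 1)):
        if j != i:  # skip the center word itself
            pairs.append((center, sentence[j]))

print(pairs[:5])
# [('The', 'quick'), ('The', 'brown'),
#  ('quick', 'The'), ('quick', 'brown'), ('quick', 'fox')]
```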

That covers the first argument, ws. As for the second argument, wordNgrams: if we set wordNgrams to 2, the model also considers two-word pairs (bigrams), as in the following image. (The ws in that image is 1 for simplicity.)
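For illustration only, the extra features that wordNgrams=2 adds are the adjacent word pairs (bigrams) of the input text, alongside the unigrams:

```python
# Bigrams produced from the example sentence when wordNgrams=2.
tokens = "The quick brown fox jumps over the lazy dog".split()

bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
print(bigrams)
# ['The quick', 'quick brown', 'brown fox', 'fox jumps',
#  'jumps over', 'over the', 'the lazy', 'lazy dog']
```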

Ref

  • Check this link, which contains the source code for the train_supervised method.

  • There is a major difference between skipgram and cbow that can be summarized in the following image:
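As a hedged sketch of how the two unsupervised techniques are selected (the file "corpus.txt" is a hypothetical plain-text corpus): skipgram predicts the surrounding words from the center word, while cbow predicts the center word from its surrounding context.

```python
import fasttext

# Train word representations with each technique on a hypothetical corpus.
skipgram_model = fasttext.train_unsupervised("corpus.txt", model="skipgram")
cbow_model = fasttext.train_unsupervised("corpus.txt", model="cbow")

# Inspect the learned vectors and neighborhoods.
print(skipgram_model.get_word_vector("fox")[:5])
print(cbow_model.get_nearest_neighbors("fox"))
```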



Source: https://stackoverflow.com/questions/57507056/difference-between-max-length-of-word-ngrams-and-size-of-context-window
