I am working on a text classification problem: given some text, I need to assign it labels from a given set.
I have tried using the fastText library by Facebook, w
FastText's native classification mode depends on you training the word-vectors yourself, using texts with known classes. The word-vectors thus become optimized to be useful for the specific classifications observed during training. So that mode typically wouldn't be used with pre-trained vectors.
If using pre-trained word-vectors, you'd then somehow compose those into a text-vector yourself (for example, by averaging the vectors of all the words in a text), then train a separate classifier (such as one of the many options from scikit-learn) using those features.
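As a rough illustration of that averaging approach, here is a minimal sketch. It assumes the pre-trained vectors are already loaded into a dict mapping each word to a NumPy array; the names text_vector and train_classifier, the whitespace tokenization, and the choice of LogisticRegression are illustrative assumptions, not anything prescribed by FastText or scikit-learn.

```python
import numpy as np


def text_vector(text, word_vectors, dim):
    """Average the vectors of the known words in `text`.

    Tokenization is a plain lowercase whitespace split, which is an
    assumption; use whatever matches how the vectors were trained.
    """
    known = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not known:
        # No known words at all: fall back to a zero vector.
        return np.zeros(dim, dtype=np.float32)
    return np.mean(known, axis=0)


def train_classifier(texts, labels, word_vectors, dim):
    """Fit a scikit-learn classifier on the averaged word vectors."""
    # Imported here so text_vector() stays usable without scikit-learn.
    from sklearn.linear_model import LogisticRegression

    features = np.vstack([text_vector(t, word_vectors, dim) for t in texts])
    return LogisticRegression(max_iter=1000).fit(features, labels)
```

To classify a new text, build its feature with text_vector and pass it to the fitted classifier's predict method as a one-row array. Any other scikit-learn classifier could be swapped in for LogisticRegression.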
FastText supervised training has a -pretrainedVectors argument, which can be used like this:
$ ./fasttext supervised -input train.txt -output model -epoch 25 \
-wordNgrams 2 -dim 300 -loss hs -thread 7 -minCount 1 \
-lr 1.0 -verbose 2 -pretrainedVectors wiki.ru.vec
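The train.txt file passed to -input is expected to hold one example per line, with each label marked by a prefix, which is __label__ by default. A small sketch of producing such a file from (text, labels) pairs follows; the helper names to_fasttext_line and write_training_file are hypothetical, and the code assumes labels contain no whitespace.

```python
def to_fasttext_line(text, labels, prefix="__label__"):
    """Format one example as '<prefix>a <prefix>b some text' on a single line."""
    flat_text = " ".join(text.split())  # collapse newlines and repeated spaces
    tags = " ".join(prefix + label for label in labels)
    return f"{tags} {flat_text}"


def write_training_file(examples, path):
    """Write (text, labels) pairs to `path`, one example per line."""
    with open(path, "w", encoding="utf-8") as f:
        for text, labels in examples:
            f.write(to_fasttext_line(text, labels) + "\n")
```

For example, write_training_file([("some text", ["sports"])], "train.txt") would write a single line starting with __label__sports.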
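A few things to consider:
The dimension chosen for training must match that of the pretrained vectors, which is what the -dim 300 argument sets here.
The command above also selects the hierarchical softmax loss function (-loss hs).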
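One way to pick the -dim value is to read it from the pretrained file itself. The sketch below assumes the file is in the text .vec format, whose first line holds the vocabulary size followed by the vector dimension; vec_dimension is a hypothetical helper name.

```python
def vec_dimension(path):
    """Return the vector dimension declared in the header of a text .vec file."""
    with open(path, encoding="utf-8") as f:
        # Header line looks like "<vocabulary size> <dimension>".
        _count, dim = f.readline().split()
    return int(dim)
```

The returned number is what would be passed as -dim alongside -pretrainedVectors.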