Question
I started experimenting with word embeddings, and I found some results which I don't know how to interpret.
I first used an English corpus for both training and testing; afterwards, I used the English corpus for training and a small French corpus for testing (all corpora were annotated for the same binary classification task). In both cases, I used GloVe embeddings pre-trained on tweets.
As the results improved in the case where I used the French corpus (by almost 5%, reaching an accuracy of ~0.8), I was wondering whether GloVe was trained on multilingual data.
I haven't seen anyone make that claim, in contrast to fastText, for example, which provides pre-trained embeddings for many different languages.
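One way to probe this would be to check how many of the French test tokens actually appear in the GloVe vocabulary; if most of them do, the French results aren't coming from out-of-vocabulary zeros. A minimal sketch, assuming the standard GloVe Twitter vectors file and a placeholder token list (both illustrative, not my actual setup):

```python
# Check how much of a French test corpus is covered by the GloVe vocabulary.
# The file name and the token list below are placeholders for illustration.

def load_glove_vocab(path):
    """Collect the vocabulary: the first whitespace-separated field of each line."""
    vocab = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            vocab.add(line.split(" ", 1)[0])
    return vocab

vocab = load_glove_vocab("glove.twitter.27B.100d.txt")  # pre-trained tweet vectors

french_tokens = ["le", "chat", "mange", "la", "souris"]  # placeholder test tokens
covered = [t for t in french_tokens if t in vocab]
print(f"coverage: {len(covered)}/{len(french_tokens)} tokens found in GloVe vocab")
```

High coverage would suggest the French words map to real vectors (e.g., tokens shared across languages, or French text mixed into the tweet data) rather than to unknown-word placeholders.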
Source: https://stackoverflow.com/questions/54746745/glove-word-embeddings-supported-languages