In spacy, how to use your own word2vec model created in gensim?

感情败类 2021-02-19 10:37

I have trained my own word2vec model in gensim and I am trying to load that model in spaCy. First, I need to save it to disk and then try to load an init-model in spaCy but u

2 Answers
  • 2021-02-19 11:11

    Train and save your model in plain-text format:

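    import os

    from gensim.models import Word2Vec
    from gensim.test.utils import common_texts

    # Make sure the output directory exists before saving.
    os.makedirs("./data", exist_ok=True)

    # gensim < 4.0 uses size=100; in gensim 4.0+ the parameter is vector_size=100.
    model = Word2Vec(common_texts, size=100, window=5, min_count=1, workers=4)
    # Write the vectors in the plain-text word2vec format.
    model.wv.save_word2vec_format("./data/word2vec.txt")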
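
As a quick sanity check before handing the file to spaCy, the plain-text word2vec format can be validated with the standard library alone. This is only a minimal sketch, not part of gensim or spaCy: check_word2vec_text is a hypothetical helper name, and it assumes the usual layout of a header line "<vocab_size> <vector_size>" followed by one "word v1 ... vn" line per vector, with no spaces inside words.

```python
def check_word2vec_text(lines):
    """Validate lines in word2vec text format and return (vocab_size, dim).

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    such as an open text file.
    """
    rows = iter(lines)
    header = next(rows, "").split()
    if len(header) != 2:
        raise ValueError("first line must be '<vocab_size> <vector_size>'")
    vocab_size, dim = int(header[0]), int(header[1])

    count = 0
    for number, line in enumerate(rows, start=2):
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines, e.g. at the end of the file
        if len(parts) != dim + 1:
            raise ValueError(
                f"line {number}: expected {dim} values, got {len(parts) - 1}"
            )
        for value in parts[1:]:
            float(value)  # raises ValueError if a component is not numeric
        count += 1

    if count != vocab_size:
        raise ValueError(f"header declares {vocab_size} vectors, found {count}")
    return vocab_size, dim


# Tiny in-memory stand-in for a real vectors file.
sample = ["2 3\n", "human 0.1 0.2 0.3\n", "computer 0.4 0.5 0.6\n"]
result = check_word2vec_text(sample)
print(result)
```

To check the real file, pass an open file object instead of the sample list, for example open("./data/word2vec.txt", encoding="utf-8").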
    

    Gzip the text file:

    gzip ./data/word2vec.txt
    

    This produces ./data/word2vec.txt.gz (and removes the uncompressed .txt file).
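
If the gzip command is not available (for example on Windows), the same compression can be sketched with Python's standard library. gzip_file is a hypothetical helper name; unlike the gzip command, it leaves the original .txt file in place. The demo below writes a tiny stand-in file to a temporary directory rather than touching ./data.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file(src_path, dst_path=None):
    """Compress src_path into dst_path (default: src_path + '.gz').

    Hypothetical helper for illustration; the source file is kept.
    """
    if dst_path is None:
        dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the bytes, no full read into memory
    return dst_path


# Demo on a throwaway file that mimics the word2vec text layout.
with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "word2vec.txt")
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("1 2\nhuman 0.1 0.2\n")

    gz_path = gzip_file(txt_path)

    # Read the header back from the compressed file to confirm the round trip.
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        header = f.readline().strip()

print(header)
```

For the file from the first step, the call would be gzip_file("./data/word2vec.txt").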

    Run the following command (spaCy v2; in spaCy v3, init-model was replaced by init vectors):

    python -m spacy init-model en ./data/spacy.word2vec.model --vectors-loc ./data/word2vec.txt.gz
    

    Load the vectors using:

    import spacy

    nlp = spacy.load('./data/spacy.word2vec.model/')
    
  • 2021-02-19 11:22

    As explained here, you can import custom word vectors that were trained using Gensim, FastText, or Tomas Mikolov's original word2vec implementation by creating a model with:

    wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/word-vectors-v2/cc.la.300.vec.gz
    python -m spacy init-model en your_model --vectors-loc cc.la.300.vec.gz
    

    Then you can load your model with nlp = spacy.load('your_model') and use it.

    Also see the similar question that was answered here.
