How to train the GloVe algorithm on my own corpus

孤街浪徒 2021-02-04 01:59

I tried to follow this, but somehow I wasted a lot of time and ended up with nothing useful.
I just want to train a GloVe model on my own corpus (~900Mb corpu

4 answers
  •  名媛妹妹
    2021-02-04 02:43

    You can do it using the glove_python library:

    Install it: pip install glove_python

    Then:

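    from glove import Corpus, Glove

    # `lines` must be an iterable of tokenized sentences (lists of strings).
    # 'corpus.txt' is a placeholder path: one sentence per line, split on whitespace.
    with open('corpus.txt', encoding='utf-8') as f:
        lines = [line.split() for line in f]

    # Create a corpus object
    corpus = Corpus()

    # Fit the corpus to build the co-occurrence matrix that GloVe is trained on
    corpus.fit(lines, window=10)

    # no_components is the dimensionality of the word vectors
    glove = Glove(no_components=5, learning_rate=0.05)
    glove.fit(corpus.matrix, epochs=30, no_threads=4, verbose=True)

    # Attach the word-to-id dictionary so vectors can be looked up by word
    glove.add_dictionary(corpus.dictionary)
    glove.save('glove.model')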
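
For a corpus of around 900 MB, `lines` is probably best streamed from disk rather than built as one big list in memory. The following is a minimal sketch, not part of glove_python: the `LineSentences` class, the regex tokenizer, and the one-sentence-per-line file layout are all assumptions made for illustration.

```python
import re
from typing import Iterator, List

# Assumption: lowercase word characters are a good enough tokenizer for the corpus.
TOKEN_RE = re.compile(r"\w+")


class LineSentences:
    """Re-iterable stream of token lists, one per non-empty line of a text file."""

    def __init__(self, path: str, encoding: str = "utf-8") -> None:
        self.path = path
        self.encoding = encoding

    def __iter__(self) -> Iterator[List[str]]:
        # The file is reopened on every iteration, so the object can be
        # passed around and consumed more than once.
        with open(self.path, encoding=self.encoding) as handle:
            for raw in handle:
                tokens = TOKEN_RE.findall(raw.lower())
                if tokens:  # skip blank lines
                    yield tokens
```

With this in place, `lines = LineSentences('corpus.txt')` (a placeholder path) could be handed to `corpus.fit(lines, window=10)`, so only one line of text is tokenized at a time.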
    

    Reference: word vectorization using glove
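
To make the co-occurrence step less of a black box, here is a rough pure-Python sketch of what a windowed co-occurrence count looks like. It follows the weighting described in the GloVe paper, where a pair of words d tokens apart contributes 1/d. The function name `cooccurrence_counts` is hypothetical, and this is an illustration of the idea, not a description of glove_python's internals.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cooccurrence_counts(
    sentences: Iterable[List[str]], window: int = 10
) -> Dict[Tuple[str, str], float]:
    """Symmetric co-occurrence counts, weighted by 1/distance within each sentence."""
    counts: Dict[Tuple[str, str], float] = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Only look back at the preceding `window` tokens; each unordered
            # pair is stored once under a sorted key.
            for j in range(max(0, i - window), i):
                pair = tuple(sorted((tokens[j], word)))
                counts[pair] += 1.0 / (i - j)
    return dict(counts)


toy = [["a", "b", "c"]]
print(cooccurrence_counts(toy, window=2))
```

In the toy sentence, adjacent words get weight 1.0 and the pair two tokens apart gets 0.5. A dictionary like this would not scale to a 900 MB corpus, which is why the library builds a sparse matrix instead.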
