How to get word2index from gensim

轻奢々 2021-02-13 22:44

According to the docs, we can use this to read a word2vec model with gensim:

from gensim.models import KeyedVectors

model = KeyedVectors.load_word2vec_format('word2vec.50d.txt', binary=False)

This

2 Answers
  • 2021-02-13 23:13

    The word-to-index mappings are in the KeyedVectors vocab property, a dictionary whose values are objects that have an index attribute.

    For example:

    word = "whatever"  # for any word in model
    i = model.vocab[word].index
    model.index2word[i] == word  # will be true
    
  • 2021-02-13 23:14

    An even simpler solution is to enumerate index2word:

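    # w2v is the loaded KeyedVectors model
    word2index = {token: token_index for token_index, token in enumerate(w2v.index2word)}
    word2index['hi'] == 30308  # True for this particular model; the index depends on the vocabulary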
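
Both answers use attribute names from gensim 3.x. In gensim 4.0 and later, vocab and index2word were replaced by key_to_index and index_to_key. The following is a minimal sketch of a helper that tries the newer attribute first and falls back to the older one. The name build_word2index is hypothetical, not part of gensim, and the SimpleNamespace objects are stand-ins for a loaded model so that the sketch runs without gensim installed.

```python
from types import SimpleNamespace


def build_word2index(kv):
    """Return a {word: index} dict for a KeyedVectors-like object.

    Hypothetical helper for illustration; not part of gensim.
    """
    # gensim >= 4.0 exposes the mapping directly as `key_to_index`.
    key_to_index = getattr(kv, "key_to_index", None)
    if key_to_index is not None:
        return dict(key_to_index)
    # gensim 3.x: rebuild the mapping from the ordered word list.
    return {word: i for i, word in enumerate(kv.index2word)}


# Stand-ins for a loaded model (assumption: a real KeyedVectors object
# would be passed here instead).
old_style = SimpleNamespace(index2word=["the", "of", "hi"])
new_style = SimpleNamespace(key_to_index={"the": 0, "of": 1, "hi": 2})

print(build_word2index(old_style)["hi"])  # 2
print(build_word2index(new_style)["hi"])  # 2
```

With a real model, the call would be build_word2index(model), where model is the object returned by KeyedVectors.load_word2vec_format.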
    