gensim word2vec accessing in/out vectors

In the word2vec model, there are two linear transforms that take a word in vocab space to a hidden layer (the "in" vector), and then back to the vocab space (the "out" vector)...
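
To make the two transforms concrete, here is a toy sketch in plain Python. The names `W_in` and `W_out`, the three-word vocabulary, and all the numbers are made up for illustration; nothing here is taken from gensim itself.

```python
# Toy illustration of word2vec's two weight matrices (all values are made up).
vocab = ["cat", "dog", "fish"]

# "In" matrix: one row per vocabulary word, hidden_size columns.
W_in = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

# "Out" matrix: here also one row per vocabulary word.
W_out = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

def in_vector(word):
    # Multiplying a one-hot vector by W_in simply selects one row of W_in.
    return W_in[vocab.index(word)]

def out_scores(hidden):
    # The dot product of the hidden vector with each "out" row gives one score per word.
    return [sum(h * w for h, w in zip(hidden, row)) for row in W_out]

hidden = in_vector("dog")
scores = out_scores(hidden)
```

In this picture the "in" vector of a word is its row of `W_in`, and its "out" vector is its row of `W_out`.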

4 Answers

旧巷少年郎

2021-02-07 09:10
In the word2vec.py file, change the following function. It currently writes the "in" vectors, but you want the "out" vectors. The "in" vectors are stored in the syn0 attribute and the "out" vectors in the syn1neg attribute.
```
def save_word2vec_format(self, fname, fvocab=None, binary=False):
  ....
  ....
  # Write the "out" vector (syn1neg) instead of the "in" vector (syn0).
  row = self.syn1neg[vocab.index]
```
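
If editing the library source is undesirable, the same rows can be written out from your own code. The sketch below assumes gensim 4.x attribute names (`model.wv.index_to_key` and `model.syn1neg`) and a model trained with negative sampling. Older releases use different names, so check these against your installed version. `iter_out_vectors` and `write_word2vec_text` are hypothetical helper names, and the `SimpleNamespace` object is only a stand-in so the snippet runs without a trained model.

```python
import io
from types import SimpleNamespace

def iter_out_vectors(model):
    """Yield (word, out_vector) pairs.

    Assumes gensim 4.x naming: model.wv.index_to_key lists words by index and
    model.syn1neg holds one output row per word (negative sampling only).
    """
    for index, word in enumerate(model.wv.index_to_key):
        yield word, model.syn1neg[index]

def write_word2vec_text(pairs, fh):
    """Write pairs in the word2vec text layout: a 'count dim' header, then one word per line."""
    pairs = list(pairs)
    dim = len(pairs[0][1]) if pairs else 0
    fh.write(f"{len(pairs)} {dim}\n")
    for word, vector in pairs:
        fh.write(word + " " + " ".join(repr(float(x)) for x in vector) + "\n")

# Stand-in for a trained model (made-up numbers), so the sketch runs on its own.
fake_model = SimpleNamespace(
    wv=SimpleNamespace(index_to_key=["potato", "tomato"]),
    syn1neg=[[0.5, -1.0], [0.25, 2.0]],
)

buffer = io.StringIO()
write_word2vec_text(iter_out_vectors(fake_model), buffer)
text = buffer.getvalue()
```

With a real model, pass an open text file instead of `buffer`. The resulting file should then be loadable like any other word2vec text file.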
自闭症患者

2021-02-07 09:16
The code below saves and loads a model. It uses pickle internally, optionally mmap'ing the model's large internal NumPy matrices into virtual memory directly from disk files, for inter-process memory sharing.
```
import gensim

model.save('/tmp/mymodel.model')  # model is a trained Word2Vec instance
new_model = gensim.models.Word2Vec.load('/tmp/mymodel.model')  # same path as save()
```
Some background information: Gensim is a free Python library designed to process raw, unstructured digital texts (“plain text”). The algorithms in gensim, such as Latent Semantic Analysis, Latent Dirichlet Allocation, and Random Projections, discover the semantic structure of documents by examining statistical co-occurrence patterns of words within a corpus of training documents.

Some good blog posts that describe usage and include sample code to kick-start a project:
- http://mccormickml.com/2016/04/12/googles-pretrained-word2vec-model-in-python/
- https://rare-technologies.com/making-sense-of-word2vec/
- https://rare-technologies.com/word2vec-tutorial/
- https://rare-technologies.com/deep-learning-with-word2vec-and-gensim/
Installation reference here

Hope this helps!!!
面向向阳花

2021-02-07 09:21
To get the syn1 of any word, this might work.
```
model.syn1[model.wv.vocab['potato'].point]
```
where model is your trained word2vec model.
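
One caveat worth illustrating: with hierarchical softmax, `point` holds the indices of the inner tree nodes on the word's Huffman path, so indexing `syn1` with it selects several rows rather than a single per-word vector. The toy below uses made-up values and plain lists in place of the NumPy arrays gensim stores. `rows_for_path` is a hypothetical helper.

```python
# Toy stand-ins (made-up values): syn1 has one row per inner node of the Huffman tree.
syn1 = [
    [0.1, 0.2],  # inner node 0
    [0.3, 0.4],  # inner node 1
    [0.5, 0.6],  # inner node 2
]

# Hypothetical path for one word: it passes through inner nodes 2 and 0.
point = [2, 0]

def rows_for_path(weights, path):
    # Equivalent to NumPy fancy indexing weights[path] on a 2-D array.
    return [weights[i] for i in path]

rows = rows_for_path(syn1, point)
```

Here `rows` contains one weight row per node on the path, which is what the expression above would return for a word in a model trained with hierarchical softmax.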
面向向阳花

2021-02-07 09:30

While this might not be a proper answer (I can't comment yet), no one has pointed this out: take a look here. The creator seems to answer a similar question. That is also the place where you have a better chance of getting a valid answer.

Digging around in the link he posted: in the word2vec source code, you could change the syn1 deletion to suit your needs. Just remember to delete it once you're done, since it is a memory hog.
