I am trying to re-train a word2vec model in Keras 2 with the TensorFlow backend by using pre-trained embeddings and a custom corpus.
This is how I initialize the embeddings layer:
Instead of using the embeddings_initializer
argument of the Embedding layer, you can load pre-trained weights for your embedding layer with the weights
argument. This way you should be able to hand over pre-trained embeddings larger than 2 GB.
Here is a short example:
from keras.layers import Embedding

embedding_layer = Embedding(vocab_size,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
Here embedding_matrix
is just a regular NumPy matrix of shape (vocab_size, EMBEDDING_DIM) containing your pre-trained weights.
For further examples you can also take a look here:
https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html
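As an illustration, here is a minimal sketch of how such an embedding_matrix could be built from pre-trained word2vec vectors. The gensim KeyedVectors file path and the word_index dict (normally tokenizer.word_index from a Keras Tokenizer) are assumptions for the example, not values from the question:

import numpy as np
from gensim.models import KeyedVectors

# hypothetical word index; normally tokenizer.word_index from keras.preprocessing.text.Tokenizer
word_index = {'hello': 1, 'world': 2}

# load pre-trained word2vec vectors with gensim (the path is only an example)
word_vectors = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
EMBEDDING_DIM = word_vectors.vector_size

vocab_size = len(word_index) + 1  # +1 because Keras reserves index 0 for padding
embedding_matrix = np.zeros((vocab_size, EMBEDDING_DIM))
for word, i in word_index.items():
    if word in word_vectors:
        embedding_matrix[i] = word_vectors[word]
    # words missing from the pre-trained vectors keep all-zero rows

The resulting embedding_matrix can then be passed to the Embedding layer exactly as in the snippet above.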
Edit:
As @PavlinMavrodiev (see end of question) correctly pointed out, the weights
argument is deprecated. He instead used the layer method set_weights to set the weights:
layer.set_weights(weights)
: sets the weights of the layer from a list of NumPy arrays (with the same shapes as the output of get_weights
).
To retrieve the trained weights, get_weights
can be used:
layer.get_weights()
: returns the weights of the layer as a list of NumPy arrays.
Both are methods of the Keras Layer base class and can be used for all Keras layers, including the embedding layer.
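As a minimal sketch of that approach (the sizes and the random embedding_matrix below are placeholders, not values from the question): build the Embedding layer without the weights argument, add it to a model so its weight variables exist, and then overwrite them with set_weights:

import numpy as np
from keras.layers import Embedding
from keras.models import Sequential

vocab_size = 1000              # placeholder vocabulary size
EMBEDDING_DIM = 100            # placeholder embedding dimension
MAX_SEQUENCE_LENGTH = 50       # placeholder sequence length
# stand-in for your real pre-trained matrix of shape (vocab_size, EMBEDDING_DIM)
embedding_matrix = np.random.rand(vocab_size, EMBEDDING_DIM)

embedding_layer = Embedding(vocab_size,
                            EMBEDDING_DIM,
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)

model = Sequential()
model.add(embedding_layer)     # adding the layer builds it, so its weight variables now exist

# overwrite the randomly initialized weights with the pre-trained matrix
embedding_layer.set_weights([embedding_matrix])

# get_weights returns the same list of NumPy arrays
assert np.allclose(embedding_layer.get_weights()[0], embedding_matrix)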