I am trying to initialize a tensorflow Variable
with pre-trained word2vec
embeddings.
I have the following code:
import ten
try this:
import tensorflow as tf
from gensim import models
model = models.KeyedVectors.load_word2vec_format('./GoogleNews-vectors-negative300.bin', binary=True)
X = model.syn0
embeddings = tf.Variable(tf.random_uniform(X.shape, minval=-0.1, maxval=0.1), trainable=False)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
embeddings.load(model.syn0, sess)
It seems like the only option is to use a placeholder. The cleanest way I can find is to initialize to a placeholder directly:
X_init = tf.placeholder(tf.float32, shape=(3000000, 300))
X = tf.Variable(X_init)
# The rest of the setup...
sess.run(tf.initialize_all_variables(), feed_dict={X_init: model.syn0})
The easiest solution is to feed_dict'ing it into a placeholder node that you use to tf.assign to the variable.
X = tf.Variable([0.0])
place = tf.placeholder(tf.float32, shape=(3000000, 300))
set_x = X.assign(place)
# set up your session here....
sess.run(set_x, feed_dict={place: model.syn0})
As Joshua Little noted in a separate answer, you can also use it in the initializer:
X = tf.Variable(place) # place as defined above
...
init = tf.initialize_all_variables()
... create sess ...
sess.run(init, feed_dict={place: model.syn0})