natural-language-processing

How can I count word frequencies in Word2Vec's training model?

怎甘沉沦 submitted on 2020-06-01 07:04:05
Question: I need to count the frequency of each word in word2vec's training model. I want output that looks like this:

term count
apple 123004
country 4432180
runs 620102
...

Is it possible to do that? How would I get that data out of word2vec?

Answer 1: Which word2vec implementation are you using? In the popular gensim library, after a Word2Vec model has its vocabulary established (either by completing its full training, or after build_vocab() has been called), the model's wv property contains a

Getting vector obtained in the last layer of CNN before softmax layer

点点圈 submitted on 2019-12-11 06:02:24
Question: I am trying to implement a system that encodes inputs using a CNN. After the CNN, I need to get a vector and use it in another deep learning method.

def get_input_representation(self):
    # get word vectors from embedding
    inputs = tf.nn.embedding_lookup(self.embeddings, self.input_placeholder)
    sequence_length = inputs.shape[1]  # 56
    vocabulary_size = 160  # 18765
    embedding_dim = 256
    filter_sizes = [3, 4, 5]
    num_filters = 3
    drop = 0.5
    epochs = 10
    batch_size = 30
    # this returns a tensor
    print("Creating Model
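One common way to extract the vector feeding the softmax layer is to build a second Keras model that shares the trained weights but stops one layer earlier. This is a sketch only: the stand-in architecture, layer name "penultimate", and all shapes below are assumptions, not the asker's actual network.

```python
import numpy as np
import tensorflow as tf

# Illustrative stand-in for a network that ends in a Dense softmax layer.
inputs = tf.keras.Input(shape=(56,))
h = tf.keras.layers.Dense(32, activation="relu", name="penultimate")(inputs)
outputs = tf.keras.layers.Dense(5, activation="softmax")(h)
model = tf.keras.Model(inputs, outputs)

# A second model sharing the same weights, whose output is the activation
# of the layer just before softmax -- the vector the asker wants.
feature_extractor = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("penultimate").output,
)

x = np.random.random((2, 56)).astype("float32")
features = feature_extractor(x)  # one 32-dimensional vector per input row
```

The extracted `features` tensor can then be passed as input to the other deep-learning method mentioned in the question.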

Tensorflow.js tokenizer

徘徊边缘 submitted on 2019-11-30 21:40:21
I'm new to Machine Learning and TensorFlow. Since I don't know Python, I decided to use the JavaScript version (which is more like a wrapper). The problem is that I'm trying to build a model that processes natural language, so the first step is to tokenize the text in order to feed the data to the model. I did a lot of research, but most of it uses the Python version of TensorFlow, with methods like tf.keras.preprocessing.text.Tokenizer, for which I can't find an equivalent in tensorflow.js. I'm stuck at this step and don't know how to transform text into vectors that I can feed to the model. Please help :) To transform
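The core of what Keras's Tokenizer does is small enough to hand-roll: assign each distinct word an integer index, then map texts to lists of those indices. The sketch below shows that logic in Python (the document's language); the helper names `fit_tokenizer` and `texts_to_sequences` are illustrative, and the same few lines port directly to JavaScript for use with tensorflow.js.

```python
def fit_tokenizer(texts):
    """Build a word -> integer index map, reserving 0 for padding."""
    index = {}
    for text in texts:
        for word in text.lower().split():
            if word not in index:
                index[word] = len(index) + 1
    return index

def texts_to_sequences(texts, index):
    """Map each text to a list of integer ids, skipping unknown words."""
    return [[index[w] for w in t.lower().split() if w in index]
            for t in texts]

index = fit_tokenizer(["the cat sat", "the dog ran"])
print(texts_to_sequences(["the cat ran"], index))  # [[1, 2, 5]]
```

The resulting integer sequences (padded to a fixed length) are exactly the kind of tensor a tensorflow.js model can consume.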


What does tf.nn.embedding_lookup function do?

时光毁灭记忆、已成空白 submitted on 2019-11-27 16:39:22
tf.nn.embedding_lookup(params, ids, partition_strategy='mod', name=None) — I cannot understand the purpose of this function. Is it like a lookup table, i.e. does it return the parameters corresponding to each id (in ids)? For instance, in the skip-gram model, if we use tf.nn.embedding_lookup(embeddings, train_inputs), does it find the corresponding embedding for each train_input?

Rafał Józefowicz: The embedding_lookup function retrieves rows of the params tensor. The behavior is similar to using indexing with arrays in numpy. E.g.

matrix = np.random.random([1024, 64])  # 64-dimensional embeddings
ids =
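The numpy analogy in the answer can be made concrete with a tiny, fully worked example (shapes here are illustrative, much smaller than a real embedding matrix):

```python
import numpy as np

# tf.nn.embedding_lookup(params, ids) behaves like numpy fancy indexing:
# it gathers the rows of `params` named by `ids`, repeats included.
params = np.arange(12.0).reshape(4, 3)  # 4 ids, 3-dimensional embeddings
ids = np.array([2, 0, 2])

gathered = params[ids]  # same result embedding_lookup would return
print(gathered)
# [[6. 7. 8.]
#  [0. 1. 2.]
#  [6. 7. 8.]]
```

So in the skip-gram case, tf.nn.embedding_lookup(embeddings, train_inputs) returns one embedding row per entry of train_inputs, exactly as the asker guessed.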
