word-embedding

What does a weighted word embedding mean?

白昼怎懂夜的黑 submitted on 2019-12-20 10:08:05
Question: In the paper that I am trying to implement, it says: "In this work, tweets were modeled using three types of text representation. The first one is a bag-of-words model weighted by tf-idf (term frequency - inverse document frequency) (Section 2.1.1). The second represents a sentence by averaging the word embeddings of all words (in the sentence), and the third represents a sentence by averaging the weighted word embeddings of all words, where the weight of a word is given by tf-idf (Section 2.1.2)." I …
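
A weighted word embedding in this sense is just a weighted average: each word's vector is scaled by its tf-idf weight before averaging, instead of every word counting equally. A minimal sketch, assuming a gensim-style word-vector model w2v and a precomputed tfidf dict mapping word to weight (both names are assumptions, not from the paper):

    import numpy as np

    def sentence_vector(tokens, w2v, tfidf, dim):
        # Plain average: sum of word vectors divided by the word count.
        # Weighted average: each vector is scaled by its tf-idf weight,
        # and the sum is divided by the total weight instead of the count.
        vec = np.zeros(dim)
        total_weight = 0.0
        for word in tokens:
            if word in w2v and word in tfidf:
                vec += tfidf[word] * w2v[word]
                total_weight += tfidf[word]
        return vec / total_weight if total_weight > 0 else vec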

How to get word vectors from Keras Embedding Layer

孤街醉人 submitted on 2019-12-18 04:00:38
Question: I'm currently working with a Keras model which has an embedding layer as its first layer. In order to visualize the relationships and similarity between words, I need a function that returns the mapping of words to vectors for every element in the vocabulary (e.g. 'love' - [0.21, 0.56, ..., 0.65, 0.10]). Is there any way to do it? Answer 1: You can get the word embeddings by using the get_weights() method of the embedding layer (essentially, the weights of an embedding layer are the …
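
A minimal sketch of that answer, assuming a trained model whose first layer is the Embedding layer and a word_index dict from a Keras Tokenizer (both names are assumptions, not from the question):

    # The embedding layer's weights form a single matrix of shape
    # (vocabulary_size, embedding_dim); row i is the vector for word index i.
    embedding_matrix = model.layers[0].get_weights()[0]

    # Map each word to its vector via the tokenizer's word_index.
    word_vectors = {word: embedding_matrix[idx]
                    for word, idx in word_index.items()
                    if idx < embedding_matrix.shape[0]}

    print(word_vectors['love'])  # e.g. array([0.21, 0.56, ..., 0.65, 0.10])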

Entity Embedding of Categorical within Time Series Data and LSTM

放肆的年华 submitted on 2019-12-13 03:19:15
Question: I'm trying to solve a time series problem. In short, for each customer and material (SKU code), I have different orders placed in the past. I need to build a model that predicts the number of days until the next order for each customer and material. What I'm trying to do is build an LSTM model in Keras, where for each customer and material I have 50 max-padded timesteps of history, and I'm using a mix of numeric features (# of days since the previous order, average days between orders in the last 60 days, etc. …
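
One common pattern for this kind of setup (a hedged sketch, not the asker's actual architecture; all sizes and names below are assumptions) is to feed the categorical IDs through Embedding layers, concatenate their vectors with the numeric features at each timestep, and run an LSTM over the result:

    from tensorflow.keras import layers, Model

    TIMESTEPS, N_NUMERIC = 50, 4          # assumed sizes
    N_CUSTOMERS, N_MATERIALS = 1000, 500  # assumed vocabulary sizes

    numeric_in = layers.Input(shape=(TIMESTEPS, N_NUMERIC))
    customer_in = layers.Input(shape=(TIMESTEPS,), dtype='int32')
    material_in = layers.Input(shape=(TIMESTEPS,), dtype='int32')

    # Entity embeddings: each categorical ID becomes a dense vector per timestep.
    customer_emb = layers.Embedding(N_CUSTOMERS, 8)(customer_in)
    material_emb = layers.Embedding(N_MATERIALS, 8)(material_in)

    x = layers.Concatenate()([numeric_in, customer_emb, material_emb])
    x = layers.LSTM(64)(x)
    out = layers.Dense(1)(x)  # predicted days until the next order

    model = Model([numeric_in, customer_in, material_in], out)
    model.compile(optimizer='adam', loss='mae')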

Issues in Gensim WordRank Embeddings

混江龙づ霸主 submitted on 2019-12-11 16:59:37
Question: I am using the Gensim wrapper to obtain WordRank embeddings (I am following their tutorial to do this) as follows:

    from gensim.models.wrappers import Wordrank
    model = Wordrank.train(wr_path="models", corpus_file="proc_brown_corp.txt", out_name="wr_model")
    model.save("wordrank")
    model.save_word2vec_format("wordrank_in_word2vec.vec")

However, I am getting the following error: FileNotFoundError: [WinError 2] The system cannot find the file specified. I am just wondering what I have done wrong, as …
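
For reference, the gensim wrapper shells out to the compiled WordRank binaries, so wr_path is expected to point at the directory where WordRank was built, not at an output folder; the binary not being found there is one plausible cause of WinError 2, though that is an assumption rather than a confirmed diagnosis. A sketch of the intended call, with a placeholder path:

    from gensim.models.wrappers import Wordrank

    # wr_path: directory containing the compiled WordRank executables (placeholder).
    wr_path = "C:/path/to/wordrank"
    model = Wordrank.train(wr_path, "proc_brown_corp.txt", out_name="wr_model")
    model.save("wordrank")
    model.save_word2vec_format("wordrank_in_word2vec.vec")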

Using subword information in OOV token from fasttext in word embedding layer (keras/tensorflow)

我们两清 submitted on 2019-12-11 10:57:35
Question: I have my own fastText model and have trained a Keras classification model with a word embedding layer on top of it. But I wonder how I can make use of my model's subword information for OOV words, since the word embedding layer looks up word vectors via indices and OOV words have no index. Even if an OOV token had an index, how would I assign the proper word vector to it on the fly for an already trained model? Thanks in advance! Source: https://stackoverflow.com/questions/56043487
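
One workaround (a sketch under assumptions, not the asker's solution) is to bypass the index lookup for OOV tokens and query the fastText model directly, since gensim's FastText can compose a vector for an unseen word from its character n-grams; the resulting vector can then be appended to the embedding matrix or supplied to the model as a precomputed feature. The model path, word_index, and embedding_matrix names below are assumed:

    from gensim.models import FastText

    ft = FastText.load("my_fasttext.model")  # placeholder path

    def lookup(word, word_index, embedding_matrix):
        # Known words use the trained embedding row;
        # OOV words fall back to fastText's subword composition.
        idx = word_index.get(word)
        if idx is not None and idx < embedding_matrix.shape[0]:
            return embedding_matrix[idx]
        return ft.wv[word]  # built from character n-grams for OOV words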

What does the embedding layer for a network look like?

99封情书 submitted on 2019-12-11 07:19:29
Question: I am just starting with text classification, and I am stuck on the embedding layer. If I have a batch of sequences encoded as integers corresponding to each word, what does the embedding layer look like? Are there neurons like in a normal neural layer? I've seen keras.layers.Embedding, but after reading the documentation I'm really confused about how it works. I can understand input_dim, but why is output_dim a 2D matrix? How many weights do I have in this embedding layer? I'm sorry if my …
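
An embedding layer is essentially a trainable lookup table: its only weights form a matrix of shape (input_dim, output_dim), and each integer in the input sequence selects one row of that matrix, so there are no neurons or activations in the usual sense. A small sketch, with sizes made up purely for illustration:

    import numpy as np
    from tensorflow.keras.layers import Embedding

    # Vocabulary of 1000 words, each mapped to an 8-dimensional vector:
    # the layer holds 1000 * 8 = 8000 trainable weights and nothing else.
    emb = Embedding(input_dim=1000, output_dim=8)

    batch = np.array([[4, 17, 3], [9, 9, 250]])  # two sequences of word indices
    out = emb(batch)
    print(out.shape)  # (2, 3, 8): each index is replaced by its 8-dim row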

ValueError: cannot reshape array of size 3800 into shape (1,200)

狂风中的少年 submitted on 2019-12-11 06:47:41
Question: I am trying to apply word embeddings to tweets. I was trying to create a vector for each tweet by taking the average of the vectors of the words present in the tweet, as follows:

    def word_vector(tokens, size):
        vec = np.zeros(size).reshape((1, size))
        count = 0.
        for word in tokens:
            try:
                vec += model_w2v[word].reshape((1, size))
                count += 1.
            except KeyError:  # handling the case where the token is not in vocabulary
                continue
        if count != 0:
            vec /= count
        return vec

Next, when I try to prepare the word2vec …
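
The reshape fails because model_w2v[word] is not returning a single 200-dimensional vector: 3800 = 19 x 200, which is what you would get if word were itself a list of 19 tokens rather than one token (gensim returns a 2D array when indexed with a list of words). A hedged sketch of the usual fix, assuming the cause above and reusing the names from the snippet, is to make sure the function receives one tokenized tweet at a time:

    import numpy as np

    # Assumes tokenized_tweets is a list of token lists, one list per tweet,
    # and model_w2v is a gensim word-vector model with 200-dimensional vectors.
    size = 200
    tweet_vectors = np.zeros((len(tokenized_tweets), size))
    for i, tokens in enumerate(tokenized_tweets):
        # One tweet's tokens at a time, never the whole nested corpus.
        tweet_vectors[i, :] = word_vector(tokens, size)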

Initializing Out of Vocabulary (OOV) tokens

喜夏-厌秋 submitted on 2019-12-11 06:22:44
Question: I am building a TensorFlow model for an NLP task, and I am using the pretrained GloVe 300d word-vector/embedding dataset. Obviously some tokens can't be resolved to embeddings, because they were not included in the training corpus of the word-vector model, e.g. rare names. I could replace those tokens with vectors of zeros, but rather than dropping this information on the floor, I would prefer to encode it somehow and include it in my training data. Say I have the word 'raijin', which can't be resolved to an embedding …
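
A common approach (a sketch of one option, not necessarily what the asker ended up doing) is to give each OOV token its own randomly initialized row, drawn at roughly the same scale as the GloVe vectors, and let it be trained along with the rest of the embedding matrix. The glove dict and vocab list below are assumed names:

    import numpy as np

    EMB_DIM = 300
    rng = np.random.default_rng(0)

    # glove: dict of word -> 300-dim vector loaded from the GloVe file (assumed).
    # vocab: list of tokens in the task's vocabulary (assumed).
    embedding_matrix = np.zeros((len(vocab), EMB_DIM), dtype=np.float32)
    for i, token in enumerate(vocab):
        if token in glove:
            embedding_matrix[i] = glove[token]
        else:
            # OOV token such as 'raijin': a small random vector instead of zeros,
            # so the model can learn something for it if the layer is trainable.
            embedding_matrix[i] = rng.normal(scale=0.6, size=EMB_DIM)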

Multiple embedding layers in keras

有些话、适合烂在心里 submitted on 2019-12-11 05:14:02
Question: With pretrained embeddings, we can specify them as weights in Keras' embedding layer. To use multiple embeddings, would specifying multiple embedding layers be suitable? i.e.

    embedding_layer1 = Embedding(len(word_index) + 1,
                                 EMBEDDING_DIM,
                                 weights=[embedding_matrix_1],
                                 input_length=MAX_SEQUENCE_LENGTH,
                                 trainable=False)
    embedding_layer2 = Embedding(len(word_index) + 1,
                                 EMBEDDING_DIM,
                                 weights=[embedding_matrix_2],
                                 input_length=MAX_SEQUENCE_LENGTH,
                                 trainable=False)
    model.add(embedding_layer1)
    …
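
Stacking two Embedding layers one after another in a Sequential model will not work, because the second layer would receive dense vectors rather than integer indices. A more usual arrangement (a sketch reusing the assumed names from the snippet above) applies both embeddings to the same input and concatenates their outputs along the feature axis:

    from tensorflow.keras.layers import Input, Concatenate
    from tensorflow.keras.models import Model

    seq_in = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    emb1 = embedding_layer1(seq_in)              # (batch, seq_len, EMBEDDING_DIM)
    emb2 = embedding_layer2(seq_in)              # (batch, seq_len, EMBEDDING_DIM)
    merged = Concatenate(axis=-1)([emb1, emb2])  # (batch, seq_len, 2 * EMBEDDING_DIM)

    model = Model(seq_in, merged)  # continue with LSTM/Conv1D/Dense layers as needed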