Batch-major vs time-major LSTM
Question: Do RNNs learn different dependency patterns when the input is batch-major as opposed to time-major?

Answer 1: (Edit: sorry, my initial argument was why it makes sense, but I realized that it doesn't, so this is a little off-topic.) I haven't found the TF group's reasoning behind this, but it does not make computational sense, as the ops are written in C++. Intuitively, we want to mash up (multiply/add, etc.) different features from the same sequence at the same timestep. Different timesteps can't be computed in parallel anyway, since each step depends on the hidden state from the previous one.
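To make the layout difference concrete, here is a small NumPy sketch (my own illustration, not from the answer) contrasting the two orderings. The shapes and variable names are assumptions; the point is that when the data is stored time-major, the per-timestep slice the RNN cell consumes at each step is contiguous in memory, whereas in a batch-major array it is strided:

```python
import numpy as np

# Hypothetical toy dimensions for illustration.
batch, time, features = 4, 7, 3

# Batch-major layout: [batch, time, features]
x_bm = np.random.rand(batch, time, features).astype(np.float32)

# Time-major layout: [time, batch, features]. The copy via
# ascontiguousarray matters: a mere transposed view keeps the
# old strides, so we materialize the time-major ordering.
x_tm = np.ascontiguousarray(np.transpose(x_bm, (1, 0, 2)))

# At each step the cell consumes one timestep across the whole batch.
step = x_tm[0]                               # shape (batch, features)
print(step.flags["C_CONTIGUOUS"])            # contiguous slice
print(x_bm[:, 0, :].flags["C_CONTIGUOUS"])   # strided slice
```

This is the usual argument for why TensorFlow's RNN ops historically preferred `time_major=True` inputs: the per-step gather is a cheap contiguous read rather than a strided one. Whether that changes what the network *learns* is a separate question, and the answer above concedes its argument doesn't settle it.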