recurrent-neural-network

How to visualize RNN/LSTM weights in Keras/TensorFlow?

こ雲淡風輕ζ submitted on 2019-12-24 06:52:33
Question: I've come across research publications and Q&As discussing the need to inspect RNN weights; some related answers point in the right direction, suggesting get_weights() - but how do I actually visualize the weights meaningfully? Namely, LSTMs and GRUs have gates, and all RNNs have channels that serve as independent feature extractors - so how do I (1) fetch per-gate weights, and (2) plot them in an informative manner? Answer 1: Keras/TF build RNN weights in a well-defined order, which can be
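Since the answer above is cut off, here is a minimal sketch of the idea it starts to describe: Keras stores an LSTM kernel with the four gates concatenated along the last axis, in the order input, forget, cell (candidate), output, so per-gate weights can be recovered by slicing. The kernel below is a random stand-in for what lstm_layer.get_weights()[0] would return.

```python
import numpy as np

# Random stand-in for lstm_layer.get_weights()[0]; Keras stores the
# kernel as (input_dim, 4 * units) with gates concatenated in the
# order: input, forget, cell (candidate), output.
units, input_dim = 8, 5
kernel = np.random.randn(input_dim, 4 * units)

gates = {name: kernel[:, i * units:(i + 1) * units]
         for i, name in enumerate(["input", "forget", "cell", "output"])}

# each gate's weights can now be plotted separately, e.g. one heatmap per gate
for name, w in gates.items():
    print(name, w.shape)
```

The same slicing applies to the recurrent kernel (get_weights()[1], shape (units, 4 * units)) and the bias (get_weights()[2], shape (4 * units,)).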

How to switch Off/On an LSTM layer?

时间秒杀一切 submitted on 2019-12-24 06:34:49
Question: I am looking for a way to access the LSTM layer such that adding and removing the layer is event-driven, i.e. the layer can be added or removed when a function is triggered. For example (hypothetically): add an LSTM layer if a = 2 and remove it if a = 3, where a = 2 and a = 3 come from a Python function that returns a specific value, based on which the LSTM layer should be added or removed. I want to add a switch function to the layer so that it can be switched
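Keras has no built-in way to toggle a layer inside a compiled model; the usual pattern is to rebuild the model (transferring weights where needed) when the trigger fires. Below is a pure-Python sketch of that rebuild-on-trigger control flow, with plain callables standing in for hypothetical layers:

```python
# Plain callables stand in for hypothetical Keras layers; in practice
# you would rebuild the Keras model and copy weights across.
def build_pipeline(a):
    layers = [lambda x: x + 1]          # stand-in for a base layer
    if a == 2:                          # trigger: include the "LSTM" stage
        layers.append(lambda x: x * 2)  # stand-in for the LSTM layer
    # a == 3 (or anything else) rebuilds without the extra stage

    def forward(x):
        for layer in layers:
            x = layer(x)
        return x
    return forward

with_lstm = build_pipeline(2)     # "LSTM" present
without_lstm = build_pipeline(3)  # "LSTM" removed
```

The key point is that the switch happens at model-construction time, not inside a fixed computation graph.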

Dynamic graphs in tensorflow

老子叫甜甜 submitted on 2019-12-24 05:57:30
Question: I would like to implement a 2D LSTM as in this paper, and specifically I would like to do so dynamically, using tf.while. In brief, this network works as follows: order the pixels in an image so that pixel (i, j) -> i * width + j, then run a 2D LSTM over this sequence. The difference between a 2D and a regular LSTM is that, besides the recurrent connection to the previous element in the sequence, there is also one to the pixel directly above the current pixel; so pixel (i, j) has connections to (i - 1, j) and (i, j - 1). What I
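The pixel ordering and the two recurrent connections described above come down to index arithmetic; a small framework-free helper (names are illustrative) makes the wiring concrete:

```python
def flatten_index(i, j, width):
    # pixel (i, j) -> position i * width + j in the scan order
    return i * width + j

def recurrent_sources(i, j, width):
    # the 2D LSTM cell at (i, j) receives state from the pixel to the
    # left, (i, j - 1), and the pixel above, (i - 1, j), when they exist
    sources = []
    if j > 0:
        sources.append(flatten_index(i, j - 1, width))
    if i > 0:
        sources.append(flatten_index(i - 1, j, width))
    return sources
```

Inside a tf.while loop over the flattened sequence, these indices say which previously computed states to gather at each step.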

LSTM - Use deltaTime as a feature? How to handle irregular timestamps?

家住魔仙堡 submitted on 2019-12-24 04:48:11
Question: I'm trying to create an LSTM for classifying data sequences. The structure of every training input I would use is: [[ [deltaX,deltaY,deltaTime], [deltaX,deltaY,deltaTime], ... ], class] where deltaX and deltaY reflect the change of X and Y over a given time deltaTime. deltaTime is not the same every time; it can vary from 40 ms to 50 ms to sometimes 1000 ms. The 'class' at the end is a binary label, either 0 or 1. Question 1 (regular LSTM): Should I include
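One common way to handle the irregular deltaTime described above is to keep it as a third input feature but compress its range (e.g., with a log transform) so that 40 ms and 1000 ms steps land on a comparable scale. A minimal numpy sketch:

```python
import numpy as np

# two [deltaX, deltaY, deltaTime] steps with very different deltaTime
steps = np.array([[1.0, 2.0, 40.0],
                  [0.5, -1.0, 1000.0]])

# log-compress deltaTime so extreme gaps don't dominate the feature scale
steps[:, 2] = np.log1p(steps[:, 2])
```

The ordering of the gaps is preserved (longer gaps still map to larger values), but the 25x raw spread is reduced to roughly 2x after the transform.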

How to visualize RNN/LSTM gradients in Keras/TensorFlow?

旧时模样 submitted on 2019-12-24 03:48:11
Question: I've come across research publications and Q&As discussing the need to inspect RNN gradients per backpropagation through time (BPTT) - i.e., the gradient for each timestep. The main use is introspection: how do we know if an RNN is learning long-term dependencies? That is a question of its own, but the most important insight is gradient flow: if a non-zero gradient flows through every timestep, then every timestep contributes to learning - i.e., the resultant gradients stem from accounting for
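Whatever mechanism produces the per-timestep gradients (e.g., tf.GradientTape in TF2), a common way to inspect gradient flow is to reduce the gradient tensor to one norm per timestep. A numpy sketch with a random stand-in for a gradient tensor of shape (batch, timesteps, channels):

```python
import numpy as np

# random stand-in for what, e.g., tape.gradient(loss, rnn_input)
# would return: shape (batch, timesteps, channels)
rng = np.random.default_rng(0)
grads = rng.normal(size=(32, 10, 8))

# L2 norm over channels, averaged over the batch -> one value per timestep
per_timestep = np.linalg.norm(grads, axis=-1).mean(axis=0)

# if every entry is non-zero, gradient flows through every timestep
flows_everywhere = bool((per_timestep > 0).all())
```

Plotting per_timestep against the timestep index makes vanishing gradients visible as a curve decaying toward zero for early timesteps.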

How to use previous output and hidden states from LSTM for the attention mechanism?

我是研究僧i submitted on 2019-12-24 00:59:32
Question: I am currently trying to code the attention mechanism from this paper: "Effective Approaches to Attention-based Neural Machine Translation", Luong, Pham, Manning (2015). (I use global attention with the dot score.) However, I am unsure how to feed in the hidden and output states from the LSTM decoder. The issue is that the input of the LSTM decoder at time t depends on quantities that I need to compute using the output and hidden states from t-1. Here is the relevant part of the code: with tf
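For reference, Luong-style global attention with the dot score reduces to a few lines; here is a numpy sketch (names and shapes are illustrative, not taken from the question's code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def luong_dot_attention(h_t, encoder_states):
    # h_t: (units,) decoder hidden state at time t
    # encoder_states: (src_len, units) all encoder hidden states
    scores = encoder_states @ h_t     # dot score, one per source position
    alpha = softmax(scores)           # attention weights over the source
    context = alpha @ encoder_states  # (units,) weighted context vector
    return context, alpha

rng = np.random.default_rng(1)
h_t = rng.normal(size=4)
encoder_states = rng.normal(size=(6, 4))
context, alpha = luong_dot_attention(h_t, encoder_states)
```

In the input-feeding setup the question describes, the context (or attentional vector) from step t-1 is concatenated with the decoder input at step t, which is why the loop must carry it forward explicitly.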

How to organize the Recurrent Neural Network?

雨燕双飞 submitted on 2019-12-24 00:55:47
Question: I want to model the following: y(t) = F(x(t-1), x(t-2), ..., x(t-k)), i.e. a function whose current output depends on the last k inputs. 1- I know one option is a classic neural network with the k inputs {x(t-1), x(t-2), ..., x(t-k)} for each y(t), trained accordingly. What, then, is the benefit of using an RNN for this problem? 2- Assuming an RNN, should I use only x(t) (or x(t-1)) and assume the hidden layer(s) can find the relation of y(t) to the past k inputs through having the in
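Option 1 above (a feed-forward network over a window of the last k inputs) amounts to building sliding windows explicitly; a numpy sketch of that framing, which an RNN would otherwise have to learn implicitly from a one-step-at-a-time input stream:

```python
import numpy as np

def make_windows(x, k):
    # row for y(t) holds [x(t-1), x(t-2), ..., x(t-k)]
    return np.stack([x[t - k:t][::-1] for t in range(k, len(x))])

x = np.arange(10)
windows = make_windows(x, k=3)
```

One benefit of the RNN formulation is that k need not be fixed in advance: the hidden state can, in principle, summarize an arbitrarily long history, whereas the windowed network is hard-limited to exactly k lags.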

How to use fit_generator with sequential data that is split into batches?

非 Y 不嫁゛ submitted on 2019-12-23 19:51:05
Question: I am trying to write a generator for my Keras LSTM model, to use with the fit_generator method. My first question is: what should my generator return? A batch? A sequence? The example in the Keras documentation returns (x, y) for each data entry, but what if my data is sequential and I want to split it into batches? Here is the Python method that creates a batch for a given input: def get_batch(data, batch_num, batch_size, seq_length): i_start = batch_num*batch_size; batch_sequences = [] batch_labels = []
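One answer to "what should the generator return": an endless stream of (x, y) batches, one batch per yield. A self-contained sketch of such a generator (next-step prediction over a 1-D series is assumed for illustration; the truncated get_batch above presumably builds something similar per batch):

```python
import numpy as np

def batch_generator(data, batch_size, seq_length):
    # yields (x, y) batches forever, as fit_generator expects
    n_batches = (len(data) - seq_length) // batch_size
    while True:
        for b in range(n_batches):
            xs, ys = [], []
            for k in range(batch_size):
                i = b * batch_size + k
                xs.append(data[i:i + seq_length])  # input sequence
                ys.append(data[i + seq_length])    # next value as target
            yield np.asarray(xs), np.asarray(ys)

gen = batch_generator(np.arange(20), batch_size=4, seq_length=3)
x, y = next(gen)
```

Because fit_generator pulls steps_per_epoch batches per epoch, the infinite while True loop is the expected shape: the generator never raises StopIteration.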

How to use outputs from previous time steps as input along with other inputs in RNN using tensorflow?

我是研究僧i submitted on 2019-12-23 17:16:16
Question: In the following example, there are three time series and I want to predict another time series y that is a function of the three. How can I use four inputs to predict the time series, where the fourth input is the output at the previous time step? import tensorflow as tf import numpy as np import pandas as pd #clean computation graph tf.reset_default_graph() tf.set_random_seed(777) # reproducibility np.random.seed(0) import matplotlib.pyplot as plt def MinMaxScaler(data): numerator = data - np
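At inference time, the feedback loop the question asks about amounts to concatenating the previous prediction with the three exogenous inputs at each step. A framework-free numpy sketch, with a hypothetical step_fn standing in for the trained model's single-step forward pass:

```python
import numpy as np

def autoregressive_predict(x3, step_fn, y0=0.0):
    # x3: (timesteps, 3) exogenous inputs; the 4th input is y(t-1)
    y_prev, out = y0, []
    for t in range(len(x3)):
        inp = np.concatenate([x3[t], [y_prev]])  # 4 inputs total
        y_prev = step_fn(inp)                    # stand-in for the model
        out.append(y_prev)
    return np.array(out)

# demo: a trivial step_fn that just sums its four inputs
preds = autoregressive_predict(np.ones((3, 3)), step_fn=np.sum)
```

During training, the same fourth input slot is usually filled with the ground-truth previous y (teacher forcing) rather than the model's own prediction.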

How to generate/read sparse sequence labels for CTC loss within Tensorflow?

泪湿孤枕 submitted on 2019-12-23 03:20:52
Question: From a list of word images with their transcriptions, I am trying to create and read sparse sequence labels (for tf.nn.ctc_loss) using a tf.train.slice_input_producer, avoiding: serializing pre-packaged training data to disk in TFRecord format, the apparent limitations of tf.py_func, any unnecessary or premature padding, and reading the entire data set into RAM. The main issue seems to be converting a string to the sequence of labels (a SparseTensor) needed for tf.nn.ctc_loss. For example,
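A SparseTensor is just the triple (indices, values, dense_shape), so the string-to-label conversion can be done in plain numpy before handing the pieces to TensorFlow. A sketch (the two-character charset is a hypothetical example):

```python
import numpy as np

def to_sparse(labels, charset):
    # labels: list of transcriptions; returns the (indices, values,
    # dense_shape) triple that a tf.SparseTensor is built from
    lut = {c: i for i, c in enumerate(charset)}
    indices, values = [], []
    for row, word in enumerate(labels):
        for col, ch in enumerate(word):
            indices.append([row, col])
            values.append(lut[ch])
    dense_shape = [len(labels), max(len(w) for w in labels)]
    return np.array(indices), np.array(values), np.array(dense_shape)

indices, values, dense_shape = to_sparse(["ab", "b"], charset="ab")
```

Note there is no padding anywhere: absent positions (like row 1, col 1 here) simply have no entry, which is exactly what tf.nn.ctc_loss expects of its label input.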