I am using the following code for a standard GRU implementation:

    def BiRNN_deep_dynamic_FAST_FULL_autolength(x, batch_size, dropout, hidden_dim):
        seq_len=length_
CudnnGRU is not an RNNCell instance. It's more akin to dynamic_rnn.
The tensor manipulations below are equivalent, where input_tensor is a time-major tensor, i.e. of shape [max_sequence_length, batch_size, embedding_size]. CudnnGRU expects the input tensor to be time-major (as opposed to the more standard batch-major format, i.e. of shape [batch_size, max_sequence_length, embedding_size]), and it's good practice to use time-major tensors with RNN ops anyway, since they're somewhat faster.
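For concreteness, here is a minimal sketch of the batch-major/time-major conversion (the tensor names and the embedding size of 128 are illustrative, not taken from the question's code):

    import tensorflow as tf

    # Batch-major input: [batch_size, max_sequence_length, embedding_size]
    batch_major = tf.placeholder(tf.float32, [None, None, 128])

    # Transpose to time-major: [max_sequence_length, batch_size, embedding_size]
    time_major = tf.transpose(batch_major, [1, 0, 2])

    # ...and back again, e.g. after the RNN, if the rest of the graph is batch-major.
    back_to_batch_major = tf.transpose(time_major, [1, 0, 2])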
CudnnGRU:

    rnn = tf.contrib.cudnn_rnn.CudnnGRU(
        num_rnn_layers, hidden_size, direction='bidirectional')
    rnn_output = rnn(input_tensor)
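Continuing that snippet, here is how I unpack the return value. This assumes the TF 1.x contrib API, where the call yields an (outputs, states) pair; print it or check the docs for your version:

    # `rnn_output` above is a pair: per-timestep outputs plus the final state(s).
    outputs, states = rnn_output
    # outputs: [max_sequence_length, batch_size, num_dirs * hidden_size] (time-major)
    # states:  for a GRU, a tuple that should hold just the final hidden state,
    #          shaped [num_layers * num_dirs, batch_size, hidden_size]
    final_h = states[0]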
CudnnCompatibleGRUCell:

    rnn_output = input_tensor
    # Infer each sequence's length (assumes `inputs` are the zero-padded integer
    # token ids, time-major: [max_sequence_length, batch_size]; use axis=1 instead
    # if they are batch-major).
    sequence_length = tf.reduce_sum(tf.sign(inputs), axis=0)
    for layer in range(num_rnn_layers):
        with tf.variable_scope('birnn_layer_%d' % layer):
            fw_cell = tf.contrib.cudnn_rnn.CudnnCompatibleGRUCell(hidden_size)
            bw_cell = tf.contrib.cudnn_rnn.CudnnCompatibleGRUCell(hidden_size)
            (out_fw, out_bw), final_states = tf.nn.bidirectional_dynamic_rnn(
                fw_cell, bw_cell, rnn_output, sequence_length=sequence_length,
                dtype=tf.float32, time_major=True)  # Set `time_major` accordingly.
            # Concatenate the forward and backward outputs to feed the next layer.
            rnn_output = tf.concat([out_fw, out_bw], axis=-1)
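For context, this is the kind of input I have in mind above; the names (token_ids, embedding_matrix) and the sizes are illustrative placeholders, not from the question's code, and id 0 is assumed to be the padding token:

    import tensorflow as tf

    vocab_size, embedding_size, hidden_size, num_rnn_layers = 10000, 128, 256, 2

    # Zero-padded token ids, time-major: [max_sequence_length, batch_size]
    token_ids = tf.placeholder(tf.int32, [None, None])
    embedding_matrix = tf.get_variable('embedding', [vocab_size, embedding_size])

    # [max_sequence_length, batch_size, embedding_size]
    input_tensor = tf.nn.embedding_lookup(embedding_matrix, token_ids)
    # Non-padding positions have non-zero ids, so this recovers each sequence's length.
    sequence_length = tf.reduce_sum(tf.sign(token_ids), axis=0)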
Note the following:
- There is no need for CudnnCompatibleLSTMCell; you can use the standard LSTMCell. But with GRUs, the Cudnn implementation has inherently different math operations, and in particular, more weights (see the documentation).
- Unlike dynamic_rnn, CudnnGRU doesn't allow you to specify sequence lengths. Still, it is over an order of magnitude faster, but you will have to be careful about how you extract your outputs (e.g. if you're interested in the final hidden state of each sequence that is padded and of varying length, you will need each sequence's length; one way to do that is sketched after this list).
- The return values of CudnnGRU and bidirectional_dynamic_rnn are tuples with lots of (distinct) stuff. Refer to the documentation, or just print them out, to inspect which parts of the output you need.
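To illustrate the second point, here is a rough sketch of how I pull out the last valid (forward-direction) output per sequence from the CudnnGRU outputs. The gather logic is my own workaround, not something CudnnGRU provides, and it reuses the `outputs` and `sequence_length` tensors from the snippets above:

    # `outputs`: time-major outputs, [max_sequence_length, batch_size, num_dirs * hidden_size]
    # `sequence_length`: [batch_size] int tensor of true (unpadded) lengths
    batch_size = tf.shape(outputs)[1]

    # Index of the last valid timestep for every sequence (assumes every length >= 1).
    last_step = sequence_length - 1

    # Gather outputs[last_step[i], i, :] for each example i in the batch.
    indices = tf.stack([last_step, tf.range(batch_size)], axis=1)
    last_valid_output = tf.gather_nd(outputs, indices)  # [batch_size, num_dirs * hidden_size]

Note that for the backward direction of a bidirectional RNN, the natural "final" output sits at timestep 0 instead.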