Understanding a simple LSTM pytorch


import torch,ipdb
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable


                      


      
      
        
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  你的背包        
                
              
                            
                2021-01-30 04:55
              
            
            
                                                                       
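You can set

  batch_first=True

if you want the input to be provided as

  (batch_size, seq_len, input_size)

and the output to be returned as (batch_size, seq_len, hidden_size). The hidden and cell states are not affected by this flag; they stay (num_layers, batch_size, hidden_size).

I only learned about this today, so I'm sharing it with you.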
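
As a minimal sketch of what batch_first changes, the snippet below builds the same LSTM twice and prints the resulting shapes. The sizes (batch_size=4, seq_len=7, input_size=3, hidden_size=5) are made up for illustration and do not come from the question.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not from the original question).
batch_size, seq_len, input_size, hidden_size = 4, 7, 3, 5

# Default layout: (seq_len, batch_size, feature)
lstm_default = nn.LSTM(input_size, hidden_size)
out_default, _ = lstm_default(torch.randn(seq_len, batch_size, input_size))

# batch_first=True: (batch_size, seq_len, feature) for input and output
lstm_bf = nn.LSTM(input_size, hidden_size, batch_first=True)
out_bf, (h_n, c_n) = lstm_bf(torch.randn(batch_size, seq_len, input_size))

print(out_default.shape)  # torch.Size([7, 4, 5])
print(out_bf.shape)       # torch.Size([4, 7, 5])
print(h_n.shape)          # torch.Size([1, 4, 5]); the states are never batch-first
```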
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  长发绾君心        
                
              
                            
                2021-01-30 05:02
              
            
            
                                                                       
The answer by cdo256 is almost correct, but he is mistaken about what hidden_size means. He explains it as:

hidden_size - the number of LSTM blocks per layer.

A better explanation is:

Each sigmoid, tanh or hidden state layer in the cell is actually a set of nodes, whose number is equal to the hidden layer size. Therefore each of the “nodes” in the LSTM cell is actually a cluster of normal neural network nodes, as in each layer of a densely connected neural network.
Hence, if you set hidden_size = 10, each gate and each state vector inside your LSTM cell is a layer of 10 nodes.
When the network is unrolled over time, the number of LSTM cells per layer equals your sequence length, and all of those unrolled cells share the same weights.

This can be seen by analyzing the differences in examples between nn.LSTM and nn.LSTMCell:

https://pytorch.org/docs/stable/nn.html#torch.nn.LSTM

and

https://pytorch.org/docs/stable/nn.html#torch.nn.LSTMCell
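
To make the comparison concrete, here is a rough sketch (not taken from the docs examples) that copies the weights of a single-layer nn.LSTM into an nn.LSTMCell and steps the cell once per time step. The sizes are arbitrary assumptions. The weight shapes show that hidden_size sets the width of each of the four gate layers, and the loop shows that one cell, with one set of weights, is reused seq_len times.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative sizes only (assumptions).
seq_len, batch, input_size, hidden_size = 5, 3, 4, 10

lstm = nn.LSTM(input_size, hidden_size)      # processes the whole sequence
cell = nn.LSTMCell(input_size, hidden_size)  # processes one time step

# Give the cell the same parameters as the single-layer LSTM.
with torch.no_grad():
    cell.weight_ih.copy_(lstm.weight_ih_l0)
    cell.weight_hh.copy_(lstm.weight_hh_l0)
    cell.bias_ih.copy_(lstm.bias_ih_l0)
    cell.bias_hh.copy_(lstm.bias_hh_l0)

x = torch.randn(seq_len, batch, input_size)
out_lstm, _ = lstm(x)  # initial states default to zeros

# Unroll manually: the same cell is applied at every time step.
h = torch.zeros(batch, hidden_size)
c = torch.zeros(batch, hidden_size)
steps = []
for t in range(seq_len):
    h, c = cell(x[t], (h, c))
    steps.append(h)
out_cell = torch.stack(steps)

# Four gates, each a layer of hidden_size nodes, stacked in one matrix.
print(lstm.weight_ih_l0.shape)  # torch.Size([40, 4])
print(torch.allclose(out_lstm, out_cell, atol=1e-5))
```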
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  傲寒        
                
              
                            
                2021-01-30 05:05
              
            
            
                                                                       
The output of the LSTM is the hidden state h_t of every hidden node on the final layer, at each time step.

hidden_size - the number of LSTM blocks per layer.

input_size - the number of input features per time-step.

num_layers - the number of hidden layers.

In total there are hidden_size * num_layers LSTM blocks.

The input dimensions are (seq_len, batch, input_size), unless batch_first=True is set.

seq_len - the number of time steps in each input stream.

batch - the size of each batch of input sequences.

The hidden and cell dimensions are: (num_layers, batch, hidden_size)


  output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t.


So there will be hidden_size * num_directions outputs per time step. You didn't initialise the RNN to be bidirectional, so num_directions is 1 and output_size = hidden_size.
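
The shapes listed above can be checked with a short sketch. The concrete sizes (seq_len=6, batch=2, input_size=3, hidden_size=8, num_layers=2) are assumptions chosen for illustration; the names rnn, h0 and c0 follow the snippet used later in this answer.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions).
seq_len, batch, input_size, hidden_size, num_layers = 6, 2, 3, 8, 2

rnn = nn.LSTM(input_size, hidden_size, num_layers)  # unidirectional
x = torch.randn(seq_len, batch, input_size)
h0 = torch.zeros(num_layers, batch, hidden_size)
c0 = torch.zeros(num_layers, batch, hidden_size)

output, (hn, cn) = rnn(x, (h0, c0))

print(output.shape)  # torch.Size([6, 2, 8]): h_t of the last layer, for each t
print(hn.shape)      # torch.Size([2, 2, 8]): final h for every layer
print(cn.shape)      # torch.Size([2, 2, 8]): final c for every layer

# The last time step of output is the final hidden state of the top layer.
print(torch.allclose(output[-1], hn[-1]))  # True
```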

Edit: You can change the number of outputs by using a linear layer:

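# rnn, input, h0 and c0 are assumed to be defined as in the question
out_rnn, (hn, cn) = rnn(input, (h0, c0))
lin = nn.Linear(hidden_size, output_size)
# nn.Linear acts on the last dimension, so no reshaping is needed:
# (seq_len, batch, hidden_size) -> (seq_len, batch, output_size)
output = lin(out_rnn)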
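
For reference, a self-contained version of the linear-layer idea might look like the sketch below. The sizes are assumed for illustration. It applies nn.Linear directly to the 3-D LSTM output and also does the explicit flatten-and-restore with Tensor.reshape and Tensor.view to show that both routes agree.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative sizes only (assumptions).
seq_len, batch, input_size, hidden_size, output_size = 6, 2, 3, 8, 4

rnn = nn.LSTM(input_size, hidden_size)
lin = nn.Linear(hidden_size, output_size)

x = torch.randn(seq_len, batch, input_size)
out_rnn, (hn, cn) = rnn(x)  # initial states default to zeros

# Direct application: nn.Linear maps the last dimension only.
output = lin(out_rnn)

# Equivalent explicit reshape: flatten time and batch, project, restore.
flat = out_rnn.reshape(seq_len * batch, hidden_size)
output_reshaped = lin(flat).view(seq_len, batch, output_size)

print(output.shape)  # torch.Size([6, 2, 4])
print(torch.allclose(output, output_reshaped, atol=1e-5))
```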


Note: for this answer I assumed that we're only talking about non-bidirectional LSTMs.

Source: PyTorch docs.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         