Building a recurrent neural network with a feed-forward network in PyTorch


Question


I was going through this tutorial. I have a question about the following class code:

import torch
import torch.nn as nn
from torch.autograd import Variable


class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # both layers see the concatenated (input, hidden) vector
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)
        output = self.i2o(combined)
        output = self.softmax(output)
        return output, hidden

    def init_hidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

This code was taken from here, where it was mentioned that:

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.

What I don't understand is: how can one just increase the input feature size of an nn.Linear and call it an RNN? What am I missing here?


Answer 1:


The network is recurrent because you evaluate multiple timesteps in the example. The following code is also taken from the PyTorch tutorial you linked to.

import torch
import torch.nn as nn

# rnn is the RNN module instance defined earlier in the tutorial;
# its forward pass returns (hidden, output)
loss_fn = nn.MSELoss()

batch_size = 10
TIMESTEPS = 5

# Create some fake data
batch = torch.randn(batch_size, 50)
hidden = torch.zeros(batch_size, 20)
target = torch.zeros(batch_size, 10)

loss = 0
for t in range(TIMESTEPS):
    # yes! you can reuse the same network several times,
    # sum up the losses, and call backward!
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()

So the network itself is not recurrent, but in this loop you use it as a recurrent network by repeatedly feeding in the hidden state of the previous forward step together with your batch input.

You could also use it non-recurrently by backpropagating the loss at every step and ignoring the hidden state.
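
For comparison, here is a minimal sketch of that non-recurrent usage (assuming the same rnn, batch, target, loss_fn, batch_size and TIMESTEPS as in the snippet above; the optimizer is added here only for illustration). The hidden state is re-created from scratch at every step, so no gradient flows across timesteps and backward() is called per step:

import torch

optimizer = torch.optim.SGD(rnn.parameters(), lr=0.01)

for t in range(TIMESTEPS):
    # fresh hidden state every step: nothing is carried over, so no recurrence
    hidden = torch.zeros(batch_size, 20)
    hidden, output = rnn(batch, hidden)
    loss = loss_fn(output, target)

    optimizer.zero_grad()
    loss.backward()   # backpropagate through this single step only
    optimizer.step()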

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.

This means that the information needed to compute the gradient is not held in the model itself, so you can append multiple evaluations of the module to the graph and then backpropagate through the full graph. This is described in the preceding paragraphs of the tutorial.
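
As a toy illustration of that point (a minimal sketch, not from the tutorial): the same nn.Linear can appear several times in one graph, and a single backward() call then accumulates gradient contributions from every place the layer was used.

import torch
import torch.nn as nn

layer = nn.Linear(4, 4)     # one layer; it keeps no memory of past calls
x = torch.randn(1, 4)

h1 = layer(x)               # first evaluation is recorded in the graph
h2 = layer(h1)              # second evaluation reuses the exact same weights
loss = h2.sum()

loss.backward()             # one backward pass through the full graph
# layer.weight.grad now holds contributions from both evaluations
print(layer.weight.grad.shape)   # torch.Size([4, 4])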



Source: https://stackoverflow.com/questions/51152658/building-recurrent-neural-network-with-feed-forward-network-in-pytorch
