Question
This is the model I defined; it is a simple LSTM with 2 fully connected layers.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class mylstm(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim, linear_dim):
        super(mylstm, self).__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTMCell(input_dim, self.hidden_dim)
        self.linear1 = nn.Linear(hidden_dim, linear_dim)
        self.linear2 = nn.Linear(linear_dim, output_dim)

    def forward(self, input):
        out, _ = self.lstm(input)
        out = nn.Dropout(p=0.3)(out)   # dropout created inside forward
        out = self.linear1(out)
        out = nn.Dropout(p=0.3)(out)
        out = self.linear2(out)
        return out
x_train and x_val are float DataFrames with shape (4478, 30), while y_train and y_val are float DataFrames with shape (4478, 10).
x_train.head()
Out[271]:
0 1 2 3 ... 26 27 28 29
0 1.6110 1.6100 1.6293 1.6370 ... 1.6870 1.6925 1.6950 1.6905
1 1.6100 1.6293 1.6370 1.6530 ... 1.6925 1.6950 1.6905 1.6960
2 1.6293 1.6370 1.6530 1.6537 ... 1.6950 1.6905 1.6960 1.6930
3 1.6370 1.6530 1.6537 1.6620 ... 1.6905 1.6960 1.6930 1.6955
4 1.6530 1.6537 1.6620 1.6568 ... 1.6960 1.6930 1.6955 1.7040
[5 rows x 30 columns]
x_train.shape
Out[272]: (4478, 30)
I define the variables and do one backpropagation step; the validation loss comes out as 1.4941:
import numpy as np
from torch import optim

model = mylstm(30, 10, 200, 100).double()
optimizer = optim.RMSprop(model.parameters(), lr=0.001, alpha=0.9)
criterion = nn.L1Loss()

input_ = torch.autograd.Variable(torch.from_numpy(np.array(x_train)))
target = torch.autograd.Variable(torch.from_numpy(np.array(y_train)))
input2_ = torch.autograd.Variable(torch.from_numpy(np.array(x_val)))
target2 = torch.autograd.Variable(torch.from_numpy(np.array(y_val)))

optimizer.zero_grad()
output = model(input_)
loss = criterion(output, target)
loss.backward()
optimizer.step()

moniter = criterion(model(input2_), target2)
moniter
Out[274]: tensor(1.4941, dtype=torch.float64, grad_fn=<L1LossBackward>)
But when I call the forward function again, I get a different number due to the randomness of dropout:
moniter=criterion(model(input2_),target2)
moniter
Out[275]: tensor(1.4943, dtype=torch.float64, grad_fn=<L1LossBackward>)
What should I do to eliminate all the dropout in the prediction phase? I tried eval():
moniter=criterion(model.eval()(input2_),target2)
moniter
Out[282]: tensor(1.4942, dtype=torch.float64, grad_fn=<L1LossBackward>)
moniter=criterion(model.eval()(input2_),target2)
moniter
Out[283]: tensor(1.4945, dtype=torch.float64, grad_fn=<L1LossBackward>)
I also tried passing an additional parameter p to control dropout:
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class mylstm(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim, linear_dim, p):
        super(mylstm, self).__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTMCell(input_dim, self.hidden_dim)
        self.linear1 = nn.Linear(hidden_dim, linear_dim)
        self.linear2 = nn.Linear(linear_dim, output_dim)

    def forward(self, input, p):
        out, _ = self.lstm(input)
        out = nn.Dropout(p=p)(out)
        out = self.linear1(out)
        out = nn.Dropout(p=p)(out)
        out = self.linear2(out)
        return out

model = mylstm(30, 10, 200, 100, 0.3).double()

output = model(input_)
loss = criterion(output, target)
loss.backward()
optimizer.step()
moniter = criterion(model(input2_, 0), target2)
Traceback (most recent call last):
  File "<ipython-input-286-e49b6fac918b>", line 1, in <module>
    output=model(input_)
  File "D:\Users\shan xu\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'p'
But neither of them worked.
Answer 1:
You have to define your nn.Dropout layer in your __init__ and assign it to your model so that it responds to calling eval(). So changing your model like this should work for you:
class mylstm(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dim, linear_dim, p):
        super(mylstm, self).__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTMCell(input_dim, self.hidden_dim)
        self.linear1 = nn.Linear(hidden_dim, linear_dim)
        self.linear2 = nn.Linear(linear_dim, output_dim)
        # define dropout layer in __init__ so it is registered as a sub-module
        self.drop_layer = nn.Dropout(p=p)

    def forward(self, input):
        out, _ = self.lstm(input)
        # apply model dropout, responsive to eval()
        out = self.drop_layer(out)
        out = self.linear1(out)
        # apply model dropout, responsive to eval()
        out = self.drop_layer(out)
        out = self.linear2(out)
        return out
If you change it like this, dropout will be inactive as soon as you call eval().
NOTE: If you want to continue training afterwards, you need to call train() on your model to leave evaluation mode.
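As a minimal sketch of that toggle (re-using the criterion, input2_, and target2 tensors from the question; a freshly initialized model will of course give different loss values than the ones shown above):

# Rebuild the model with the dropout layer registered in __init__
model = mylstm(30, 10, 200, 100, 0.3).double()

model.eval()                          # dropout becomes a no-op
with torch.no_grad():                 # no gradients needed for monitoring
    m1 = criterion(model(input2_), target2)
    m2 = criterion(model(input2_), target2)
print(m1.item(), m2.item())           # identical values in eval mode

model.train()                         # re-enables dropout for further training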
You can also find a small working example for dropout with eval()
for evaluation mode here:
nn.Dropout vs. F.dropout pyTorch
Answer 2:
I am adding this answer just because I am now facing the same issue while trying to reproduce deep Bayesian active learning through dropout disagreement. If you need to keep dropout active (for example, to bootstrap a set of different predictions for the same test instances), you just need to leave the model in training mode; there is no need to define your own dropout layer.
Since in PyTorch you need to define your own prediction function, you can just add a parameter to it like this:

def predict_class(model, test_instance, active_dropout=False):
    if active_dropout:
        model.train()
    else:
        model.eval()
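For illustration only, here is a sketch of the bootstrapping idea described above (the function name mc_dropout_predictions and the n_samples parameter are my own, not part of the original code): the model stays in training mode so dropout remains active, and several stochastic forward passes are collected for the same input.

def mc_dropout_predictions(model, test_instance, n_samples=10):
    model.train()                      # keep dropout active at prediction time
    preds = []
    with torch.no_grad():              # no gradient tracking needed
        for _ in range(n_samples):
            preds.append(model(test_instance))
    return torch.stack(preds)          # shape: (n_samples, *output_shape)

The disagreement between the stacked predictions can then be measured, e.g. via their standard deviation along dim 0.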
Answer 3:
As the other answers said, the dropout layer should be defined in your model's __init__ method so that the model keeps track of all the information of each pre-defined layer. When the model's state is changed, all layers are notified and do the relevant work. For instance, when you call model.eval(), the model deactivates the dropout layers and simply passes all activations through. In general, if you want to deactivate your dropout layers, you had better define the dropout layers in the __init__ method using the nn.Dropout module.
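A small sketch (my own illustration, assuming the fixed mylstm class from answer 1) of how eval() and train() propagate the mode flag to every registered sub-layer:

model = mylstm(30, 10, 200, 100, 0.3).double()
print(model.training, model.drop_layer.training)   # True True   (default: training mode)
model.eval()
print(model.training, model.drop_layer.training)   # False False (dropout is a no-op)
model.train()
print(model.training, model.drop_layer.training)   # True True   (dropout active again)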
Source: https://stackoverflow.com/questions/53879727/pytorch-how-to-deactivate-dropout-in-evaluation-mode