In order to access a model's parameters in PyTorch, I saw two methods:
using state_dict and using parameters()
I wonder what the difference is, or whether one of them is preferable over the other.
Besides the differences in @kHarshit's answer, the requires_grad attribute of the trainable tensors in net.parameters() is True, while it is False for the tensors in net.state_dict(), because state_dict() returns detached copies by default.
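A minimal sketch of that difference, using a single nn.Linear layer purely as an illustration:

import torch.nn as nn

net = nn.Linear(3, 2)  # any module would do; a single linear layer as an example

# parameters() yields the trainable tensors, which carry requires_grad=True
print(all(p.requires_grad for p in net.parameters()))           # True

# state_dict() returns detached copies, so requires_grad is False on them
print(any(t.requires_grad for t in net.state_dict().values()))  # False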
The parameters() method only gives the module parameters, i.e. weights and biases. It returns an iterator over module parameters.
You can check the list of the parameters as follows:
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name)
On the other hand, state_dict returns a dictionary containing the whole state of the module. Its source code shows that it collects not just the parameters but also the buffers, etc.
Both parameters and persistent buffers (e.g. running averages) are included. Keys are the corresponding parameter and buffer names.
Check all the keys that state_dict contains using:
model.state_dict().keys()
For example, in state_dict you'll find entries like bn1.running_mean and bn1.running_var, which are not present in .parameters().
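A quick way to see this, assuming torchvision is installed and using resnet18 only as an example model that contains BatchNorm layers:

import torchvision.models as models

model = models.resnet18()

param_keys = {name for name, _ in model.named_parameters()}
state_keys = set(model.state_dict().keys())

# Keys that exist only in state_dict are the buffers, e.g.
# bn1.running_mean, bn1.running_var, bn1.num_batches_tracked, ...
print(sorted(state_keys - param_keys)[:5])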
If you only want to access the parameters, you can simply use .parameters(). For purposes like saving and loading a model, as in transfer learning, you'll need to save the state_dict, not just the parameters.
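For example, a typical save/load round trip with state_dict might look like the sketch below (checkpoint.pth is an arbitrary file name, and resnet18 is again just a stand-in model):

import torch
import torchvision.models as models

model = models.resnet18()

# Save: the state_dict holds both parameters and buffers
torch.save(model.state_dict(), 'checkpoint.pth')

# Later: rebuild the same architecture, then restore its state
model2 = models.resnet18()
model2.load_state_dict(torch.load('checkpoint.pth'))
model2.eval()  # eval mode so buffers like running stats are used, not updated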