Why does autograd not produce gradient for intermediate variables?

Submitted by 末鹿安然 on 2019-11-29 18:54:18

Question


I'm trying to wrap my head around how gradients are represented and how autograd works:

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

z.backward()

print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]

print(y.grad)
#None

Why does it not produce a gradient for y? If y.grad = dz/dy, then shouldn't it at least produce a variable like y.grad = 2*y?


Answer 1:


By default, gradients are only retained for leaf variables. Non-leaf variables' gradients are not retained to be inspected later. This was done by design, to save memory.

- Soumith Chintala

See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
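
For context, x is a leaf variable (created directly by the user), while y and z are intermediate results of operations, which you can check with .is_leaf. By the chain rule, dz/dy = 2y = 8 and dz/dx = 2y * 2x = 4x^3 = 32, so only the leaf gradient 32 is stored by default. Here is a minimal sketch of that check using the current tensor API (Variable was merged into Tensor in PyTorch 0.4, so this is the modern spelling of the question's code):

import torch

# x is a leaf tensor; y and z are non-leaf intermediates
x = torch.tensor([2.0], requires_grad=True)
y = x * x          # dy/dx = 2x
z = y * y          # dz/dy = 2y

print(x.is_leaf, y.is_leaf, z.is_leaf)   # True False False

z.backward()

print(x.grad)   # tensor([32.])  -> dz/dx = 4 * x**3
print(y.grad)   # None (recent PyTorch also warns that y is not a leaf)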

Option 1:

Call y.retain_grad()

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

y.retain_grad()

z.backward()

print(y.grad)
#Variable containing:
# 8
#[torch.FloatTensor of size 1]

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
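
The same fix works with the plain tensor API in current PyTorch (a minimal sketch; Variable is no longer needed since 0.4):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

y.retain_grad()   # ask autograd to keep the gradient of this non-leaf tensor

z.backward()

print(y.grad)   # tensor([8.])  -> dz/dy = 2 * y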

Option 2:

Register a hook, which is essentially a function that gets called when that gradient is computed. Inside it you can save the gradient, assign it to a variable, print it, and so on.

from __future__ import print_function
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

y.register_hook(print)  # the hook can be any callable that takes the gradient

z.backward()

Output:

Variable containing:
 8
[torch.FloatTensor of size 1]

Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2

Also see: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
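
If you want to keep the gradient around rather than just print it, the hook can stash it somewhere, for example in a dict (a sketch under that assumption; the grads dict and save_grad helper are just illustrative names):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

grads = {}

def save_grad(grad):
    # called with dz/dy during backward(); returning nothing leaves the gradient unchanged
    grads['y'] = grad

y.register_hook(save_grad)

z.backward()

print(grads['y'])   # tensor([8.])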



Source: https://stackoverflow.com/questions/45988168/why-does-autograd-not-produce-gradient-for-intermediate-variables
