A tensor's backward() method computes gradients. For example, differentiating y = x^2 at x = 3 gives dy/dx = 2x = 6.0:
import torch
x = torch.tensor(3., requires_grad=True)
y = x*x
y.backward()
print(x.grad) # dy/dx = 2x=6.0
If x is not a scalar (i.e., it holds multiple values), calling backward() raises an error: RuntimeError: grad can be implicitly created only for scalar outputs
import torch
x = torch.tensor([3., 2., 4.], requires_grad=True)
y = x*x
y.backward()
print(x.grad) # RuntimeError: grad can be implicitly created only for scalar outputs
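Any reduction to a scalar avoids this error, not just mean(). As a small sketch (not from the original post), summing y also lets backward() run, and without the 1/n scaling that mean() introduces:

```python
import torch

x = torch.tensor([3., 2., 4.], requires_grad=True)
y = (x * x).sum()   # reduce to a scalar so backward() can be called
y.backward()
print(x.grad)       # dy/dx = 2x -> tensor([6., 4., 8.])
```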
One workaround is to reduce y to a scalar by taking its mean, which is equivalent to dividing each component's gradient by n (here n = 3):
import torch
x = torch.tensor([3., 2., 4.], requires_grad=True)
y = x*x
y = y.mean()
y.backward()
print(x.grad) # dy/dx = 2x/3 = [6./3, 4./3, 8./3]
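Taking the mean first is the same as calling backward() with a uniform weight of 1/n on the unreduced y; a quick numerical check (the variable names here are illustrative):

```python
import torch

n = 3
x1 = torch.tensor([3., 2., 4.], requires_grad=True)
(x1 * x1).mean().backward()                     # gradient is 2x/n

x2 = torch.tensor([3., 2., 4.], requires_grad=True)
(x2 * x2).backward(torch.full((n,), 1.0 / n))   # weight 1/n per component

print(torch.allclose(x1.grad, x2.grad))         # True
```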
The second approach is to pass a gradient argument to backward(): each component's derivative is multiplied by the corresponding weight (a vector-Jacobian product). The retain_graph parameter controls whether the computation graph is kept after the backward pass. If retain_graph is not set to True and y is not recomputed, y.backward() cannot be called a second time, because the graph is freed after the first call.
import torch
x = torch.tensor([3., 2., 4.], requires_grad=True)
y = x*x
weight = torch.ones(3)
y.backward(weight, retain_graph=True)
print(x.grad) # dy/dx = 2x*weight= [6.0*1, 4*1, 8*1]
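With retain_graph=True the graph survives the first pass, so backward() can be called again on the same y. Note that repeated calls accumulate into x.grad until it is cleared; a minimal sketch:

```python
import torch

x = torch.tensor([3., 2., 4.], requires_grad=True)
y = x * x
weight = torch.ones(3)

y.backward(weight, retain_graph=True)
print(x.grad)        # tensor([6., 4., 8.])

y.backward(weight)   # second call works because the graph was retained
print(x.grad)        # gradients accumulate: tensor([12., 8., 16.])

x.grad.zero_()       # reset the accumulated gradients before the next pass
```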
Source: oschina
Link: https://my.oschina.net/u/4228078/blog/4319267