Autograd.grad() for Tensor in pytorch

前端未结


I want to compute the gradient between two tensors in a net. The input X tensor (batch size x m) is sent through a set of convolutional layers which give me back an output Y te


1 answer

时光取名叫无心

2021-02-06 02:50

Let's start with a simple working example that uses a plain loss function and a regular backward pass. We will build a short computational graph and compute some gradients on it.

Code:

import torch
from torch.autograd import grad
import torch.nn as nn


# Create some dummy data.
x = torch.ones(2, 2, requires_grad=True)
gt = torch.ones_like(x) * 16 - 0.5  # "ground-truths" 

# We will use MSELoss as an example.
loss_fn = nn.MSELoss()

# Do some computations.
v = x + 2
y = v ** 2

# Compute loss.
loss = loss_fn(y, gt)

print(f'Loss: {loss}')

# Now compute gradients:
d_loss_dx = grad(outputs=loss, inputs=x)
print(f'dloss/dx:\n {d_loss_dx}')

Output:

Loss: 42.25
dloss/dx:
(tensor([[-19.5000, -19.5000], [-19.5000, -19.5000]]),)
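As a sanity check, the value -19.5 can be reproduced by hand. The sketch below assumes the default reduction of MSELoss (the mean over all N = 4 elements) and applies the chain rule directly; the names auto_grad and manual_grad are only illustrative.

```python
import torch
from torch.autograd import grad
import torch.nn as nn

# Same setup as the example above.
x = torch.ones(2, 2, requires_grad=True)
gt = torch.ones_like(x) * 16 - 0.5

y = (x + 2) ** 2
loss = nn.MSELoss()(y, gt)

# Gradient computed by autograd.
(auto_grad,) = grad(outputs=loss, inputs=x)

# Hand-derived gradient: MSELoss averages over N elements, so
# dloss/dx = (2 / N) * (y - gt) * dy/dx, with dy/dx = 2 * (x + 2).
with torch.no_grad():
    manual_grad = (2.0 / x.numel()) * (y - gt) * 2.0 * (x + 2)

print(torch.allclose(auto_grad, manual_grad))  # True
```

With x = 1 this gives (2 / 4) * (9 - 15.5) * 6 = -19.5 for every element, which matches the output above.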

OK, this works! Now let's try to reproduce the error "grad can be implicitly created only for scalar outputs". Notice that the loss in the previous example is a scalar. By default, backward() and grad() handle a single scalar output: for a scalar, the upstream gradient is implicitly 1, so loss.backward() is equivalent to loss.backward(torch.tensor(1.)). If you pass a tensor with more than one element as the output without supplying that gradient yourself, you get an error.

Code:

v = x + 2
y = v ** 2

try:
    dy_hat_dx = grad(outputs=y, inputs=x)
except RuntimeError as err:
    print(err)

Output:

grad can be implicitly created only for scalar outputs

Therefore, when calling grad() on a non-scalar output, you need to specify the grad_outputs parameter as follows:

Code:

v = x + 2
y = v ** 2

dy_dx = grad(outputs=y, inputs=x, grad_outputs=torch.ones_like(y))
print(f'dy/dx:\n {dy_dx}')

dv_dx = grad(outputs=v, inputs=x, grad_outputs=torch.ones_like(v))
print(f'dv/dx:\n {dv_dx}')

Output:

dy/dx:
(tensor([[6., 6.], [6., 6.]]),)

dv/dx:
(tensor([[1., 1.], [1., 1.]]),)
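It may help to see what grad_outputs actually does: it is the vector in a vector-Jacobian product, i.e. a weight applied to each element of the output before the gradients are summed. A minimal sketch, assuming the same x and y as above (the weight tensor w is made up for illustration):

```python
import torch
from torch.autograd import grad

x = torch.ones(2, 2, requires_grad=True)
y = (x + 2) ** 2

# Passing ones as grad_outputs weights every output element equally...
(g_ones,) = grad(outputs=y, inputs=x,
                 grad_outputs=torch.ones_like(y), retain_graph=True)

# ...which is the same as differentiating the scalar y.sum().
(g_sum,) = grad(outputs=y.sum(), inputs=x, retain_graph=True)
print(torch.equal(g_ones, g_sum))  # True

# Non-uniform weights scale each output element's contribution.
w = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # hypothetical weights
(g_w,) = grad(outputs=y, inputs=x, grad_outputs=w)
print(g_w)  # each entry is w * 6
```

Because y here is computed element-wise, each y[i, j] depends only on x[i, j], so using ones gives exactly the per-element derivative dy/dx. For outputs that mix several inputs, the result is the sum of the weighted contributions rather than a full Jacobian. retain_graph=True is used only so that the same graph can be differentiated more than once.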

NOTE: If you are using backward() instead, simply do y.backward(torch.ones_like(y)); the gradient is then stored in x.grad rather than returned.
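A short sketch of the backward() variant, using the same x and y as before. One thing to keep in mind is that backward() accumulates into x.grad, so the example resets it before the second pass:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x + 2) ** 2

# backward() returns nothing; the result is written to x.grad.
y.backward(torch.ones_like(y))
print(x.grad)

# Gradients accumulate across backward() calls, so reset x.grad
# before reusing x in another pass.
x.grad = None
y2 = (x + 2) ** 2
y2.backward(torch.ones_like(y2))
print(x.grad)
```

Both prints show a 2 x 2 tensor of 6s. Without the reset, the second print would show 12s instead.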
