PyTorch Autograd automatic differentiation feature

盖世英雄少女心 2020-12-20 22:49

I am just curious to know how PyTorch tracks operations on tensors (after .requires_grad is set to True) and how it later calculates the gradients automatically.

1 Answer
  • 2020-12-20 23:44

    That's a great question! Generally, the idea of automatic differentiation (AutoDiff) is based on the multivariable chain rule, i.e. $\frac{\partial x}{\partial z} = \frac{\partial x}{\partial y} \cdot \frac{\partial y}{\partial z}$.
    What this means is that you can express the derivative of x with respect to z via a "proxy" variable y; in fact, this allows you to break up almost any operation into a bunch of simpler (or atomic) operations that can then be "chained" together.
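    For instance, here is a minimal PyTorch sketch of that idea (the functions and the names z, y, x are purely illustrative): autograd records each intermediate step while the forward pass runs, and multiplies the local derivatives together when .backward() is called.

    ```python
    import torch

    # z is the leaf tensor we want gradients with respect to
    z = torch.tensor(3.0, requires_grad=True)

    # Break the computation into "proxy" steps: y depends on z, x depends on y
    y = z ** 2          # dy/dz = 2z
    x = torch.sin(y)    # dx/dy = cos(y)

    # Autograd applies the chain rule: dx/dz = dx/dy * dy/dz = cos(z**2) * 2z
    x.backward()
    print(z.grad)                                        # gradient computed by autograd
    print(torch.cos(z.detach() ** 2) * 2 * z.detach())   # manual chain rule, same value
    ```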
    Now, what AutoDiff packages like Autograd do is simply store the derivative of each such atomic operation block, e.g., a division, a multiplication, etc. Then, at runtime, the forward-pass formula you provide (consisting of many of these blocks) can easily be turned into an exact derivative. Likewise, you can also provide derivatives for your own operations, should you find that AutoDiff does not do exactly what you want; see the sketch below.
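    As a hedged sketch of that last point (supplying your own derivative), PyTorch exposes torch.autograd.Function with static forward and backward methods; the toy op below just reimplements exp to show the mechanics, and is not how PyTorch's built-in exp is defined.

    ```python
    import torch

    class MyExp(torch.autograd.Function):
        """Toy custom op: forward computes exp(x); backward supplies d/dx exp(x) = exp(x)."""

        @staticmethod
        def forward(ctx, x):
            result = torch.exp(x)
            ctx.save_for_backward(result)   # stash what backward will need
            return result

        @staticmethod
        def backward(ctx, grad_output):
            (result,) = ctx.saved_tensors
            return grad_output * result     # chain rule: upstream grad * local derivative

    x = torch.tensor(1.5, requires_grad=True)
    y = MyExp.apply(x)
    y.backward()
    print(x.grad, torch.exp(torch.tensor(1.5)))  # both equal exp(1.5)
    ```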

    The advantage of AutoDiff over derivative approximations like finite differences is simply that this is an exact solution.
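    To make that contrast concrete, here is a small sketch (with an arbitrary example function f) comparing the exact gradient from autograd with a central finite-difference approximation, which depends on the step size and carries truncation and round-off error:

    ```python
    import torch

    def f(x):
        return torch.sin(x) * x ** 2

    x = torch.tensor(2.0, requires_grad=True)
    f(x).backward()
    exact = x.grad.item()                    # exact derivative via autograd

    eps = 1e-4                               # finite differences: approximate, step-size dependent
    with torch.no_grad():
        approx = ((f(x + eps) - f(x - eps)) / (2 * eps)).item()

    print(exact, approx)                     # close, but the finite-difference value is only an approximation
    ```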

    If you are further interested in how it works internally, I highly recommend the AutoDidact project, which aims to simplify the internals of an automatic differentiator, since there is usually also a lot of code optimization involved. Also, this set of slides from a lecture I took was really helpful in understanding.
