CS231n: How to calculate gradient for Softmax loss function?

Backend · Open · 2 answers · 866 views
野趣味 2021-01-31 03:43

I am watching some videos for Stanford CS231n: Convolutional Neural Networks for Visual Recognition, but do not quite understand how to calculate the analytical gradient of the softmax loss function.

2 Answers
  • 2021-01-31 04:10

    Not sure if this helps, but:

    The term involving y_i is really the indicator function 1{j = y_i}, as described here. This is what forms the expression (j == y[i]) in the code.

    Also, the gradient of the loss for example i with respect to the weights w_j is:

    ∂L_i/∂w_j = (p_j − 1{j = y_i}) · x_i

    where

    p_j = e^{f_j} / Σ_k e^{f_k}

    The x_i factor is the origin of the X[:,i] in the code.
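    The per-example gradient above can be sketched as a naive double loop in NumPy. This is a sketch, not the official assignment solution; the shapes are assumptions chosen to match the X[:, i] indexing: W is (C, D), X stores one example per column (D, N), and y holds integer class labels.

    ```python
    import numpy as np

    def softmax_loss_naive(W, X, y):
        """Average softmax loss and its gradient, with explicit loops.

        Assumed (hypothetical) shapes:
          W: (C, D) weights, X: (D, N) examples as columns, y: (N,) labels.
        """
        C = W.shape[0]
        N = X.shape[1]
        loss = 0.0
        dW = np.zeros_like(W)

        for i in range(N):
            scores = W.dot(X[:, i])                     # (C,) class scores f
            scores -= scores.max()                      # shift for numeric stability
            p = np.exp(scores) / np.sum(np.exp(scores)) # softmax probabilities
            loss += -np.log(p[y[i]])                    # cross-entropy for example i
            for j in range(C):
                # gradient row j: (p_j - 1{j == y_i}) * x_i
                dW[j, :] += (p[j] - (j == y[i])) * X[:, i]

        return loss / N, dW / N
    ```

    With W = 0 every class gets probability 1/C, so the loss comes out as log(C), a handy sanity check.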

  • 2021-01-31 04:13

    I know this is late, but here's my answer:

    I'm assuming you are familiar with the cs231n softmax loss function. We know that:

    L_i = −log( e^{f_{y_i}} / Σ_j e^{f_j} ) = −f_{y_i} + log Σ_j e^{f_j}

    So, just as we did with the SVM loss function, the gradients are as follows:

    ∂L_i/∂w_k = (p_k − 1{k = y_i}) · x_i,  where  p_k = e^{f_k} / Σ_j e^{f_j}

    Hope that helped.
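    A quick way to check gradients like these is to compare the analytic expression against centered finite differences. Below is a vectorized sketch under the same assumed shapes as above (W is (C, D), X is (D, N) with examples as columns, y is (N,)); it is an illustration, not code from the course.

    ```python
    import numpy as np

    def softmax_loss_vectorized(W, X, y):
        """Vectorized softmax loss and gradient (assumed shapes:
        W (C, D), X (D, N) examples as columns, y (N,) labels)."""
        N = X.shape[1]
        scores = W.dot(X)                     # (C, N)
        scores -= scores.max(axis=0)          # stabilize each column
        P = np.exp(scores)
        P /= P.sum(axis=0)                    # softmax probabilities, (C, N)
        loss = -np.log(P[y, np.arange(N)]).mean()
        P[y, np.arange(N)] -= 1               # p_k - 1{k == y_i}
        dW = P.dot(X.T) / N                   # (C, D)
        return loss, dW

    def numeric_grad(f, W, h=1e-5):
        """Centered finite-difference gradient of scalar f at W."""
        grad = np.zeros_like(W)
        it = np.nditer(W, flags=['multi_index'])
        while not it.finished:
            ix = it.multi_index
            old = W[ix]
            W[ix] = old + h; fp = f(W)
            W[ix] = old - h; fm = f(W)
            W[ix] = old                       # restore entry
            grad[ix] = (fp - fm) / (2 * h)
            it.iternext()
        return grad
    ```

    If the analytic gradient matches the numeric one to within roughly 1e-7, the derivation above is almost certainly correct.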
