I am watching some of the videos for Stanford CS231n: Convolutional Neural Networks for Visual Recognition, but I do not quite understand how to calculate the analytical gradient for the softmax loss function.
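For reference, the answers below point at expressions like `(j == y[i])` and `X[:,i]` "in the code". Since the original code is not shown here, the following is a minimal sketch of the kind of cs231n-style naive implementation those expressions would come from; the function name `softmax_loss_naive` and the (D, N) column layout of `X` are assumptions based on the assignment conventions, not the asker's exact code:

```python
import numpy as np

def softmax_loss_naive(W, X, y, reg):
    """Softmax loss and analytic gradient, computed with explicit loops.

    Assumed shapes (early cs231n convention):
      W: (C, D) weights, X: (D, N) data as columns, y: (N,) integer labels.
    """
    num_classes = W.shape[0]
    num_train = X.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)

    for i in range(num_train):
        scores = W.dot(X[:, i])                       # f = W x_i, shape (C,)
        scores -= np.max(scores)                      # shift for numerical stability
        p = np.exp(scores) / np.sum(np.exp(scores))   # softmax probabilities
        loss += -np.log(p[y[i]])
        for j in range(num_classes):
            # (j == y[i]) is the indicator 1{j = y_i} discussed in the answers
            dW[j, :] += (p[j] - (j == y[i])) * X[:, i]

    # average over the batch and add L2 regularization
    loss = loss / num_train + 0.5 * reg * np.sum(W * W)
    dW = dW / num_train + reg * W
    return loss, dW
```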
Not sure if this helps, but: in the gradient

$$\frac{\partial L_i}{\partial f_j} = p_j - \mathbb{1}\{j = y_i\},$$

where $p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}$ is the softmax probability of class $j$, the term $\mathbb{1}\{j = y_i\}$ is really the indicator function, as described here: it equals 1 when $j = y_i$ and 0 otherwise. This forms the expression `(j == y[i])` in the code.
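Concretely, the Python boolean coerces to 1/0 in arithmetic, so the indicator needs no special handling. A tiny self-contained sketch with made-up probabilities:

```python
import numpy as np

p = np.array([0.2, 0.5, 0.3])    # softmax probabilities for one example (assumed values)
y_i = 1                          # correct class for this example
for j in range(3):
    # the boolean (j == y_i) acts as the indicator 1{j = y_i}
    dscore_j = p[j] - (j == y_i)
    print(j, dscore_j)           # -> 0.2, -0.5, 0.3
```

Note the score gradients sum to zero, since the probabilities sum to one.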
Also, the gradient of the loss with respect to the weights is:

$$\nabla_{w_j} L_i = \frac{\partial L_i}{\partial f_j}\,\frac{\partial f_j}{\partial w_j} = \left(p_j - \mathbb{1}\{j = y_i\}\right) x_i$$

where $f_j = w_j^\top x_i$, so $\frac{\partial f_j}{\partial w_j} = x_i$, which is the origin of the `X[:,i]` in the code.
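Equivalently, the per-example weight gradient is an outer product of the score gradient with the input column. A self-contained sketch with made-up shapes (the (C, D) weight layout and variable names here are assumptions following the snippet in the question):

```python
import numpy as np

D, C = 4, 3
rng = np.random.default_rng(0)
x = rng.standard_normal(D)        # stand-in for X[:, i]
scores = rng.standard_normal(C)   # stand-in for W.dot(X[:, i])
p = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()
y_i = 2

dscores = p.copy()
dscores[y_i] -= 1.0               # p_j - 1{j = y_i} for every class at once
dW_i = np.outer(dscores, x)       # (C, D): row j is (p_j - 1{j = y_i}) * x_i
```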
I know this is late, but here's my answer:

I'm assuming you are familiar with the cs231n Softmax loss function. We know that:

$$L_i = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)$$

So, just as we did with the SVM loss function, the gradients are as follows:

$$\frac{\partial L_i}{\partial f_j} = p_j - \mathbb{1}\{j = y_i\}, \qquad \nabla_{w_j} L_i = \left(p_j - \mathbb{1}\{j = y_i\}\right) x_i$$

where $p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}$.

Hope that helped.
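A quick way to convince yourself these formulas are right is a centered-difference numerical gradient check. This is a sketch, assuming a loss function with the `(W, X, y, reg) -> (loss, dW)` signature used in the earlier `softmax_loss_naive` snippet:

```python
import numpy as np

def check_gradient(loss_fn, W, X, y, reg=0.0, h=1e-5, num_checks=5):
    """Compare analytic dW from loss_fn against centered finite differences."""
    _, dW = loss_fn(W, X, y, reg)
    rng = np.random.default_rng(0)
    for _ in range(num_checks):
        ix = tuple(rng.integers(0, s) for s in W.shape)  # random weight entry
        old = W[ix]
        W[ix] = old + h
        loss_plus, _ = loss_fn(W, X, y, reg)
        W[ix] = old - h
        loss_minus, _ = loss_fn(W, X, y, reg)
        W[ix] = old                                      # restore the perturbed weight
        numeric = (loss_plus - loss_minus) / (2 * h)
        print(f"numerical: {numeric: .6f}  analytic: {dW[ix]: .6f}")

# e.g. check_gradient(softmax_loss_naive, W, X, y) with the sketch from the question
```

If the two columns agree to several decimal places, the analytic gradient is almost certainly correct.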