I am watching some videos for Stanford CS231n: Convolutional Neural Networks for Visual Recognition, but I do not quite understand how to calculate the analytical gradient for the softmax loss function.
Not sure if this helps, but:
The term $\mathbb{1}\{y_i = j\}$ in the score gradient

$$\frac{\partial L_i}{\partial f_j} = p_j - \mathbb{1}\{y_i = j\}$$

is really the indicator function, as described here. This forms the expression `(j == y[i])` in the code.
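As a small numerical sketch (the probabilities and label below are made up for illustration), the whole vector $p_j - \mathbb{1}\{y_i = j\}$ can be formed in one step, which is exactly what `(j == y[i])` does inside the loop:

```python
import numpy as np

# Hypothetical single example with 3 classes:
p = np.array([0.2, 0.7, 0.1])   # softmax probabilities for example i
y_i = 1                          # true class label of example i

# dL_i/df_j = p_j - 1{y_i = j}; the boolean comparison plays the indicator
dscores = p - (np.arange(3) == y_i)   # -> [ 0.2, -0.3,  0.1]
```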
Also, the gradient of the loss with respect to the weights is:

$$\nabla_{w_j} L_i = \frac{\partial L_i}{\partial f_j}\,\nabla_{w_j} f_j = \left(p_j - \mathbb{1}\{y_i = j\}\right) x_i,$$

where

$$f_j = w_j^\top x_i \quad\Longrightarrow\quad \nabla_{w_j} f_j = x_i,$$

which is the origin of the `X[:,i]` in the code.
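Putting the two pieces together, here is a minimal naive-loop sketch of the softmax loss and gradient. The shapes are assumptions chosen to match the `X[:, i]` indexing above (`W` is `(C, D)`, `X` is `(D, N)` with examples as columns, `y` holds integer labels); the function name is just for illustration:

```python
import numpy as np

def softmax_loss_naive(W, X, y):
    """Softmax loss and gradient via explicit loops.

    Assumed shapes (hypothetical): W is (C, D), X is (D, N) with one
    example per column, y is (N,) with integer labels in [0, C).
    """
    C = W.shape[0]
    N = X.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)

    for i in range(N):
        f = W.dot(X[:, i])                 # class scores f_j = w_j . x_i, shape (C,)
        f -= f.max()                       # shift scores for numerical stability
        p = np.exp(f) / np.exp(f).sum()    # softmax probabilities, shape (C,)
        loss += -np.log(p[y[i]])

        for j in range(C):
            # dL_i/dw_j = (p_j - 1{y_i = j}) * x_i
            dW[j, :] += (p[j] - (j == y[i])) * X[:, i]

    loss /= N
    dW /= N
    return loss, dW
```

The inner line `dW[j, :] += (p[j] - (j == y[i])) * X[:, i]` is where both expressions from the derivation show up: the boolean `(j == y[i])` is the indicator, and `X[:, i]` is the $x_i$ coming from $\nabla_{w_j} f_j$.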