CS231n: How to calculate gradient for Softmax loss function?

野趣味 2021-01-31 03:43

I am watching some videos for Stanford CS231n: Convolutional Neural Networks for Visual Recognition but do not quite understand how to calculate the analytical gradient for softmax l

2 answers
  •  南笙
    南笙 (original poster)
    2021-01-31 04:10

    Not sure if this helps, but:

    The $\mathbb{1}(y_i = j)$ term is really the indicator function: it equals 1 when $j = y_i$ and 0 otherwise, as described here. This forms the expression (j == y[i]) in the code.

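    Also, the gradient of the loss with respect to the weights follows from the chain rule:

    $$\frac{\partial L_i}{\partial w_j} = \frac{\partial L_i}{\partial f_j}\,\frac{\partial f_j}{\partial w_j}$$

    where $f_j = w_j^\top x_i$ is the score for class $j$, so

    $$\frac{\partial f_j}{\partial w_j} = x_i$$

    which is the origin of the X[:,i] in the code. With $p_j$ the softmax probability of class $j$ and $\partial L_i / \partial f_j = p_j - \mathbb{1}(y_i = j)$, the full gradient is $(p_j - \mathbb{1}(y_i = j))\, x_i$.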
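    To make the formula concrete, here is a minimal NumPy sketch of the per-example loop, followed by a finite-difference check of the result. It is not the course's reference code. The function names softmax_loss_naive and numerical_gradient, the data layout (W is classes x features, X holds one example per column, matching the X[:,i] indexing above), and the random test data are all assumptions made for illustration.

```python
import numpy as np

def softmax_loss_naive(W, X, y):
    """Mean softmax (cross-entropy) loss and its gradient w.r.t. W.

    Assumed layout: W is (C, D), X is (D, N) with one example per column,
    y is (N,) with integer class labels.
    """
    num_classes, num_train = W.shape[0], X.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)
    for i in range(num_train):
        f = W.dot(X[:, i])               # scores f_j = w_j . x_i
        f -= f.max()                     # shift scores for numerical stability
        p = np.exp(f) / np.exp(f).sum()  # softmax probabilities p_j
        loss += -np.log(p[y[i]])
        for j in range(num_classes):
            # dL_i/dw_j = (p_j - 1(y_i = j)) * x_i
            dW[j, :] += (p[j] - float(j == y[i])) * X[:, i]
    return loss / num_train, dW / num_train

def numerical_gradient(W, X, y, h=1e-5):
    """Central-difference estimate of the gradient, used only as a check."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        loss_plus, _ = softmax_loss_naive(W, X, y)
        W[idx] = old - h
        loss_minus, _ = softmax_loss_naive(W, X, y)
        W[idx] = old                     # restore the original weight
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
    return grad

# Small random problem (hypothetical data): 3 classes, 4 features, 5 examples.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4)) * 0.1
X = rng.normal(size=(4, 5))
y = rng.integers(0, 3, size=5)

loss, dW = softmax_loss_naive(W, X, y)
max_err = np.abs(dW - numerical_gradient(W, X, y)).max()
print("max abs difference between analytic and numerical gradient:", max_err)
```

    If the analytic expression is right, the printed difference should be tiny compared with the gradient entries themselves. The rows of dW also sum to zero across classes, because the p_j sum to 1 and exactly one indicator fires per example.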
