Why is the softmax classifier gradient divided by the batch size (cs231n)?

一整个雨季 2020-12-13 19:33

Question

In the cs231n section "Computing the Analytic Gradient with Backpropagation", which first implements a Softmax classifier, the gradient from (softmax + log loss)
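
For context, here is a minimal sketch of where a division by the batch size can come from. It assumes the loss is the mean (not the sum) of the per-example cross-entropy over the batch. The function names `softmax` and `mean_loss_and_grad` are made up for this illustration, and it uses plain Python lists rather than NumPy so that it runs without dependencies.

```python
import math

def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def mean_loss_and_grad(scores, labels):
    """Mean cross-entropy loss over a batch and its gradient w.r.t. the scores.

    scores: list of n rows, each a list of class scores
    labels: list of n correct class indices
    (hypothetical helper, assuming the loss is averaged over the batch)
    """
    n = len(scores)
    probs = [softmax(row) for row in scores]

    # Loss is the average of -log(p_correct) over the n examples.
    loss = -sum(math.log(probs[i][labels[i]]) for i in range(n)) / n

    # Per-example gradient of -log(p_correct) w.r.t. the scores is (p - one_hot).
    dscores = [list(p) for p in probs]
    for i in range(n):
        dscores[i][labels[i]] -= 1.0

    # Because the loss carries a factor of 1/n, so does its gradient.
    dscores = [[g / n for g in row] for row in dscores]
    return loss, dscores

if __name__ == "__main__":
    loss, grad = mean_loss_and_grad([[1.0, 2.0, 0.5], [0.1, -0.3, 2.2]], [1, 2])
    print(loss, grad)
```

Under this assumption the 1/n in the gradient is just the derivative of the 1/n in the averaged loss. If the loss were summed instead of averaged, the division would not appear and the gradient magnitude would grow with the batch size.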
