What is the effect of temperature T in the Knowledge Distillation loss function?

囚心锁ツ 2020-12-23 16:14

In the formulation of Knowledge Distillation (KD), the soft-label loss function is always multiplied by the square of the temperature, T^2, because the paper "Distilling the Knowl
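To make the role of T concrete, here is a minimal sketch of a soft-label loss with temperature, written in plain Python. The function names `softmax_with_temperature` and `soft_label_loss` are hypothetical and chosen only for illustration, and the sketch assumes the soft-label term is a KL divergence between teacher and student distributions, which is one common choice and not necessarily the exact form the question has in mind. A commonly cited motivation for the T^2 factor is that the gradients of the soft-label term shrink roughly as 1/T^2, so the factor keeps that term on a comparable scale when T changes.

```python
import math


def softmax_with_temperature(logits, T):
    """Softmax of logits / T. Larger T gives a flatter (softer) distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, T):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    Assumption: KL divergence is used as the soft-label term; some
    formulations use cross-entropy, which differs only by a term that
    does not depend on the student.
    """
    p = softmax_with_temperature(teacher_logits, T)  # soft labels
    q = softmax_with_temperature(student_logits, T)  # student prediction
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
    return (T ** 2) * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 1.0, 0.2]
    for T in (1.0, 2.0, 4.0):
        print(T, softmax_with_temperature(teacher, T), soft_label_loss(student, teacher, T))
```

Running the script prints the softened teacher distribution and the scaled loss for a few temperatures, which shows how raising T flattens the soft labels.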
