In the formulation of Knowledge Distillation (KD), the soft-label loss function is always multiplied by the squared temperature T^2 because the paper "Distilling the Knowl