While training the model, I monitored the GPU's usage and noticed something that made me curious. As the following graph shows, CUDA utilization drops to 0% once in mid