I'm performing an image classification task. Images are labeled 0, 1, or 2. Should the output size of the model's last linear layer be 3 or 1? In general, for a 3-c