Can anyone help me understand the output of BERT's last hidden layer for sequence classification? I am doing some testing with a Hugging Face model for sequence classificat