I am fine-tuning a BERT model for binary classification (with a linear layer at the end). The prediction should just be 1 or 0 (Yes or No).
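To make the intended output concrete, here is a minimal sketch of the last step. It assumes the linear layer emits two logits per example (one per class). The function name `logits_to_labels` and the sample values are hypothetical, not taken from my actual code.

```python
from typing import List, Sequence


def logits_to_labels(logits: Sequence[Sequence[float]]) -> List[int]:
    """Map per-example two-class logits to hard 0/1 predictions via argmax."""
    labels = []
    for row in logits:
        if len(row) != 2:
            raise ValueError("expected exactly two logits per example")
        # Pick the class with the larger logit: 0 = No, 1 = Yes (ties go to 0).
        labels.append(1 if row[1] > row[0] else 0)
    return labels


# Hypothetical logits for three examples (assumed values, for illustration only).
example_logits = [[2.1, -0.3], [-1.0, 0.7], [0.2, 0.2]]
predictions = logits_to_labels(example_logits)
```

If the head instead emitted a single logit per example, the equivalent step would be a sigmoid followed by a 0.5 threshold.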
When I am writing the evaluation p