I trained a language model from scratch with BertForMaskedLM on my own domain-specific dataset.
Now I want to assess how good the model is, so I would like to calculate