I am using my customized bert script to train a model. However, everything even I keep the same setting for lr, AdamW weight decay and epoch, and run on the same platform (c