If I use "fit" to train my network, the loss function converges and the metric (accuracy) improves significantly.
However, I write the training proce