I am following the tutorial "How to train a new language model from scratch using Transformers and Tokenizers".
In Section 2, "Train a tokenizer", after training on my own Vietnam