I am cloning this repo to try to implement a text-to-speech model in Google Colab, using data I collected previously. Here is what I did first: