I see that there are already advanced ML projects that do text-to-speech, such as SV2TTS: https://github.com/CorentinJ/Real-Time-Voice-Cloning
However, what I am loo