I am trying to build an application that records audio and sends it to a Speech-to-Text API to receive a transcription. I want the application to be as hands-free as