My goal is to create a program that detects voice activity in an audio file. The program should then cut the original audio file so that only the part where voice is detected is p