I am using the last command from this SO answer https://stackoverflow.com/a/54818581/80353
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub
Thank you @KimStacks @RavinderSingh13 @Oguz-Ismail for posting the solutions above and in the previous post.
I managed to get results in the .vtt file with youtube-dl --skip-download --write-auto-sub $youtube_url
However, the format of the output is not ideal for my purpose. I have to go through it line by line to remove the timestamps as well as the \n
newlines. So I would like to customize the command to fit my requirements.
NOTE: Not sure whether it's a new query or not, so I will post it here for now:
How do I insert "$youtube_url" into the code below?
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
|tee -a "$2")
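My current reading, which may be wrong, is that "$1" and "$2" in the function are its first and second arguments, so the URL would be given when calling cap rather than pasted into the awk program. Here is a minimal sketch of that idea. show_args is a hypothetical stand-in that only echoes its arguments so nothing is downloaded, and the URL and output path are placeholders:

```shell
# show_args is a hypothetical stand-in for cap(): same subshell-style body,
# but it only prints the two positional parameters it receives.
show_args() (
  printf 'url=%s out=%s\n' "$1" "$2"
)

# Placeholder values, not a real video or a required path.
youtube_url='https://www.youtube.com/watch?v=EXAMPLE'
show_args "$youtube_url" /tmp/transcript.txt
```

If that reading is right, the real call would presumably be cap "$youtube_url" /tmp/transcript.txt.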
I tried inserting it around 'NR%8==1{printf"%s ",$1}NR%8==3', on both ends, but did not get the right format in the .vtt file. So, is it possible to:
have the transcribed text printed continuously as sentences, rather than each subtitle printed on a new line?
remove the printout of the start time?
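To make the two requests concrete, here is a rough sketch of the kind of awk stage I have in mind. It assumes the cleaned captions keep the same 8-line layout the original NR%8 program relies on, which I have not verified for every video. join_captions is a hypothetical helper name and the sample input is made up:

```shell
# join_captions is a hypothetical helper name. It reads caption lines that have
# already been through the two sed steps and keeps only the text line of each
# 8-line group (assumed layout), joined by single spaces on one output line.
# The start time (line 1 of each group) is simply never printed.
join_captions() {
  awk 'NR % 8 == 3 { printf "%s%s", sep, $0; sep = " " } END { print "" }'
}

# Made-up sample input that follows the assumed 8-line layout.
printf '%s\n' \
  '00:00:00.000 --> 00:00:02.000' ' ' 'hello world' '' \
  '00:00:02.000 --> 00:00:02.010' 'hello world' ' ' '' \
  '00:00:02.010 --> 00:00:04.000' 'hello world' 'this is a test' '' \
  '00:00:04.000 --> 00:00:04.010' 'this is a test' ' ' '' |
  join_captions
```

Inside cap, this would presumably take the place of the awk -F. '...' stage, with the rest of the pipeline left unchanged.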