I want to parse SRT subtitles:
1
00:00:12,815 --> 00:00:14,509
Chlapi, jak to jde s těma pracovníma světlama?.

2
00:00:14,815 -->
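Here is my attempt:

import re

# text holds the contents of the .srt file
splits = [s.strip() for s in re.split(r'\n\s*\n', text) if s.strip()]
# the group names index/start/end/text are placeholders
regex = re.compile(r'''(?P<index>\d+).*?(?P<start>\d{2}:\d{2}:\d{2},\d{3}) --> (?P<end>\d{2}:\d{2}:\d{2},\d{3})\s*.*?\s*(?P<text>.*)''', re.DOTALL)
for s in splits:
    r = regex.search(s)
    if r:  # skip blocks that do not match
        print(r.groups())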
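
For comparison, here is a minimal sketch of the same split-then-match idea that anchors the pattern at the start of each block and returns dictionaries instead of printing tuples. The function name `parse_srt`, the constant `BLOCK_RE`, and the group names `index`, `start`, `end`, and `text` are all hypothetical choices for illustration, and the sample string is just the excerpt from the question, including its truncated second block. It is not a full SRT parser: it assumes comma-separated milliseconds and ignores things like a byte-order mark.

```python
import re

# Hypothetical names; the group names are an assumption made for this sketch.
BLOCK_RE = re.compile(
    r'(?P<index>\d+)\s*\n'                          # sequence number on its own line
    r'(?P<start>\d{2}:\d{2}:\d{2},\d{3}) --> '      # start timestamp
    r'(?P<end>\d{2}:\d{2}:\d{2},\d{3})[^\n]*\n'     # end timestamp, rest of line ignored
    r'(?P<text>.*)',                                # everything left is the subtitle text
    re.DOTALL,
)

def parse_srt(text):
    """Return a list of dicts, one per well-formed subtitle block."""
    text = text.replace('\r\n', '\n')  # normalise Windows line endings
    entries = []
    for block in re.split(r'\n\s*\n', text.strip()):
        m = BLOCK_RE.match(block.strip())
        if m is None:
            continue  # skip truncated or malformed blocks
        entry = m.groupdict()
        entry['index'] = int(entry['index'])
        entry['text'] = entry['text'].strip()
        entries.append(entry)
    return entries

# Sample taken from the question; the second block is incomplete on purpose.
sample = (
    "1\n"
    "00:00:12,815 --> 00:00:14,509\n"
    "Chlapi, jak to jde s těma pracovníma světlama?.\n"
    "\n"
    "2\n"
    "00:00:14,815 -->\n"
)

entries = parse_srt(sample)
for e in entries:
    print(e['index'], e['start'], e['end'], repr(e['text']))
```

Using `match` rather than `search` means a block only counts if it begins with the sequence number, so a stray digit inside the subtitle text cannot be mistaken for an index. Blocks that fail to match are skipped silently here; raising an error instead may be preferable if malformed input should not go unnoticed.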