Parsing SRT subtitles

粉色の甜心 2021-02-08 12:31

I want to parse srt subtitles:

    1
    00:00:12,815 --> 00:00:14,509
    Chlapi, jak to jde s
    těma pracovníma světlama?.

    2
    00:00:14,815 -->

6 answers
  •  悲&欢浪女
    2021-02-08 12:56

    I became quite frustrated with srt libraries available for Python (often because they were heavyweight and eschewed language-standard types in favour of custom classes), so I've spent the last year or so working on my own srt library. You can get it at https://github.com/cdown/srt.

    I tried to keep it simple and light on classes (except for the core Subtitle class, which more or less just stores the SRT block data). It can read and write SRT files, and turn noncompliant SRT files into compliant ones.

    Here's a usage example with your sample input:

    >>> import srt, pprint
    >>> gen = srt.parse('''\
    ... 1
    ... 00:00:12,815 --> 00:00:14,509
    ... Chlapi, jak to jde s
    ... těma pracovníma světlama?.
    ... 
    ... 2
    ... 00:00:14,815 --> 00:00:16,498
    ... Trochu je zesilujeme.
    ... 
    ... 3
    ... 00:00:16,934 --> 00:00:17,814
    ... Jo, sleduj.
    ... 
    ... ''')
    >>> pprint.pprint(list(gen))
    [Subtitle(start=datetime.timedelta(0, 12, 815000), end=datetime.timedelta(0, 14, 509000), index=1, proprietary='', content='Chlapi, jak to jde s\ntěma pracovníma světlama?.'),
     Subtitle(start=datetime.timedelta(0, 14, 815000), end=datetime.timedelta(0, 16, 498000), index=2, proprietary='', content='Trochu je zesilujeme.'),
     Subtitle(start=datetime.timedelta(0, 16, 934000), end=datetime.timedelta(0, 17, 814000), index=3, proprietary='', content='Jo, sleduj.')]
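
For readers who want to see what parsing an SRT block involves, here is a minimal standard-library sketch. It is not how the srt library is implemented; the names `Cue`, `TIMING`, `to_timedelta` and `parse_srt` are hypothetical, and it assumes well-formed blocks (index line, timing line, text lines) separated by blank lines. Blocks with a missing or truncated timing line are skipped rather than repaired.

```python
import re
from datetime import timedelta
from typing import Iterator, NamedTuple


class Cue(NamedTuple):
    """Hypothetical container; not the srt library's Subtitle class."""
    index: int
    start: timedelta
    end: timedelta
    content: str


# Timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_timedelta(h: str, m: str, s: str, ms: str) -> timedelta:
    """Convert the four captured timestamp fields into a timedelta."""
    return timedelta(hours=int(h), minutes=int(m), seconds=int(s),
                     milliseconds=int(ms))


def parse_srt(text: str) -> Iterator[Cue]:
    """Yield one Cue per complete block; skip blocks that do not parse."""
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # no timing line at all
        match = TIMING.match(lines[1].strip())
        if not lines[0].strip().isdigit() or match is None:
            continue  # bad index or incomplete timing line
        parts = match.groups()
        yield Cue(
            index=int(lines[0]),
            start=to_timedelta(*parts[:4]),
            end=to_timedelta(*parts[4:]),
            content="\n".join(lines[2:]),
        )


sample = """1
00:00:12,815 --> 00:00:14,509
Chlapi, jak to jde s
těma pracovníma světlama?.

2
00:00:14,815 --> 00:00:16,498
Trochu je zesilujeme.
"""
cues = list(parse_srt(sample))
```

Using `datetime.timedelta` for the timestamps mirrors the output shown above and keeps arithmetic such as shifting or comparing cues straightforward. For real-world files with irregular formatting, a dedicated library is likely the safer choice.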
    
