How to convert YouTube API duration to seconds?

星月不相逢 2021-02-07 04:52

Out of interest, I want to convert video durations from YouTube's ISO 8601 format to seconds. To future-proof my solution, I picked a really long video to test it a

7 Answers
  • 2021-02-07 05:25

    The dateutil module (a third-party package, not part of the standard library) only supports parsing ISO 8601 dates, not ISO 8601 durations. For durations, you can use the "isodate" library (on PyPI at https://pypi.python.org/pypi/isodate; install it with pip or easy_install). This library has full support for ISO 8601 durations and converts them to datetime.timedelta objects (durations that contain years or months come back as isodate's own Duration objects instead). So once you've imported the library, it's as simple as:

    import isodate
    
    dur = isodate.parse_duration('P1W2DT6H21M32S')
    print(dur.total_seconds())
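    
    If installing a third-party package is not an option, the same idea can be sketched with only the standard library. This is a minimal sketch, not isodate's implementation: the helper name parse_duration_stdlib and the pattern are assumptions made for illustration, and it deliberately rejects years and months because they have no fixed length in seconds.

```python
import re
from datetime import timedelta

# Weeks, days, hours, minutes and seconds only. Years and months are
# not matched, so durations containing them are rejected below.
_DURATION_RE = re.compile(
    r"P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)

def parse_duration_stdlib(duration):
    """Hypothetical helper: parse e.g. 'P1W2DT6H21M32S' into a timedelta."""
    match = _DURATION_RE.fullmatch(duration)
    if match is None:
        raise ValueError("unsupported duration: %r" % duration)
    # Components that are absent come back as None; treat them as zero.
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    # The group names match timedelta's keyword arguments.
    return timedelta(**parts)

print(parse_duration_stdlib("P1W2DT6H21M32S").total_seconds())  # 800492.0
```

    Returning a timedelta keeps the result interchangeable with what isodate produces for durations of this shape.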
    
  • 2021-02-07 05:36

    Here's my answer which takes 9000's regex solution (thank you - amazing mastery of regex!) and finishes the job for the original poster's YouTube use case i.e. converting hours, minutes, and seconds to seconds. I used .groups() instead of .groupdict(), followed by a couple of lovingly constructed list comprehensions.

    import re
    
    def yt_time(duration="P1W2DT6H21M32S"):
        """
        Converts YouTube duration (ISO 8601)
        into Seconds
    
        Only hours, minutes and seconds are counted;
        years, months, weeks and days are matched but ignored.
    
        see http://en.wikipedia.org/wiki/ISO_8601#Durations
        """
        ISO_8601 = re.compile(
            r'P'   # designates a period
            r'(?:(?P<years>\d+)Y)?'   # years
            r'(?:(?P<months>\d+)M)?'  # months
            r'(?:(?P<weeks>\d+)W)?'   # weeks
            r'(?:(?P<days>\d+)D)?'    # days
            r'(?:T' # time part must begin with a T
            r'(?:(?P<hours>\d+)H)?'   # hours
            r'(?:(?P<minutes>\d+)M)?' # minutes
            r'(?:(?P<seconds>\d+)S)?' # seconds
            r')?')   # end of time part
        # Keep only the last three groups: hours, minutes, seconds
        units = ISO_8601.match(duration).groups()[-3:]
        # Put list in ascending order & replace None with 0
        units = [int(x) if x is not None else 0 for x in reversed(units)]
        # Do the maths; enumerate() gives each unit its own power of 60
        # (list.index() returns the first match, which breaks on repeated values)
        return sum(x * 60**i for i, x in enumerate(units))
    

    Sorry for not posting higher up - still new here and not enough reputation points to add comments.

  • 2021-02-07 05:37

    Works on Python 2.7+. Adapted from a JavaScript one-liner posted for a YouTube v3 question.

    import re
    
    def YTDurationToSeconds(duration):
        # Raw string so that \d is not treated as a string escape
        match = re.match(r'PT(\d+H)?(\d+M)?(\d+S)?', duration).groups()
        hours = _js_parseInt(match[0]) if match[0] else 0
        minutes = _js_parseInt(match[1]) if match[1] else 0
        seconds = _js_parseInt(match[2]) if match[2] else 0
        return hours * 3600 + minutes * 60 + seconds
    
    # js-like parseInt: keep only the digits, dropping the unit letter
    # https://gist.github.com/douglasmiranda/2174255
    def _js_parseInt(string):
        return int(''.join([x for x in string if x.isdigit()]))
    
    # example output 
    YTDurationToSeconds(u'PT15M33S')
    # 933
    

    Handles the ISO 8601 duration format to the extent YouTube uses it, up to hours.

  • 2021-02-07 05:45

    Extending 9000's answer: apparently YouTube's format uses weeks but not months, which means the total seconds can be computed easily.
    Not using named groups here because I initially needed this to work with PySpark.

    from operator import mul
    from itertools import accumulate
    import re
    from typing import Pattern, List, Tuple
    
    SECONDS_PER_SECOND: int = 1
    SECONDS_PER_MINUTE: int = 60
    MINUTES_PER_HOUR: int = 60
    HOURS_PER_DAY: int = 24
    DAYS_PER_WEEK: int = 7
    WEEKS_PER_YEAR: int = 52  # approximation: a year is treated as 52 weeks
    
    ISO8601_PATTERN: Pattern = re.compile(
        r"P(?:(\d+)Y)?(?:(\d+)W)?(?:(\d+)D)?"
        r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?"  # time part is optional
    )
    
    def extract_total_seconds_from_ISO8601(iso8601_duration: str) -> int:
        """Compute duration in seconds from a Youtube ISO8601 duration format. """
        # The running product of these gives the seconds per
        # second, minute, hour, day, week and year, in that order.
        MULTIPLIERS: Tuple[int, ...] = (
            SECONDS_PER_SECOND, SECONDS_PER_MINUTE, MINUTES_PER_HOUR,
            HOURS_PER_DAY, DAYS_PER_WEEK, WEEKS_PER_YEAR
        )
        groups: List[int] = [int(g) if g is not None else 0 for g in
                  ISO8601_PATTERN.match(iso8601_duration).groups()]
    
        return sum(g * multiplier for g, multiplier in
                   zip(reversed(groups), accumulate(MULTIPLIERS, mul)))
    
  • 2021-02-07 05:47

    This works by parsing the input string one character at a time. If the character is a digit, it is appended (string concatenation, not arithmetic addition) to the value currently being parsed. If it is one of 'wdhms', the current value is converted to seconds and assigned to the matching variable (week, day, hour, minute, second), and the value is reset, ready for the next number. Finally, it sums the seconds from the five parsed values.

    def ytDurationToSeconds(duration): #eg P1W2DT6H21M32S
        week = 0
        day  = 0
        hour = 0
        min  = 0
        sec  = 0
    
        duration = duration.lower()
    
        value = ''
        for c in duration:
            if c.isdigit():
                value += c
                continue
    
            elif c == 'p':
                pass
            elif c == 't':
                pass
            elif c == 'w':
                week = int(value) * 604800
            elif c == 'd':
                day = int(value)  * 86400
            elif c == 'h':
                hour = int(value) * 3600
            elif c == 'm':
                min = int(value)  * 60
            elif c == 's':
                sec = int(value)
    
            value = ''
    
        return week + day + hour + min + sec
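    
    The same character-by-character scan can also be sketched with a lookup table instead of one variable per unit. This is only an illustrative variant under stated assumptions: the names UNIT_SECONDS and scan_duration_seconds are hypothetical, and unlike the function above it raises an error for an 'M' that appears before the 'T' separator (months) and for any other unexpected character.

```python
# Seconds per unit letter. 'M' means minutes only after the 'T' separator.
UNIT_SECONDS = {'W': 604800, 'D': 86400, 'H': 3600, 'M': 60, 'S': 1}

def scan_duration_seconds(duration):
    """Hypothetical helper: scan e.g. 'P1W2DT6H21M32S' one character at a time."""
    total = 0
    digits = ''            # digits collected for the value being parsed
    in_time_part = False   # becomes True once the 'T' separator is seen
    for c in duration.upper():
        if c.isdigit():
            digits += c
        elif c == 'P':
            continue
        elif c == 'T':
            in_time_part = True
        elif c in UNIT_SECONDS:
            if c == 'M' and not in_time_part:
                # Before 'T', 'M' means months, which have no fixed length.
                raise ValueError("months cannot be converted to seconds")
            total += int(digits) * UNIT_SECONDS[c]
            digits = ''    # reset, ready for the next value
        else:
            raise ValueError("unexpected character: %r" % c)
    return total

print(scan_duration_seconds('P1W2DT6H21M32S'))  # 800492
```

    Keeping the multipliers in one dictionary means a new unit only needs a new entry rather than another elif branch.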
    
  • 2021-02-07 05:47

    So this is what I came up with - a custom parser to interpret the time:

    def durationToSeconds(duration):
        """
        duration - ISO 8601 time format
        examples :
            'P1W2DT6H21M32S' - 1 week, 2 days, 6 hours, 21 mins, 32 secs,
            'PT7M15S' - 7 mins, 15 secs
        """
        # partition() also copes with durations that have no time part, e.g. 'P2D'
        period, _, time = duration.partition('T')
        period  = period.replace('P', '')
        timeD   = {}
    
        # weeks & days (multi-digit values are handled too)
        if 'W' in period:
            weeks, period = period.split('W')
            timeD['weeks'] = int(weeks)
        if 'D' in period:
            timeD['days'] = int(period.split('D')[0])
    
        # hours, minutes & seconds
        if 'H' in time:
            hours, time = time.split('H')
            timeD['hours'] = int(hours)
        if 'M' in time:
            minutes, time = time.split('M')
            timeD['minutes'] = int(minutes)
        if 'S' in time:
            timeD['seconds'] = int(time.split('S')[0])
    
        # convert to seconds
        timeS = timeD.get('weeks', 0)   * (7*24*60*60) + \
                timeD.get('days', 0)    * (24*60*60) + \
                timeD.get('hours', 0)   * (60*60) + \
                timeD.get('minutes', 0) * (60) + \
                timeD.get('seconds', 0)
    
        return timeS
    

    Now it probably is super non-cool and so on, but it works, so I'm sharing because I care about you people.
