SCORM 2004 Time Format - Regular Expression?

Submitted by 橙三吉。 on 2019-12-04 06:30:53
Shannow

Here is the regex I use:

^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$

Use [0-9] to match any digit, + to match one or more repetitions, ? to match zero or one repetition, and () to group and capture the output.

P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)(T([0-9]+H)?([0-9]+M)?([0-9.]+S)?)?

>>> import re

>>> p = re.compile(r'P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)(T([0-9]+H)?([0-9]+M)?([0-9.]+S)?)?')

>>> p.match('P1Y3M2DT3H').groups()
('1Y3M2D', '1Y', '3M', '2D', 'T3H', '3H', None, None)

>>> p.match('P3M2DT3H').groups()
('3M2D', None, '3M', '2D', 'T3H', '3H', None, None)

>>> p.match('PT3H5M').groups()
('', None, None, None, 'T3H5M', '3H', '5M', None)

>>> p.match('P1Y3M4D').groups()
('1Y3M4D', '1Y', '3M', '4D', None, None, None, None)
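One thing to keep in mind with the session above: match() only anchors at the start of the string, so anything after a valid prefix is silently ignored. A small sketch of the difference, using fullmatch() (available since Python 3.4) with the same pattern; the helper name is_whole_duration is just mine for illustration:

```python
import re

# Same pattern as in the session above, written as a raw string.
p = re.compile(r'P(([0-9]+Y)?([0-9]+M)?([0-9]+D)?)(T([0-9]+H)?([0-9]+M)?([0-9.]+S)?)?')

def is_whole_duration(value):
    """Return True only if the entire string is consumed by the pattern."""
    return p.fullmatch(value) is not None

# match() accepts a valid prefix followed by anything else.
prefix_ok = p.match('P1Y3M2DT3Hjunk') is not None

# fullmatch() requires the whole string to match.
whole_ok = is_whole_duration('P1Y3M2DT3H')
junk_ok = is_whole_duration('P1Y3M2DT3Hjunk')
```

Even with fullmatch(), this pattern still accepts a bare 'P' and values such as 'PT1.2.3S', so it is probably better suited to pulling the parts out of a known-good value than to strict validation.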

JavaScript doesn't support /x (free-spacing or comments mode), so remove the whitespace from this regex before using it.

/^P(?=.)
 (?:\d+Y)?
 (?:\d+M)?
 (?:\d+D)?
 (?:T(?=.)
    (?:\d+H)?
    (?:\d+M)?
    (?:\d+
       (?:\.\d{1,2})?
    S)?
 )?$/i

Each (?=.) lookahead asserts that there's at least one character remaining at that point in the match. That means at least one of the following groups (i.e., the Y, M, D or T group after the P, and the H, M or S group after the T) has to match, even though they're all optional. That satisfies the second of the added requirements in your updated spec.
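If you happen to be validating on the Python side, re.VERBOSE gives you the free-spacing mode directly, so the whitespace can stay. This is only a sketch of the lookahead idea described above: the names DURATION and is_valid are mine, and I've written the S designator into the seconds group explicitly, which is my assumption about the intent.

```python
import re

# Free-spacing version of the pattern. Each (?=.) demands that at least
# one more character follows, so "P" and a trailing "T" are rejected.
DURATION = re.compile(r"""
    ^P(?=.)                        # P, then at least one more character
    (?:\d+Y)?
    (?:\d+M)?
    (?:\d+D)?
    (?:T(?=.)                      # T, then at least one more character
        (?:\d+H)?
        (?:\d+M)?
        (?:\d+(?:\.\d{1,2})?S)?    # seconds; S written out (assumption)
    )?$
""", re.VERBOSE | re.IGNORECASE)

def is_valid(value):
    """Return True if value matches the pattern."""
    return DURATION.match(value) is not None

results = {s: is_valid(s) for s in ("P", "PT", "P1Y", "PT5S", "P1YT")}
```

Because every group is optional, the lookaheads are what keep "P", "PT" and "P1YT" from slipping through.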

Maybe it's semantics, but this part of the SCORM spec can be interpreted to mean literals are allowed even if a value isn't supplied:

The character literals designators P, Y, M, D, T, H, M and S shall appear if the corresponding non-zero value is present.

"shall appear" meaning a literal MUST be present if the corresponding number is present; it doesn't say "shall ONLY appear" if the corresponding number is present.

I modified Alan's regex to handle this possibility (thanks, Alan):

^P(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$

The only bug I've found so far is a failure to flag a string that has no numeric values specified, such as 'PTS'. The minimum according to the spec is "P" followed by a single value and accompanying designator, such as P1Y (= 1 year) or PT0S (= 0 seconds):

at least one character literal designator and value shall be present in addition to the designator P

There must be a way to add a check for a numeric value to this regex, but my regex-fu is not that strong. :)
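One possible way to add that check is a lookahead right after the P that requires a digit somewhere in the rest of the string. A minimal Python sketch, assuming the lenient pattern above is otherwise what you want (the names LENIENT_WITH_DIGIT and has_value are mine):

```python
import re

# The lenient pattern, plus (?=\w*\d): after P, some run of word
# characters must lead up to at least one digit.
LENIENT_WITH_DIGIT = re.compile(
    r'^P(?=\w*\d)'
    r'(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+D|D)?'
    r'(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$'
)

def has_value(value):
    """Return True if value matches and contains at least one digit."""
    return LENIENT_WITH_DIGIT.match(value) is not None

checks = {s: has_value(s) for s in ("PTS", "PT0S", "P1Y", "PYMDT0H", "P")}
```

With the lookahead in place, 'PTS' and a bare 'P' are rejected, while designators without values (as in 'PYMDT0H') are still tolerated as long as one value is present somewhere.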

For what it's worth, I've adapted the accepted answer for use with ColdFusion. I thought some folks might find it useful, so I figured I'd post it. As noted above, CF bombed on the seconds implementation above, so I modified it. I'm not sure if that means it's a general RegEx error in the above example, or if CF and JS have different RegEx implementations. Anyway, here's the CF RegEx, complete with comments (because, you know, otherwise regular expressions are complete gibberish):

<cfset regex = "(?x) ## allow for multi-line expression, including comments (oh, glorious comments)
            ^ ## ensure that this expression occurs at the start of the string
            P ## the first character must be a P
            (\d+Y|Y)? ## year (the ? indicates 0 or 1 times)
            (\d+M|M)? ## month
            (\d+D|D)? ## day
            (?:T ## T delineates between day and time information
            (\d+H|H)? ## hour
            (\d+M|M)? ## minute
            (\d+(?:\.\d{1,2})?S|S)? ## seconds, with up to two decimal places.  The inner ?: ensures that the sub-sub-expression isn't returned as a separate thing
            )? ## closes 'T' subexpression
            $ ## ensure that this expression occurs at the end of the string.  In conjunction with leading ^, this ensures that the string has no extraneous characters">

After that, you run it against your string like this:

<cfset result = reFind(regex,mystring,1,true)>

That returns a structure holding pos and len arrays for the subexpressions, which you can iterate over to get the discrete parts:

<cfoutput>
<cfloop from="1" to="#arrayLen(result.len)#" index="i">
    <cfif result.len[i] GT 0>
    #mid(mystring, result.pos[i], result.len[i])#<br>
    </cfif>
</cfloop>
</cfoutput>

Our SCORM Engine implementation uses a combination of a regular expression similar to the ones above and some basic JavaScript logic to do further validation.

I'm using this expression:

^P(\d+Y)?(\d+M)?(\d+D)?(T(((\d+H)(\d+M)?(\d+(\.\d{1,2})?S)?)|((\d+M)(\d+(\.\d{1,2})?S)?)|((\d+(\.\d{1,2})?S))))?$

This expression does not match a value like "PYMDT0H": a digit must accompany each designator for it to be matched.
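The expression above still accepts a bare "P", which is one case where a small follow-up check in code helps. Here is a hedged Python sketch of the regex-plus-logic approach; the extra check is my guess at the kind of thing "further validation" might cover, not anyone's actual implementation, and the names STRICT and is_valid_timeinterval are mine.

```python
import re

# The strict expression from above, split across lines for readability.
STRICT = re.compile(
    r'^P(\d+Y)?(\d+M)?(\d+D)?'
    r'(T(((\d+H)(\d+M)?(\d+(\.\d{1,2})?S)?)'
    r'|((\d+M)(\d+(\.\d{1,2})?S)?)'
    r'|((\d+(\.\d{1,2})?S))))?$'
)

def is_valid_timeinterval(value):
    """Regex check first, then reject a lone 'P' with no components."""
    if STRICT.match(value) is None:
        return False
    # Every group in the regex is optional, so "P" alone gets through;
    # require at least one designator/value pair after it.
    return len(value) > 1

outcomes = {s: is_valid_timeinterval(s)
            for s in ("P", "PT", "PYMDT0H", "PT0H", "P1Y", "PT1H5S")}
```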

Based on the previously accepted answer, I've made this capturing regex for PCRE (PHP, Ruby, ECMAScript 2018, ... ): https://regex101.com/r/KfMs1I/6

^P (?=\w*\d) (?:(?<years>\d+)Y|Y)? (?:(?<month>\d+)M|M)? (?:(?<days>\d+)D|D)? (?: T (?:(?<hours>\d+)H|H)? (?:(?<minutes>\d+)M|M)? (?: (?<seconds> \d+ (?: \. \d{1,2} )? )S | S )? )?$

Unfortunately I can't find how to do the same in current JS, because the optional groups cannot be accessed in a reliable way without named groups.
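For comparison, Python's re module supports named groups through the (?P<name>...) syntax, and optional groups that didn't take part in the match simply come back as None. A sketch of the same idea in Python; the function name parse_duration and the choice to default missing parts to 0.0 are my own assumptions:

```python
import re

# Named-group version of the pattern, in free-spacing mode.
NAMED = re.compile(r"""
    ^P (?=\w*\d)
    (?:(?P<years>\d+)Y|Y)?
    (?:(?P<months>\d+)M|M)?
    (?:(?P<days>\d+)D|D)?
    (?: T
        (?:(?P<hours>\d+)H|H)?
        (?:(?P<minutes>\d+)M|M)?
        (?:(?P<seconds>\d+(?:\.\d{1,2})?)S|S)?
    )?$
""", re.VERBOSE)

def parse_duration(value):
    """Return a dict of numeric parts, or None if value doesn't match."""
    m = NAMED.match(value)
    if m is None:
        return None
    # Groups that did not participate in the match are None; treat as 0.
    return {name: float(text) if text is not None else 0.0
            for name, text in m.groupdict().items()}

parts = parse_duration('PT1M30.5S')
```

Here parts['minutes'] is 1.0 and parts['seconds'] is 30.5, with every other field at 0.0; an input like 'PTS' returns None.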
