How do you match only valid roman numerals with a regular expression?

無奈伤痛 2020-11-22 02:44

Thinking about my other problem, I decided I can't even create a regular expression that will match Roman numerals (let alone a context-free grammar that will generate them).

16 answers
  • 2020-11-22 03:24

    This works in the Java and PCRE regex engines, and in JavaScript engines that support lookbehind assertions (ES2018 and later), but it may not work everywhere.

    (?<![A-Z])(M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?![A-Z])

    The first part is the negative lookbehind, which looks atrocious but is the easiest piece to reason about. The leading (?<![A-Z]) says: don't match the middle ([MATCH]) if an uppercase letter comes immediately before it. The trailing (?![A-Z]) says: don't match the middle ([MATCH]) if an uppercase letter comes immediately after it.

    The middle ([MATCH]) is just the most commonly used regex for matching a sequence of Roman numerals; the lookarounds keep it from matching when letters surround it.

    See for yourself. https://regexr.com/4vce5
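
The lookaround behaviour can also be tried locally. Below is a minimal Python sketch, assuming Python's `re` module as the engine; `ROMAN_IN_TEXT` and `find_numerals` are illustrative names that do not come from the answer. Because every part of the pattern body is optional, the pattern also matches the empty string at many positions, so the sketch filters those matches out.

```python
import re

# Pattern from the answer above: a Roman-numeral body guarded by
# lookarounds so it is not matched next to other uppercase letters.
ROMAN_IN_TEXT = re.compile(
    r"(?<![A-Z])(M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3}))(?![A-Z])"
)


def find_numerals(text: str) -> list[str]:
    """Return the non-empty Roman-numeral matches found in text."""
    # Every part of the body is optional, so the pattern also produces
    # empty matches; keep only the matches that consumed characters.
    return [m.group(1) for m in ROMAN_IN_TEXT.finditer(text) if m.group(1)]


print(find_numerals("see page XIV and year MMXX, not MIXED"))
```

In this sketch the lookarounds only test for uppercase neighbours, so a capitalised word such as "Chapter" still yields a match on its leading "C"; a stricter word-boundary check would be needed to avoid that.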

  • 2020-11-22 03:24

    The problem with the solutions from Jeremy and Pax is that they also match "nothing" (the empty string).

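    The following regex expects at least one Roman numeral; the lookahead (?=[MDCLXVI]) rejects empty input, which an alternation alone would not, because every part of the pattern is optional:

    ^(?=[MDCLXVI])M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$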
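
The empty-match problem, and one possible guard against it, can be checked with a short Python sketch. The names `BODY`, `UNGUARDED` and `GUARDED` are illustrative, and the lookahead is an assumption about how to require at least one numeral character, not necessarily what the other answers use.

```python
import re

# The numeral body used in this thread; every part of it is optional.
BODY = r"M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"

# Anchored but unguarded: the empty string satisfies every optional part.
UNGUARDED = re.compile("^" + BODY + "$")

# Assumption: a lookahead that demands one numeral letter at the start.
GUARDED = re.compile("^(?=[MDCLXVI])" + BODY + "$")

for candidate in ["", "MCMXCIV", "IIII", "XM"]:
    print(repr(candidate),
          bool(UNGUARDED.match(candidate)),
          bool(GUARDED.match(candidate)))
```

Only the empty string is treated differently by the two patterns; malformed inputs such as "IIII" and "XM" are rejected by both.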
    
  • 2020-11-22 03:33

    Fortunately, the range of numbers is limited to 1..3999 or thereabouts. Therefore, you can build up the regex piecemeal, one decimal place at a time.

    <opt-thousands-part><opt-hundreds-part><opt-tens-part><opt-units-part>
    

    Each of those parts will deal with the vagaries of Roman notation. For example, using Perl notation:

    <opt-hundreds-part> = m/(CM|DC{0,3}|CD|C{1,3})?/;
    

    Repeat and assemble.

    Added: The <opt-hundreds-part> can be compressed further:

    <opt-hundreds-part> = m/(C[MD]|D?C{0,3})/;
    

    Since the 'D?C{0,3}' clause can match nothing, there's no need for the question mark. And, most likely, the parentheses should be the non-capturing type - in Perl:

    <opt-hundreds-part> = m/(?:C[MD]|D?C{0,3})/;
    

    Of course, it should all be case-insensitive, too.
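
The "repeat and assemble" step might look like the following Python sketch. The tens and units parts are written by analogy with the hundreds part; the `M{0,3}` thousands part and the names `THOUSANDS`, `HUNDREDS`, `TENS`, `UNITS`, `ROMAN` and `is_roman` are assumptions made for illustration.

```python
import re

# Each optional part as a non-capturing group, following the hundreds
# pattern given above. The thousands part assumes an upper limit of 3999.
THOUSANDS = r"M{0,3}"
HUNDREDS = r"(?:C[MD]|D?C{0,3})"
TENS = r"(?:X[CL]|L?X{0,3})"    # same shape, one decimal place lower
UNITS = r"(?:I[XV]|V?I{0,3})"   # same shape again

# Case-insensitive, as the answer suggests.
ROMAN = re.compile(THOUSANDS + HUNDREDS + TENS + UNITS, re.IGNORECASE)


def is_roman(text: str) -> bool:
    """True if text is a non-empty numeral in the assumed range 1..3999."""
    # fullmatch anchors both ends; the truthiness test rejects "".
    return bool(text) and ROMAN.fullmatch(text) is not None


print(is_roman("MCMXCIV"), is_roman("mmxiv"), is_roman("IC"), is_roman(""))
```

Checking for the empty string in code rather than in the pattern keeps each part simple, at the cost of a check that lives outside the regex.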

    You can also extend this to deal with the options mentioned by James Curran (to allow XM or IM for 990 or 999, and CCCC for 400, etc).

    <opt-hundreds-part> = m/(?:[IXC][MD]|D?C{0,4})/;
    
  • 2020-11-22 03:34

    I would write functions to do the work for me. Here are two Roman numeral functions in PowerShell.

    function ConvertFrom-RomanNumeral
    {
      <#
        .SYNOPSIS
            Converts a Roman numeral to a number.
        .DESCRIPTION
            Converts a Roman numeral - in the range of I..MMMCMXCIX - to a number.
        .EXAMPLE
            ConvertFrom-RomanNumeral -Numeral MMXIV
        .EXAMPLE
            "MMXIV" | ConvertFrom-RomanNumeral
      #>
        [CmdletBinding()]
        [OutputType([int])]
        Param
        (
            [Parameter(Mandatory=$true,
                       HelpMessage="Enter a roman numeral in the range I..MMMCMXCIX",
                       ValueFromPipeline=$true,
                       Position=0)]
            [ValidatePattern("^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$")]
            [string]
            $Numeral
        )
    
        Begin
        {
            $RomanToDecimal = [ordered]@{
                M  = 1000
                CM =  900
                D  =  500
                CD =  400
                C  =  100
                XC =   90
                L  =   50
                XL =   40
                X  =   10
                IX =    9
                V  =    5
                IV =    4
                I  =    1
            }
        }
        Process
        {
            $roman = $Numeral + " "
            $value = 0
    
            do
            {
                foreach ($key in $RomanToDecimal.Keys)
                {
                    if ($key.Length -eq 1)
                    {
                        if ($key -match $roman.Substring(0,1))
                        {
                            $value += $RomanToDecimal.$key
                            $roman  = $roman.Substring(1)
                            break
                        }
                    }
                    else
                    {
                        if ($key -match $roman.Substring(0,2))
                        {
                            $value += $RomanToDecimal.$key
                            $roman  = $roman.Substring(2)
                            break
                        }
                    }
                }
            }
            until ($roman -eq " ")
    
            $value
        }
        End
        {
        }
    }
    
    function ConvertTo-RomanNumeral
    {
      <#
        .SYNOPSIS
            Converts a number to a Roman numeral.
        .DESCRIPTION
            Converts a number - in the range of 1 to 3,999 - to a Roman numeral.
        .EXAMPLE
            ConvertTo-RomanNumeral -Number (Get-Date).Year
        .EXAMPLE
            (Get-Date).Year | ConvertTo-RomanNumeral
      #>
        [CmdletBinding()]
        [OutputType([string])]
        Param
        (
            [Parameter(Mandatory=$true,
                       HelpMessage="Enter an integer in the range 1 to 3,999",
                       ValueFromPipeline=$true,
                       Position=0)]
            [ValidateRange(1,3999)]
            [int]
            $Number
        )
    
        Begin
        {
            $DecimalToRoman = @{
                Ones      = "","I","II","III","IV","V","VI","VII","VIII","IX";
                Tens      = "","X","XX","XXX","XL","L","LX","LXX","LXXX","XC";
                Hundreds  = "","C","CC","CCC","CD","D","DC","DCC","DCCC","CM";
                Thousands = "","M","MM","MMM"
            }
    
            $column = @{Thousands = 0; Hundreds = 1; Tens = 2; Ones = 3}
        }
        Process
        {
            [int[]]$digits = $Number.ToString().PadLeft(4,"0").ToCharArray() |
                                ForEach-Object { [Char]::GetNumericValue($_) }
    
            $RomanNumeral  = ""
            $RomanNumeral += $DecimalToRoman.Thousands[$digits[$column.Thousands]]
            $RomanNumeral += $DecimalToRoman.Hundreds[$digits[$column.Hundreds]]
            $RomanNumeral += $DecimalToRoman.Tens[$digits[$column.Tens]]
            $RomanNumeral += $DecimalToRoman.Ones[$digits[$column.Ones]]
    
            $RomanNumeral
        }
        End
        {
        }
    }
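
For readers without PowerShell, the same validate-then-convert idea can be sketched in Python. This is not a port of the functions above: the names `VALID`, `PAIRS`, `to_roman` and `from_roman` are illustrative, and the greedy scan over value/symbol pairs is one possible approach, reusing the validation pattern from the `ValidatePattern` attribute.

```python
import re

# Same validation pattern as the ValidatePattern attribute above.
VALID = re.compile(r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$")

# Value/symbol pairs, largest first; subtractive pairs are listed explicitly.
PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer in 1..3999 to a Roman numeral."""
    if not 1 <= number <= 3999:
        raise ValueError("number must be in the range 1..3999")
    parts = []
    for value, symbol in PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(numeral: str) -> int:
    """Convert a validated Roman numeral (I..MMMCMXCIX) to an integer."""
    if not numeral or VALID.fullmatch(numeral) is None:
        raise ValueError(f"not a valid Roman numeral: {numeral!r}")
    total, pos = 0, 0
    # Validation guarantees descending order, so one greedy pass suffices.
    for value, symbol in PAIRS:
        while numeral.startswith(symbol, pos):
            total += value
            pos += len(symbol)
    return total


print(to_roman(2014), from_roman("MCMXCIV"))
```

Validating first means the conversion loop never has to handle malformed input such as "IC" or "XM".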
    