Regex Problem Group Name Redefinition?

情书的邮戳 2021-02-05 17:32

So I have this regex:

(^(\\s+)?(?P<NAME>(\\w)(\\d{7}))((01f\\.foo)|(\\.bar|\\.goo\\.moo\\.roo))$|(^(\\s+)?(?P<NAME>R1_\\d{6}_\\d{6}_)((01f\\.foo)|(\


        
3 Answers
  • 2021-02-05 17:56

    Reusing the same name makes sense in your case, contrary to Tamalak's reply.

    Your regex compiles with Python 2.7 and also with re2, so this problem may already have been resolved.

  • 2021-02-05 18:04

    This answer covers how to make the above regex work in Python 3.

    The re2 module suggested by Max does not work in Python 3; it fails with NameError: basestring. An alternative is the regex module.

    The regex module is an enhanced version of re with extra features. It also allows the same group name to be used more than once in a pattern.

    You can install it via:

    sudo pip install regex
    

    If your program already uses re or re2, you can switch to the regex module by changing the import:

    import regex as re
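
    As a minimal sketch of what the regex module permits, the snippet below uses a simplified two-branch pattern in which both branches capture into NAME. The pattern, the helper name extract_name, and the sample inputs are illustrative assumptions, not the asker's full regex, and the third-party regex package must be installed for it to do anything useful.

```python
# Assumes the third-party `regex` package is installed (pip install regex).
try:
    import regex
except ImportError:  # keep the sketch importable when the package is missing
    regex = None

# Simplified, hypothetical pattern: both alternatives reuse the group name NAME.
PATTERN = (
    r"^\s*(?:(?P<NAME>\w\d{7})|(?P<NAME>R1_\d{6}_\d{6}_))"
    r"(?:01f\.foo|\.bar)$"
)


def extract_name(text):
    """Return the NAME capture, or None if there is no match
    (or if the regex package is not available)."""
    if regex is None:
        return None
    match = regex.match(PATTERN, text)
    return match.group("NAME") if match else None
```

    Whichever branch matches, the captured text is read back through the single name NAME.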
    
  • 2021-02-05 18:07

    No, you can't have two groups with the same name. That would defeat the purpose of naming them, wouldn't it?
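
    A quick way to check this against the standard library re module on Python 3 is to try compiling a pattern that defines the same name twice. The helper name compiles and the two tiny patterns are made up for illustration.

```python
import re


def compiles(pattern):
    """Return True if the standard-library re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Same group name in two alternatives: rejected by the stdlib re module.
duplicate = r"(?P<NAME>a)|(?P<NAME>b)"

# One named group wrapping the alternation: accepted.
single = r"(?P<NAME>a|b)"
```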

    What you probably really want is this:

    ^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$
    

    I refactored your regex as far as possible. I made the following assumptions:

    You want to (correct me if I'm wrong):

    • ignore white space at the start of the string
    • match either of the following into a group named "NAME":
      • a letter followed by 7 digits, or
      • "R1_", and two times (6 digits + "_")
    • followed by either:
      • "01f.foo" or
      • "." and ("bar" or "goo" or "moo" or "roo")
    • followed by the end of the string
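
    The first reading can be tried out with the standard re module as follows. The helper name parse and the sample strings are assumptions for illustration. Note that \w also accepts digits and the underscore, not only letters.

```python
import re

# First refactored pattern, copied from the answer above.
PATTERN = re.compile(
    r"^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$"
)


def parse(text):
    """Return (NAME, suffix) if the text matches, otherwise None."""
    match = PATTERN.match(text)
    if match is None:
        return None
    # Group 2 is the only other capturing group: the suffix alternative.
    return match.group("NAME"), match.group(2)
```

    With this reading, "01f" belongs to the suffix, so an input such as "A123456701f.bar" is rejected.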

    You could also have meant:

    ^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$
    

    Which is:

    • ignore white space at the start of the string
    • match either of the following into a group named "NAME":
      • a letter followed by 7 digits and "01f"
      • "R1_", and two times (6 digits + "_")
    • a dot
    • "foo", "bar", "goo", "moo" or "roo"
    • the end of the string
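
    The second reading can be sketched the same way. The helper name name_of and the sample strings are again illustrative assumptions.

```python
import re

# Second refactored pattern, copied from the answer above.
PATTERN_ALT = re.compile(
    r"^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$"
)


def name_of(text):
    """Return the NAME capture if the text matches, otherwise None."""
    match = PATTERN_ALT.match(text)
    return match.group("NAME") if match else None
```

    Here "01f" is part of NAME for the first alternative, so "A123456701f.bar" matches, while an "R1_" name must be followed directly by the dot.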