python version 3.4 does not support a 'ur' prefix

前端 未结 3 608
孤街浪徒
孤街浪徒 2021-02-07 09:26

I have some python code writen in an older version of python(2.x) and I struggle to make it work. I\'m using python 3.4

_eng_word = ur\"[a-zA-Z][a-zA-Z0-9\'.]*\         


        
相关标签:
3条回答
  • 2021-02-07 09:47

    This table compares (some of) the different string literal prefixes in Python 2(.7) and 3(.4+):

    As you can see, in Python 3 there's no way to have a literal that doesn't process escapes, but does process unicode literals. To get such a string with code that works in both Python 2 and 3, use:

    br"[a-zA-Z][a-zA-Z0-9'.]*".decode('raw_unicode_escape')
    

    Actually, your example is not very good, since it doesn't have any unicode literals, or escape sequences. A better example would be:

    br"[\u03b1-\u03c9\u0391-\u03a9][\t'.]*".decode('raw_unicode_escape')
    

    In python 2:

    >>> br"[\u03b1-\u03c9\u0391-\u03a9][\t'.]*".decode('raw_unicode_escape')
    u"[\u03b1-\u03c9\u0391-\u03a9][\\t'.]*"
    

    In Python 3:

    >>> br"[\u03b1-\u03c9\u0391-\u03a9][\t'.]*".decode('raw_unicode_escape')
    "[α-ωΑ-Ω][\\t'.]*"
    

    Which is really the same thing.

    0 讨论(0)
  • 2021-02-07 09:50

    Indeed, Python 3.4 only supports u'...' (to support code that needs to run on both Python 2 and 3) and r'....', but not both. That's because the semantics of how ur'..' works in Python 2 are different from how ur'..' would work in Python 3 (in Python 2, \uhhhh and \Uhhhhhhhh escapes still are processed, in Python 3 a `r'...' string would not).

    Note that in this specific case there is no difference between the raw string literal and the regular! You can just use:

    _eng_word = u"[a-zA-Z][a-zA-Z0-9'.]*"
    

    and it'll work in both Python 2 and 3.

    For cases where a raw string literal does matter, you could decode the raw string from raw_unicode_escape on Python 2, catching the AttributeError on Python 3:

    _eng_word = r"[a-zA-Z][a-zA-Z0-9'.]*"
    try:
        # Python 2
        _eng_word = _eng_word.decode('raw_unicode_escape')
    except AttributeError:
        # Python 3
        pass
    

    If you are writing Python 3 code only (so it doesn't have to run on Python 2 anymore), just drop the u entirely:

    _eng_word = r"[a-zA-Z][a-zA-Z0-9'.]*"
    
    0 讨论(0)
  • 2021-02-07 10:08

    http://bugs.python.org/issue15096

    Title: Drop support for the "ur" string prefix
    When PEP 414 restored support for explicit Unicode literals in Python 3, the "ur" string prefix was deemed to be a synonym for the "r" prefix.

    So, use 'r' instead of 'ur'

    0 讨论(0)
提交回复
热议问题