Can the [a-zA-Z] Python regex pattern be made to match and replace non-ASCII Unicode characters?

后端 未结 1 550
面向向阳花
面向向阳花 2020-12-21 00:42

In the following regular expression, I would like each character in the string replaced with an \'X\', but it isn\'t working.

In Python 2.7:

>>         


        
1条回答
  •  生来不讨喜
    2020-12-21 01:15

    You may use

    (?![\d_])\w
    

    With the Unicode modifier. The (?![\d_]) look-ahead is restricting the \w shorthand class so as it could not match any digits (\d) or underscores.

    See regex demo

    A Python 3 demo:

    import re
    print (re.sub(r"(?![\d_])\w","X","dfäg"))
    # => XXXX
    

    As for Python 2:

    # -*- coding: utf-8 -*-
    import re
    s = "dfäg"
    w = re.sub(ur'(?![\d_])\w', u'X', s.decode('utf8'), 0, re.UNICODE).encode("utf8")
    print(w)
    

    0 讨论(0)
提交回复
热议问题