Python regex replacing \u2022

死守一世寂寞 2021-01-26 09:53

This is my string:

raw_list = u'Software Engineer with a huge passion for new and innovative products. Experienced gained from working in both big and fast-grow


        
4 Answers
  •  梦毁少年i
    2021-01-26 10:20

    The key is to put the unicode prefix u in front of the pattern that contains the character you're trying to find, in this case \u2022, the Unicode code point for a bullet. In Python 2, text that contains unicode characters is a unicode object rather than a plain (byte) string; you can confirm this by checking type(text), or by printing repr(text) and looking for the u at the beginning. The example below searches for a unicode bullet with regular expressions (RegEx) on both a plain string and unicode text:

    Import the regular expressions package:
    import re

    Unicode text:
    my_unicode = u"""\u2022 Here's a string of data.\n<br>\u2022 There are new line characters \n, HTML line break tags <br>, and bullets \u2022 together in a sequence.\n<br>\u2022 Our goal is to use RegEx to identify the sequences."""
    type(my_unicode)  # unicode

    String:
    my_string = """\u2022 Here's a string of data.\n<br>\u2022 There are new line characters \n, HTML line break tags <br>, and bullets \u2022 together in a sequence.\n<br>\u2022 Our goal is to use RegEx to identify the sequences."""
    type(my_string)  # str

    We successfully find the first piece of text that we're looking for, which doesn't yet contain the unicode character:
    re.findall('\n<br>', my_unicode)
    re.findall('\n<br>', my_string)

    With the addition of the unicode character, neither substring can be found:
    re.findall('\n<br>\u2022', my_unicode)
    re.findall('\n<br>\u2022', my_string)

    Adding four backslashes works for the string, because a Python 2 byte string stores \u2022 as six literal characters, but it does not work for the unicode text:
    re.findall('\n<br>\\\\u', my_unicode)
    re.findall('\n<br>\\\\u', my_string)

    Solution: include the unicode prefix u in front of the unicode character:
    re.findall('\n<br>' u'\u2022', my_unicode)
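Since the question title asks about replacing the bullet, the same idea should carry over from re.findall to re.sub. The following is only a minimal sketch under assumptions: the sample text and the names text, bullet, cleaned, items, escaped_text, escaped_bullet and escaped_cleaned are hypothetical and do not come from the asker's data. The u prefixes are written so that the snippet behaves the same way whether plain string literals are bytes or unicode.

```python
import re

# Hypothetical sample text (not the asker's data); each line starts with a real bullet.
text = u"\u2022 First item\n\u2022 Second item\n\u2022 Third item"

# The u prefix makes \u2022 a single bullet character in the pattern.
# "\\s*" in the literal becomes the regex \s*, which consumes the space after the bullet.
bullet = re.compile(u"\u2022\\s*")
cleaned = bullet.sub(u"", text)
items = cleaned.split(u"\n")

# If the text instead holds the six literal characters \u2022 (the byte-string
# case in the answer), the backslash itself has to be matched explicitly.
escaped_text = u"\\u2022 First item"
escaped_bullet = re.compile(u"\\\\u2022\\s*")
escaped_cleaned = escaped_bullet.sub(u"", escaped_text)

print(items)
print(escaped_cleaned)
```

Which of the two patterns applies depends on whether the data contains the bullet character itself or its escaped spelling, so it may be worth checking len() or repr() of a short sample first.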
