Using unicode (Hebrew characters) with regular expression

佛祖请我去吃肉 2021-01-20 23:55

I wrote a script that finds expressions in a web page:

import sre, urllib2, sys, BaseHTTPServer
# -*- coding: utf-8 -*-    
address = sys.argv[1]
web_handle = urllib2         


        
1 Answer
  • 2021-01-21 00:16

    You need to make sure the input string is also decoded from UTF-8 into a unicode object, not left as a raw byte string.

    Use the unicode function with "utf-8" as the second argument:

    website_text = unicode(website_text, "utf-8")
    

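    In Python 2, the pattern and the text being searched should both be unicode objects (or both byte strings in the same encoding); otherwise non-ASCII characters such as Hebrew letters will not match reliably.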
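
A minimal sketch of the idea, not the asker's script: the sample HTML, the names raw_page and hebrew_word, and the Hebrew character range are assumptions made for illustration. It uses raw_page.decode("utf-8"), which in Python 2 gives the same result as unicode(raw_page, "utf-8") and also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical stand-in for the UTF-8 bytes a web request would return.
raw_page = u"<p>שלום עולם</p> <p>hello</p>".encode("utf-8")

# Decode the bytes before matching; in Python 2 this is the same as
# unicode(raw_page, "utf-8").
website_text = raw_page.decode("utf-8")

# Unicode pattern (assumed for illustration): one or more Hebrew letters,
# alef (U+05D0) through tav (U+05EA).
hebrew_word = re.compile(u"[\u05d0-\u05ea]+", re.UNICODE)

matches = hebrew_word.findall(website_text)
print(len(matches))
```

Because both the pattern and website_text are unicode here, the character class compares code points rather than individual UTF-8 bytes, so only the Hebrew words are picked up and the ASCII text is skipped.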
