Question
This is my code:
name = namestr.decode("utf-8")
name.replace(u"\u2018", "").replace(u"\u2019", "").replace(u"\u201c","").replace(u"\u201d", "")
This doesn't seem to work. I still find &ldquo, &rdquo, etc. in my text. The text was also parsed using Beautiful Soup.
Answer 1:
Replace the last line of your code with this one:
name = name.replace(u"\u2018", "").replace(u"\u2019", "").replace(u"\u201c","").replace(u"\u201d", "")
The replace method returns a modified copy of the string; it does not change the string you call it on, because Python strings are immutable. You therefore have to assign the return value back to the variable, as above.
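To make the point about the return value concrete, here is a minimal sketch. It assumes Python 3 (the question's code looks like Python 2), and the helper name strip_smart_quotes and the sample string are made up for illustration. It also assumes the leftover text may contain HTML entities such as &ldquo; written with a trailing semicolon, which html.unescape turns into real characters before they are removed; that step is a guess at what the question describes, not something the answer confirms.

```python
import html

# Map each smart-quote code point to None so str.translate deletes it.
SMART_QUOTES = dict.fromkeys(map(ord, "\u2018\u2019\u201c\u201d"))

def strip_smart_quotes(text):
    """Return a copy of text with curly quotes removed (hypothetical helper)."""
    # Decode entities such as &ldquo; into real characters first (assumption).
    return html.unescape(text).translate(SMART_QUOTES)

name = "\u201cHello\u201d &ldquo;world&rdquo;"

name.replace("\u201c", "")          # return value discarded: name is unchanged
cleaned = strip_smart_quotes(name)  # return value kept in a new variable

print(cleaned)
```

Using a single translate call instead of four chained replace calls is only a design choice here; the chained version in the answer works just as well once its result is assigned.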
Source: https://stackoverflow.com/questions/15751636/how-do-i-get-rid-of-all-the-smart-quotes-while-parsing-a-web-page