How do I get rid of all the smart quotes while parsing a web page?

Submitted by 喜你入骨 on 2019-12-24 02:23:58

Question


This is my code:

name = namestr.decode("utf-8")

name.replace(u"\u2018", "").replace(u"\u2019", "").replace(u"\u201c","").replace(u"\u201d", "")

This doesn't seem to work. I still find &ldquo;, &rdquo;, etc. in my text. The text has also been parsed using Beautiful Soup.


Answer 1:


Replace the last line of your code with this one:

name = name.replace(u"\u2018", "").replace(u"\u2019", "").replace(u"\u201c","").replace(u"\u201d", "")

The replace method returns a modified string, but it does not change the string you call it on (Python strings are immutable), so you have to assign the return value back to the variable, as above.
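As a rough sketch of the same idea in one place: the snippet below assumes Python 3 (the code above is written in Python 2 style) and uses a hypothetical helper, strip_smart_quotes, that is not part of the original post. It also decodes HTML entities with html.unescape before removing the quotes, on the assumption that the leftover &ldquo; and &rdquo; in the question are still literal entities in the text; if Beautiful Soup has already decoded them, that step simply does nothing.

```python
import html

# Map each smart-quote code point to None so str.translate deletes it.
SMART_QUOTES = dict.fromkeys(map(ord, u"\u2018\u2019\u201c\u201d"))


def strip_smart_quotes(text):
    """Return a copy of text with curly quotes removed (hypothetical helper)."""
    # Turn entities such as &ldquo; into real characters first (assumption:
    # the input may still contain undecoded HTML entities).
    decoded = html.unescape(text)
    # translate, like replace, returns a new string and leaves the input alone.
    return decoded.translate(SMART_QUOTES)


name = u"\u201cHello\u201d &ldquo;world&rdquo;"
name = strip_smart_quotes(name)  # reassign, because strings are immutable
print(name)
```

The key point is still the reassignment on the second-to-last line: calling the helper without storing its result would leave name unchanged, just like the original chained replace call.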



Source: https://stackoverflow.com/questions/15751636/how-do-i-get-rid-of-all-the-smart-quotes-while-parsing-a-web-page
