Decode HTML entities in Python string?

名媛妹妹 2020-11-21 06:18

I'm parsing some HTML with Beautiful Soup 3, but it contains HTML entities which Beautiful Soup 3 doesn't automatically decode for me:

>>> from Be         


        
6 answers
  •  被撕碎了的回忆
    2020-11-21 06:40

    You can use replace_entities from the w3lib.html module:

    In [202]: from w3lib.html import replace_entities
    
    In [203]: replace_entities("&pound;682m")
    Out[203]: u'\xa3682m'
    
    In [204]: print replace_entities("&pound;682m")
    £682m
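
If adding a third-party dependency is not an option, the standard library's html.unescape (available since Python 3.4) performs the same kind of decoding. The following is only a minimal sketch under that assumption; decode_entities is an illustrative wrapper name, not part of w3lib or Beautiful Soup, and the session above is Python 2, where this function is not available.

```python
import html


def decode_entities(text: str) -> str:
    """Decode named and numeric HTML entities with the standard library."""
    # html.unescape handles named (&pound;), decimal (&#163;) and
    # hexadecimal (&#xa3;) character references.
    return html.unescape(text)


if __name__ == "__main__":
    print(decode_entities("&pound;682m"))
```

Note that the decoding is applied once, so a doubly escaped string such as "&amp;pound;" comes back as "&pound;" rather than as the pound sign.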
    
