I'm parsing some HTML with Beautiful Soup 3, but it contains HTML entities which Beautiful Soup 3 doesn't automatically decode for me:
>>> from Be
Beautiful Soup handles entity conversion. In Beautiful Soup 3, you'll need to specify the convertEntities argument to the BeautifulSoup constructor (see the 'Entity Conversion' section of the archived docs). In Beautiful Soup 4, entities get decoded automatically.
>>> from BeautifulSoup import BeautifulSoup
>>> BeautifulSoup("&pound;682m",
...               convertEntities=BeautifulSoup.HTML_ENTITIES)
£682m
>>> from bs4 import BeautifulSoup
>>> BeautifulSoup("&pound;682m", "html.parser")
£682m
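
If you only need to decode entities in a plain string and don't need a parse tree, a minimal sketch using the standard library's html.unescape (Python 3.4+) might look like the following. This is an alternative to Beautiful Soup, not part of it; decode_entities is a hypothetical helper name and the sample strings are made up for illustration.

```python
import html


def decode_entities(text):
    """Decode named and numeric HTML character references in a string.

    Hypothetical helper: a thin wrapper around the standard library's
    html.unescape, which does no HTML parsing and leaves tags untouched.
    """
    return html.unescape(text)


# Illustrative inputs: a named entity, a decimal and a hex reference,
# and a string mixing two named entities.
samples = ["&pound;682m", "&#163;682m", "&#xA3;682m", "caf&eacute; &amp; bar"]
for sample in samples:
    print(decode_entities(sample))
```

Because this works on the raw string, any markup around the entities is returned unchanged, so it suits cases where the text has already been extracted from the document.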