I'm trying to parse, manipulate, and output HTML using Python's ElementTree:
import sys
from cStringIO import StringIO
from xml.etree import ElementTree as E
XML only defines &lt;, &gt;, &apos;, &quot; and &amp;. &nbsp; and others come from HTML. So you have a couple of choices. The simplest is to use a numeric character reference in the source instead, &#160; or &#xA0;, both of which are equivalent to &nbsp;.
There is some useful information (it is written about XSLT, but XSLT is written using XML, so the same applies) at the XSLT FAQ.
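To illustrate the point about entities, here is a minimal sketch, assuming Python 3 and a made-up one-line document (the `markup` string is hypothetical, not the asker's input). It shows the parser rejecting the HTML-only named entity and accepting the numeric character reference. The plain string replacement is only for demonstration; it is not a general HTML-to-XML conversion.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample input using an HTML-only named entity.
markup = "<p>a&nbsp;b</p>"

# The XML parser has no definition for &nbsp;, so parsing fails.
try:
    ET.fromstring(markup)
    parsed_ok = True
except ET.ParseError:
    parsed_ok = False

# A numeric character reference needs no entity definition.
fixed = markup.replace("&nbsp;", "&#160;")
root = ET.fromstring(fixed)
text = root.text  # contains U+00A0 (no-break space) between 'a' and 'b'
```

After this runs, `parsed_ok` is false for the original markup, while the rewritten markup parses and yields a real no-break space character in the element text.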
The question now appears to include a stack trace; that changes things. Are you sure that the string is in UTF-8? If it resolves to the single byte 0xA0, then it isn't UTF-8 but more likely cp1252 or iso-8859-1.
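One way to check the encoding guess is to decode the raw bytes yourself. The following is a rough sketch, assuming Python 3; the byte string `raw` is an invented example, not data from the question. A lone 0xA0 byte is not valid UTF-8, whereas both cp1252 and iso-8859-1 map it to U+00A0.

```python
# Hypothetical sample: 'a', a single 0xA0 byte, then 'b'.
raw = b"a\xa0b"

# In UTF-8, 0xA0 can only appear as a continuation byte, so this fails.
try:
    raw.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False

# Both single-byte encodings map 0xA0 to the no-break space U+00A0.
as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

# For comparison, a no-break space encoded as UTF-8 takes two bytes.
utf8_bytes = "\u00a0".encode("utf-8")
```

If the input really were UTF-8, the no-break space would appear as the two bytes 0xC2 0xA0 rather than as 0xA0 alone.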