I need to convert a web page to XML (using Python 3.4.3). If I write the contents of the URL to a file then I can read and parse it perfectly but if I try to re
As explained in the Parsing XML section of the ElementTree docs:
We can import this data by reading from a file:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
Or directly from a string:
root = ET.fromstring(country_data_as_string)
You're passing the whole XML contents to parse, which treats a string argument as a filename, so your document is being used as a giant pathname. Your XML is probably bigger than 2K, or whatever the maximum pathname size is for your platform, hence the error. If it weren't, you'd just get a different error about there being no directory named [everything up to the first / in your XML file].
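To see the failure mode in isolation, here is a small sketch that hands XML text straight to parse. The sample document and the variable names are made up for illustration; the exact exception subclass and message depend on your platform, so only the common base class OSError is caught.

```python
import xml.etree.ElementTree as ET

# Made-up sample document standing in for the downloaded page.
xml_text = "<data><country name='Panama'/></data>"

try:
    # parse() sees a str with no read() method and tries to open it as a file.
    ET.parse(xml_text)
    failed_with = None
except OSError as exc:
    # The XML was used as a pathname, so opening it fails.
    failed_with = type(exc).__name__

print(failed_with)
```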
Just use fromstring instead of parse.
Or, notice that parse can take a file object, not just a filename. And the thing returned by urlopen is a file object.
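A minimal sketch of that file-object route, assuming the URL serves well-formed XML. The helper name root_from_url is hypothetical, not something from the question or from any library.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def root_from_url(url):
    """Parse the XML served at `url` and return its root Element.

    Hypothetical helper for illustration only.
    """
    # urlopen() returns a file-like response object; parse() accepts
    # either a filename or an object with a read() method.
    with urlopen(url) as response:
        tree = ET.parse(response)
    # parse() returns an ElementTree, so getroot() is still needed here.
    return tree.getroot()
```

This skips the intermediate string entirely, so there is nothing to read() or close() by hand.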
Also notice the very next line in that section:
fromstring() parses XML from a string directly into an Element, which is the root element of the parsed tree. Other parsing functions may create an ElementTree.
So, you don't want that root = tree.getroot() either.
So:
# ...
content.close()
root = ElementTree.fromstring(xmlData)
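The difference in return types can be checked without touching the network. This is only a sketch: xml_data below is a made-up literal standing in for the downloaded text, and io.StringIO stands in for a real file object.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample text standing in for what was read from the URL.
xml_data = "<data><country name='Panama'/></data>"

# fromstring() hands back the root Element directly.
root = ET.fromstring(xml_data)

# parse() wants a filename or file object and hands back an ElementTree.
tree = ET.parse(io.StringIO(xml_data))

print(type(root).__name__, type(tree).__name__)
print(root.tag == tree.getroot().tag)
```

Since root is already an Element, calling getroot() on it would fail; only the ElementTree returned by parse has that method.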