I want to retrieve a legacy XML file, manipulate it, and save it.
Here is my code:
from xml.etree import cElementTree as ET
NS = "{http://www.somedomain
Have a look at the lxml tutorial section on namespaces. Also this article about namespaces in ElementTree.
Problem 1: Put up with it, like everybody else does. Instead of "%(ns)Event" % {'ns': NS}, try NS + "Event".
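A minimal sketch of the concatenation approach, using the stdlib xml.etree.ElementTree. The namespace URI and the Events/Event document are made up for illustration, since the question's own URI is cut off:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the URI in the question is truncated.
NS = "{http://example.com/ns}"

doc = ET.fromstring(
    '<Events xmlns="http://example.com/ns">'
    '<Event id="1"/><Event id="2"/>'
    '</Events>'
)

# ElementTree stores tags in "{uri}local" form, so plain string
# concatenation builds the qualified name that findall() matches on.
events = doc.findall(NS + "Event")
event_ids = [e.get("id") for e in events]
```

Here event_ids ends up holding the id attributes of the two direct Event children.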
Problem 2: By default, the XML declaration is written only if it is required. You can force it by passing xml_declaration=True in your write() call (supported by lxml, and by the stdlib xml.etree since Python 2.7).
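A small sketch of forcing the declaration, assuming a Python whose stdlib ElementTree.write() accepts xml_declaration (2.7 or later). The Events/Event tree is a made-up placeholder, and an in-memory buffer stands in for the output file:

```python
import io
import xml.etree.ElementTree as ET

# Placeholder tree; any parsed document works the same way.
root = ET.Element("Events")
ET.SubElement(root, "Event")
tree = ET.ElementTree(root)

# Default write(): no declaration, because the default encoding
# (us-ascii) does not require one.
plain_buf = io.BytesIO()
tree.write(plain_buf)
plain = plain_buf.getvalue()

# Forced declaration with an explicit encoding.
buf = io.BytesIO()
tree.write(buf, encoding="utf-8", xml_declaration=True)
xml_bytes = buf.getvalue()
```

With lxml.etree the write() call takes the same two keyword arguments.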
Problem 3: The nsmap arg appears to be lxml-only. AFAICT it needs a MAPping, not a string. Try nsmap={None: NS}, where the value is the bare namespace URI (without the braces used in tag names). The effbot article has a section describing a workaround for this.
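For the stdlib, one possible way to get a default namespace on output is ET.register_namespace() with an empty prefix. This is my own sketch, not necessarily the workaround the effbot article describes, and the URI is a made-up placeholder. Note that the registration is global to the process:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, for illustration only.
NS_URI = "http://example.com/ns"
NS = "{%s}" % NS_URI  # "{uri}" form used in tag names

# Map the URI to the empty prefix so the serializer emits
# xmlns="..." instead of an auto-generated ns0: prefix.
# This changes module-level state for all later serialization.
ET.register_namespace("", NS_URI)

root = ET.Element(NS + "Events")
ET.SubElement(root, NS + "Event")

xml_text = ET.tostring(root, encoding="unicode")
```

The lxml counterpart would be the nsmap={None: NS_URI} argument when creating the root element.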
To answer your questions in order:
You can't just ignore the namespace: not in the path syntax that .findall() uses, and not in "real" XPath (supported by lxml) either. There you'd still be forced to use a prefix, and you'd still need to provide a prefix-to-URI mapping.
Use xml_declaration=True as well as encoding='utf-8' with the .write() call (available in lxml, but in stdlib xml.etree only since Python 2.7, I believe).
I believe lxml will behave the way you want.
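To illustrate the first point, here is a sketch of the prefix-to-URI mapping with the stdlib's findall(). The URI, the "ev" prefix, and the sample document are all made up:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and document.
NS_URI = "http://example.com/ns"
doc = ET.fromstring(
    '<Events xmlns="http://example.com/ns">'
    '<Event id="1"/>'
    '<Group><Event id="2"/></Group>'
    '</Events>'
)

# The prefix is arbitrary and local to the query; only the URI
# has to match the one declared in the document.
prefixes = {"ev": NS_URI}

direct = [e.get("id") for e in doc.findall("ev:Event", namespaces=prefixes)]
nested = [e.get("id") for e in doc.findall(".//ev:Event", namespaces=prefixes)]

# An unqualified name does not match the namespaced elements.
unqualified = doc.findall("Event")
```

The unqualified lookup comes back empty, which is the "can't just ignore the namespace" behavior. lxml's xpath() takes a similar namespaces mapping.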