I am parsing an XML file generated by an external program. I would then like to add custom annotations to this file, using my own namespace. My input looks as below:
You could replace the root element to add 'kjw' to its nsmap. Then the xmlns declaration would appear only on the root element.
Modifying the namespace mapping of a node is not possible in lxml. See this open ticket that has this feature as a wishlist item.
It originated from this thread on the lxml mailing list, where a workaround replacing the root node is given as an alternative. There are some issues with replacing the root node though: see the ticket above.
I'll put the suggested root replacement workaround code here for completeness:
>>> DOC = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" level="2" version="4">
... <model metaid="untitled" id="untitled">
... <annotation>...</annotation>
... <listOfUnitDefinitions>...</listOfUnitDefinitions>
... <listOfCompartments>...</listOfCompartments>
... <listOfSpecies>
... <species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
... <annotation>
... <celldesigner:extension>...</celldesigner:extension>
... </annotation>
... </species>
... <species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
... <annotation>
... <celldesigner:extension>...</celldesigner:extension>
... </annotation>
... </species>
... </listOfSpecies>
... <listOfReactions>...</listOfReactions>
... </model>
... </sbml>"""
>>>
>>> from lxml import etree
>>> from io import StringIO
>>> NS = "http://this.is.some/custom_namespace"
>>> tree = etree.ElementTree(element=None, file=StringIO(DOC))
>>> root = tree.getroot()
>>> nsmap = root.nsmap
>>> nsmap['kjw'] = NS
>>> new_root = etree.Element(root.tag, nsmap=nsmap)  # note: root's attributes are not copied
>>> new_root[:] = root[:]
>>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
>>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
>>> print(etree.tostring(new_root, pretty_print=True).decode())
<sbml xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" xmlns:kjw="http://this.is.some/custom_namespace" xmlns="http://www.sbml.org/sbml/level2/version4"><model metaid="untitled" id="untitled">
<annotation>...</annotation>
<listOfUnitDefinitions>...</listOfUnitDefinitions>
<listOfCompartments>...</listOfCompartments>
<listOfSpecies>
<species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
<species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
</listOfSpecies>
<listOfReactions>...</listOfReactions>
</model>
<kjw:test/><kjw:test/></sbml>
I know this is an old question, but it is still valid, and as of lxml 3.5.0 there is probably a better solution to this problem:
cleanup_namespaces() accepts a new argument top_nsmap that moves definitions of the provided prefix-namespace mapping to the top of the tree.
So now the namespace map can be moved up with a simple call:
nsmap = {'kjw': 'http://this.is.some/custom_namespace'}
etree.cleanup_namespaces(root, top_nsmap=nsmap)
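As a rough end-to-end sketch of that call: the small `DOC` string and the `species` lookup below are made up for illustration and are not the question's actual file, and lxml 3.5.0 or later is assumed. As far as I can tell, `cleanup_namespaces()` also drops declarations that nothing in the tree uses, so the namespaced elements are added before the call.

```python
from lxml import etree

NS = "http://this.is.some/custom_namespace"
SBML = "http://www.sbml.org/sbml/level2/version4"

# Illustrative stand-in for the real input document.
DOC = (
    '<sbml xmlns="%s" level="2" version="4">'
    '<model id="untitled"><listOfSpecies>'
    '<species id="s1"/>'
    '</listOfSpecies></model></sbml>' % SBML
)

root = etree.fromstring(DOC)

# Add a namespaced annotation deep in the tree; at this point the
# declaration for NS sits on the new element itself.
species = root.find('.//{%s}species' % SBML)
etree.SubElement(species, '{%s}test' % NS)

# Move the declaration to the root under the 'kjw' prefix.
etree.cleanup_namespaces(root, top_nsmap={'kjw': NS})

result = etree.tostring(root, pretty_print=True).decode()
print(result)
```

Unlike the root replacement workaround, this keeps the original root element, so its attributes stay in place.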
Rather than dealing directly with the raw XML, you could also look at LibSBML, a library for manipulating SBML documents with language bindings for Python, among others. There you would use it like this:
>>> from libsbml import *
>>> doc = readSBML('Dropbox/SBML Models/BorisEJB.xml')
>>> species = doc.getModel().getSpecies('MAPK')
>>> species.appendAnnotation('<kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>')
0
>>> species.toSBML()
'<species id="MAPK" compartment="compartment" initialConcentration="280" boundaryCondition="false">\n <annotation>\n <kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>\n </annotation>\n</species>'
>>>
If you temporarily add a namespaced attribute to the root node, that does the trick.
ns = '{http://this.is.some/custom_namespace}'
# add 'kjw:foobar' attribute to root node
root.set(ns+'foobar', 'foobar')
# add kjw namespace elements (or attributes) elsewhere
# ... get child element species ...
species.append(etree.Element(ns + 'test'))
# remove temporary namespaced attribute from root node
del root.attrib[ns+'foobar']
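A self-contained sketch of the same trick follows. The tiny document is invented for illustration. One assumption worth flagging: left to itself, lxml would pick an auto-generated prefix such as `ns0` for the new declaration, so the sketch calls `etree.register_namespace` first to request `kjw`. That registry is global to the process.

```python
from lxml import etree

NS = "http://this.is.some/custom_namespace"

# Assumption: request the 'kjw' prefix up front (global registry).
etree.register_namespace('kjw', NS)

# Illustrative stand-in document, not the question's real input.
root = etree.fromstring('<sbml><model><species id="s1"/></model></sbml>')

# Temporarily set a namespaced attribute so lxml declares NS on the root.
root.set('{%s}foobar' % NS, 'foobar')

# Elements added below the root now reuse the root's declaration.
species = root.find('.//species')
species.append(etree.Element('{%s}test' % NS))

# Remove the placeholder attribute; the declaration stays behind.
del root.attrib['{%s}foobar' % NS]

serialized = etree.tostring(root).decode()
print(serialized)
```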
I wrote this function to add a namespace to the root element:
def addns(tree, alias, uri):
    root = tree.getroot()
    nsmap = root.nsmap
    nsmap[alias] = uri
    new_root = etree.Element(root.tag, attrib=root.attrib, nsmap=nsmap)
    new_root[:] = root[:]
    return new_root.getroottree()
After applying this function you get a new tree, so any code still holding the old tree must switch to the returned one. That is easy if the tree is only reached through a single object, as it would be in a strongly object-oriented design.