lxml: add namespace to input file

不思量自难忘° 2020-12-03 21:16

I am parsing an XML file generated by an external program. I would then like to add custom annotations to this file, using my own namespace. My input looks as below:

6 Answers
  • 2020-12-03 21:31

    You could replace the root element with a new one whose nsmap includes 'kjw'. The xmlns declaration would then appear only on the root element.

  • 2020-12-03 21:33

    Modifying the namespace mapping of a node is not possible in lxml. See this open ticket that has this feature as a wishlist item.

    It originated from this thread on the lxml mailing list, where a workaround replacing the root node is given as an alternative. There are some issues with replacing the root node though: see the ticket above.

    I'll put the suggested root replacement workaround code here for completeness:

    >>> DOC = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" level="2" version="4">
    ...   <model metaid="untitled" id="untitled">
    ...     <annotation>...</annotation>
    ...     <listOfUnitDefinitions>...</listOfUnitDefinitions>
    ...     <listOfCompartments>...</listOfCompartments>
    ...     <listOfSpecies>
    ...       <species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
    ...         <annotation>
    ...           <celldesigner:extension>...</celldesigner:extension>
    ...         </annotation>
    ...       </species>
    ...       <species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
    ...         <annotation>
    ...            <celldesigner:extension>...</celldesigner:extension>
    ...         </annotation>
    ...       </species>
    ...     </listOfSpecies>
    ...     <listOfReactions>...</listOfReactions>
    ...   </model>
    ... </sbml>"""
    >>> 
    >>> from lxml import etree
    >>> from io import StringIO
    >>> NS = "http://this.is.some/custom_namespace"
    >>> tree = etree.ElementTree(element=None, file=StringIO(DOC))
    >>> root = tree.getroot()
    >>> nsmap = root.nsmap  # a copy; changing it does not alter root
    >>> nsmap['kjw'] = NS
    >>> new_root = etree.Element(root.tag, nsmap=nsmap)  # root's attributes are not copied
    >>> new_root[:] = root[:]  # move the children to the new root
    >>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
    >>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
    
    >>> print(etree.tostring(new_root, pretty_print=True).decode())
    <sbml xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" xmlns:kjw="http://this.is.some/custom_namespace" xmlns="http://www.sbml.org/sbml/level2/version4"><model metaid="untitled" id="untitled">
        <annotation>...</annotation>
        <listOfUnitDefinitions>...</listOfUnitDefinitions>
        <listOfCompartments>...</listOfCompartments>
        <listOfSpecies>
          <species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
            <annotation>
              <celldesigner:extension>...</celldesigner:extension>
            </annotation>
          </species>
          <species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
            <annotation>
               <celldesigner:extension>...</celldesigner:extension>
            </annotation>
          </species>
        </listOfSpecies>
        <listOfReactions>...</listOfReactions>
      </model>
    <kjw:test/><kjw:test/></sbml>
    
  • 2020-12-03 21:36

    I know this is an old question, but it is still valid, and as of lxml 3.5.0 there is probably a better solution to this problem:

    cleanup_namespaces() accepts a new argument top_nsmap that moves definitions of the provided prefix-namespace mapping to the top of the tree.

    So now the namespace declaration can be moved up to the root with a simple call:

    nsmap = {'kjw': 'http://this.is.some/custom_namespace'}
    etree.cleanup_namespaces(root, top_nsmap=nsmap)
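
    For context, here is a minimal end-to-end sketch of how that call might be used. The tiny SBML-like document and the variable names are stand-ins invented for illustration, not the asker's real input. Since cleanup_namespaces() also removes unused declarations, the sketch adds the namespaced element first and calls it afterwards.

```python
from lxml import etree

SBML_NS = "http://www.sbml.org/sbml/level2/version4"
NS = "http://this.is.some/custom_namespace"

# Hypothetical stand-in for the real input file.
root = etree.fromstring(
    '<sbml xmlns="%s">'
    '<model><listOfSpecies><species id="s1"/></listOfSpecies></model>'
    '</sbml>' % SBML_NS
)

# Add a namespaced element somewhere deep in the tree first.
species = root.find(".//{%s}species" % SBML_NS)
note = etree.SubElement(species, "{%s}test" % NS, nsmap={"kjw": NS})

# Then ask lxml to declare the prefix on the root element.
etree.cleanup_namespaces(root, top_nsmap={"kjw": NS})

print(etree.tostring(root, pretty_print=True).decode())
```

    After the call, root.nsmap should contain the 'kjw' prefix, so the declaration no longer has to be repeated on each annotated element.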
    
  • 2020-12-03 21:38

    Rather than dealing directly with the raw XML, you could also look at LibSBML, a library for manipulating SBML documents that has bindings for several languages, including Python. You would use it like this:

    >>> from libsbml import *
    >>> doc = readSBML('Dropbox/SBML Models/BorisEJB.xml')
    >>> species = doc.getModel().getSpecies('MAPK')
    >>> species.appendAnnotation('<kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>')
    0
    >>> species.toSBML()
    '<species id="MAPK" compartment="compartment" initialConcentration="280" boundaryCondition="false">\n  <annotation>\n
     <kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>\n  </annotation>\n</species>'
    >>>
    
    
  • 2020-12-03 21:47

    Temporarily adding a namespaced attribute to the root node does the trick: the namespace declaration it creates stays on the root after the attribute is removed.

    ns = '{http://this.is.some/custom_namespace}'
    
    # add 'kjw:foobar' attribute to root node
    root.set(ns+'foobar', 'foobar')
    
    # add kjw namespace elements (or attributes) elsewhere
    # ... get child element species ...
    species.append(etree.Element(ns + 'test'))
    
    # remove temporary namespaced attribute from root node
    del root.attrib[ns+'foobar']
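
    A self-contained version of the same trick, as a rough sketch. The two-element document is made up for illustration. Note that lxml picks the prefix itself here, typically an auto-generated one such as ns0, so this shows the mechanism rather than a way to force the 'kjw' prefix.

```python
from lxml import etree

NS = "http://this.is.some/custom_namespace"
ns = "{%s}" % NS

# Hypothetical minimal document standing in for the real input.
root = etree.fromstring("<root><species/></root>")

# Temporarily set a namespaced attribute so that the root declares the namespace.
root.set(ns + "foobar", "foobar")

# Add a namespaced element further down the tree.
species = root.find("species")
species.append(etree.Element(ns + "test"))

# Remove the temporary attribute; the declaration stays on the root.
del root.attrib[ns + "foobar"]

print(etree.tostring(root).decode())
```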
    
  • 2020-12-03 21:53

    I wrote this function to add a namespace to the root element:

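    from lxml import etree

    def addns(tree, alias, uri):
        root = tree.getroot()
        nsmap = root.nsmap  # copy of the root's prefix-to-URI mapping
        nsmap[alias] = uri
        # Build a replacement root with the same tag and attributes.
        new_root = etree.Element(root.tag, attrib=root.attrib, nsmap=nsmap)
        new_root[:] = root[:]  # move the children across
        return new_root.getroottree()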
    

    This function returns a new tree, so anything holding a reference to the old tree has to be updated. That should be easy if the tree is only ever accessed through a single object, as it would be in a strong OO design.
