Question
I have an XML document like this:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:epp xmlns:ns0="urn:ietf:params:xml:ns:epp-1.0"
         xmlns:ns1="http://epp.nic.ir/ns/contact-1.0">
  <ns0:command>
    <ns0:check>
      <ns1:check>
        <ns1:id>ex61-irnic</ns1:id>
        <ns1:id>ex999-irnic</ns1:id>
        <ns1:authInfo>
          <ns1:pw>1487441516170712</ns1:pw>
        </ns1:authInfo>
      </ns1:check>
    </ns0:check>
    <ns0:clTRID>TEST-12345</ns0:clTRID>
  </ns0:command>
</ns0:epp>
I want to transform it with Python 3 so that it looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <command>
    <check>
      <check>
        <id>ex61-irnic</id>
        <id>ex999-irnic</id>
        <authInfo>
          <pw>1487441516170712</pw>
        </authInfo>
      </check>
    </check>
    <clTRID>TEST-12345</clTRID>
  </command>
</epp>
I tried to remove the ns prefixes with objectify.deannotate from the lxml module, but it didn't work. Could you please help me achieve this?
Answer 1:
This combines two existing answers: "Remove namespace and prefix from xml in python using lxml", which shows how to modify the namespace of an element, and "lxml: add namespace to input file", which shows how to reset the top-level namespace map.
The code is a little hacky (I'm particularly suspicious of whether it's kosher to use the private _setroot method), but it seems to work:
from lxml import etree

inputfile = 'data.xml'
target_ns = 'urn:ietf:params:xml:ns:epp-1.0'
nsmap = {None: target_ns}

tree = etree.parse(inputfile)
root = tree.getroot()

# here we set the namespace of all elements to target_ns
# (iter('*') visits elements only, skipping comments and processing instructions)
for elem in root.iter('*'):
    tag = etree.QName(elem.tag)
    elem.tag = '{%s}%s' % (target_ns, tag.localname)

# create a new root element and set the namespace map, then
# copy over all the child elements
new_root = etree.Element(root.tag, nsmap=nsmap)
new_root[:] = root[:]

# create a new elementtree with new_root so that we can use the
# .write method.
tree = etree.ElementTree()
tree._setroot(new_root)

tree.write('done.xml',
           pretty_print=True, xml_declaration=True, encoding='UTF-8')
Given your sample input, this produces the following in done.xml:
<?xml version='1.0' encoding='UTF-8'?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0"><command>
<check>
<check>
<id>ex61-irnic</id>
<id>ex999-irnic</id>
<authInfo>
<pw>1487441516170712</pw>
</authInfo>
</check>
</check>
<clTRID>TEST-12345</clTRID>
</command>
</epp>
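For comparison, here is a minimal sketch of the same tag-rewriting idea using only the standard library's xml.etree.ElementTree instead of lxml. This is a swapped-in technique, not the method from the answer above: the helper name move_to_namespace and the shortened SAMPLE string are made up for illustration, and it assumes Python 3.8 or later, where ET.tostring accepts default_namespace.

```python
import xml.etree.ElementTree as ET

TARGET_NS = "urn:ietf:params:xml:ns:epp-1.0"

# Shortened version of the sample input from the question (illustrative only).
SAMPLE = """<ns0:epp xmlns:ns0="urn:ietf:params:xml:ns:epp-1.0"
         xmlns:ns1="http://epp.nic.ir/ns/contact-1.0">
  <ns0:command>
    <ns0:check>
      <ns1:check>
        <ns1:id>ex61-irnic</ns1:id>
      </ns1:check>
    </ns0:check>
    <ns0:clTRID>TEST-12345</ns0:clTRID>
  </ns0:command>
</ns0:epp>"""


def move_to_namespace(xml_text, target_ns=TARGET_NS):
    """Hypothetical helper: put every element into target_ns, unprefixed."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # ElementTree stores qualified tags as '{uri}local'; keep only 'local'.
        local = elem.tag.rpartition("}")[2]
        elem.tag = "{%s}%s" % (target_ns, local)
    # default_namespace makes the serializer emit xmlns="..." with no prefix.
    return ET.tostring(root, encoding="unicode", default_namespace=target_ns)


result = move_to_namespace(SAMPLE)
print(result)
```

One caveat with this sketch: as far as I can tell, ElementTree's default_namespace option raises ValueError when the tree contains unqualified names, including plain attributes. The sample has no attributes, so it is not affected, but other documents may need the lxml route instead.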
Answer 2:
Consider XSLT, the special-purpose language designed to transform XML files, which handles tasks such as removing namespaces. Python's third-party module lxml can run XSLT 1.0 scripts. Because XSLT scripts are themselves XML files, you can parse them from a file or a string like any other XML. No loops or conditional if logic are needed. Additionally, you can reuse the same XSLT script from other languages (PHP, Java, C#, etc.).
XSLT (save as .xsl file to be referenced in Python)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- IDENTITY TRANSFORM: COPY DOC AS IS -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- REMOVE NAMESPACE PREFIXES, ADD DOC NAMESPACE -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}" namespace="urn:ietf:params:xml:ns:epp-1.0">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
Python
import lxml.etree as et
# LOAD XML AND XSL
doc = et.parse('Input.xml')
xsl = et.parse('XSLT_Script.xsl')
# CONFIGURE AND RUN TRANSFORMER
transform = et.XSLT(xsl)
result = transform(doc)
# OUTPUT RESULT TREE TO FILE
with open('Output.xml', 'wb') as f:
    f.write(result)
Output
<?xml version="1.0"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <command>
    <check>
      <check>
        <id>ex61-irnic</id>
        <id>ex999-irnic</id>
        <authInfo>
          <pw>1487441516170712</pw>
        </authInfo>
      </check>
    </check>
    <clTRID>TEST-12345</clTRID>
  </command>
</epp>
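Since the stylesheet is itself XML, it can also be kept in a string rather than a separate .xsl file. The following is a rough sketch of that variant, assuming lxml is installed; the names XSLT_TEXT, SAMPLE and output are illustrative, and the sample input is shortened from the question.

```python
from lxml import etree as et

# Same stylesheet as above, embedded as bytes instead of loaded from a file.
XSLT_TEXT = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{local-name()}"
                 namespace="urn:ietf:params:xml:ns:epp-1.0">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>"""

# Shortened version of the sample input from the question (illustrative only).
SAMPLE = b"""<ns0:epp xmlns:ns0="urn:ietf:params:xml:ns:epp-1.0"
         xmlns:ns1="http://epp.nic.ir/ns/contact-1.0">
  <ns0:command>
    <ns0:check>
      <ns1:check>
        <ns1:id>ex61-irnic</ns1:id>
      </ns1:check>
    </ns0:check>
    <ns0:clTRID>TEST-12345</ns0:clTRID>
  </ns0:command>
</ns0:epp>"""

# Parse both documents from memory, then build and run the transformer.
transform = et.XSLT(et.XML(XSLT_TEXT))
result = transform(et.XML(SAMPLE))

# str() serializes the result tree according to the xsl:output settings.
output = str(result)
print(output)
```

Keeping the stylesheet inline avoids shipping a second file, at the cost of losing the ability to share the same .xsl with other languages.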
Source: https://stackoverflow.com/questions/45817239/how-can-i-remove-ns-from-xml-in-python