lxml and xml namespaces - Using find and findall to get XML Tag Value

冷暖自知 submitted on 2021-01-28 05:31:02

Question


I had trouble getting the text value of the Status and Seed nodes with lxml when the XML contains namespaces. I was using findall('Status'), but the result always came back empty.

I eventually arrived at the following working code. Is this the correct way to use lxml to fetch node values? Can I improve it further?

from lxml import etree

# Bytes literal: on Python 3, lxml rejects a str that carries an encoding declaration.
xml_string = b'<?xml version="1.0" encoding="UTF-8"?> <SCPP:Response xmlns:SCPP="http://www.SCPP.com/XMLSchema"> <SCPP:RESP_BODY> <Seed>001335834994</Seed> </SCPP:RESP_BODY> <SCPP:RESP_HDR> <Status>00</Status> </SCPP:RESP_HDR> </SCPP:Response>'
root = etree.fromstring(xml_string)

# Collect every prefixed namespace declared anywhere in the document.
nsmap = {}
for ns in root.xpath('//namespace::*'):
    if ns[0]:
        nsmap[ns[0]] = ns[1]

# Method 1: xpath() for the namespaced parent, find() for the un-namespaced child
print('Status is', root.xpath('//SCPP:RESP_HDR', namespaces=nsmap)[0].find('Status').text)
print('Seed is', root.xpath('//SCPP:RESP_BODY', namespaces=nsmap)[0].find('Seed').text)

# Method 2: findall() with a prefixed path
print('Status is', root.findall('SCPP:RESP_HDR', namespaces=nsmap)[0].find('Status').text)
print('Seed is', root.findall('SCPP:RESP_BODY', namespaces=nsmap)[0].find('Seed').text)

# Method 3: xpath() for Status, find() with a prefixed path for Seed
print('Status is', root.xpath('//SCPP:RESP_HDR', namespaces=nsmap)[0].find('Status').text)
print('Seed is', root.find('SCPP:RESP_BODY', namespaces=nsmap).find('Seed').text)

Answer 1:


You don't need to build nsmap manually.

Replace the following lines:

nsmap = {}
for ns in root.xpath('//namespace::*'):
    if ns[0]:
        nsmap[ns[0]] = ns[1]

with:

nsmap = root.nsmap
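
As a rough illustration of what root.nsmap holds for the XML in the question, here is a minimal sketch. The variable names xml_bytes and xpath_ns are made up for this example. One caveat worth knowing: if a document declares a default namespace (xmlns="..."), it shows up in nsmap under the key None, and xpath() rejects a mapping that contains an empty prefix, so filtering it out first is a reasonable precaution.

```python
from lxml import etree

# Same document as in the question, passed as bytes because it carries
# an encoding declaration.
xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<SCPP:Response xmlns:SCPP="http://www.SCPP.com/XMLSchema">'
    b'<SCPP:RESP_BODY><Seed>001335834994</Seed></SCPP:RESP_BODY>'
    b'<SCPP:RESP_HDR><Status>00</Status></SCPP:RESP_HDR>'
    b'</SCPP:Response>'
)
root = etree.fromstring(xml_bytes)

# nsmap maps prefixes to URIs for the namespaces in scope at this element.
nsmap = root.nsmap
print(nsmap)

# Drop a possible default-namespace entry (key None) before calling xpath().
# This document has none, so xpath_ns ends up identical to nsmap.
xpath_ns = {prefix: uri for prefix, uri in nsmap.items() if prefix is not None}
print(root.xpath('.//SCPP:RESP_HDR/Status/text()', namespaces=xpath_ns)[0])
```

Note that root.nsmap only covers namespaces visible from the root element; a prefix declared further down the tree would not be included, whereas the namespace::* loop from the question does pick those up.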

Another way to get the text of a specific element (using XPath):

>>> root.xpath('.//SCPP:RESP_HDR/Status/text()', namespaces=nsmap)[0]
'00'
>>> root.xpath('.//SCPP:RESP_BODY/Seed/text()',namespaces=nsmap)[0]
'001335834994'
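
For completeness, here is a sketch of why the original findall('Status') came back empty, along with a few ElementPath alternatives to xpath(). It assumes the same XML as in the question; the names xml_bytes, ns, direct, status, seed and hdr are illustrative only. A bare tag name in find()/findall() matches direct children only, and Status sits one level deeper, under SCPP:RESP_HDR. Status itself carries no namespace, because the document declares only the SCPP prefix and no default namespace.

```python
from lxml import etree

xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<SCPP:Response xmlns:SCPP="http://www.SCPP.com/XMLSchema">'
    b'<SCPP:RESP_BODY><Seed>001335834994</Seed></SCPP:RESP_BODY>'
    b'<SCPP:RESP_HDR><Status>00</Status></SCPP:RESP_HDR>'
    b'</SCPP:Response>'
)
root = etree.fromstring(xml_bytes)
ns = {'SCPP': 'http://www.SCPP.com/XMLSchema'}

# A bare tag name looks at direct children of root only, so nothing matches.
direct = root.findall('Status')

# './/' searches all descendants; Status has no namespace, so no prefix is needed.
status = root.findtext('.//Status')

# A prefixed path, resolved through the namespaces mapping.
seed = root.findtext('SCPP:RESP_BODY/Seed', namespaces=ns)

# Clark notation ({uri}tag) avoids the prefix mapping altogether.
hdr = root.find('{http://www.SCPP.com/XMLSchema}RESP_HDR')

print(direct, status, seed, hdr.tag)
```

findtext() returns the text directly (or None if nothing matches), which avoids the [0] indexing and the IndexError it raises when an element is missing.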


Source: https://stackoverflow.com/questions/21569457/lxml-and-xml-namespaces-using-find-and-findall-to-get-xml-tag-value