Extracting text after tag in Python's ElementTree

Here is a part of XML:

<item><img src="cat.jpg" /> Picture of a cat</item>

Extracting the tag is easy. Just do:

1 answer

我寻月下人不归

2020-12-03 09:13

Elements have a tail attribute, which holds the text that follows the element's end tag, up to the next tag. So instead of element.text, you're asking for element.tail.

>>> import lxml.etree
>>> root = lxml.etree.fromstring('''<root><foo>bar</foo>baz</root>''')
>>> root[0]
<Element foo at 0x145a3c0>
>>> root[0].tail
'baz'

Or, for your example:

>>> et = lxml.etree.fromstring('''<item><img src="cat.jpg" /> Picture of a cat</item>''')
>>> et.find('img').tail
' Picture of a cat'

This also works with plain ElementTree:

>>> import xml.etree.ElementTree
>>> xml.etree.ElementTree.fromstring(
...   '''<item><img src="cat.jpg" /> Picture of a cat</item>'''
... ).find('img').tail
' Picture of a cat'
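
Note that tail keeps the surrounding whitespace, and it is None when nothing follows the element. A minimal sketch of a wrapper that deals with both cases, assuming the same <item> snippet as above (the helper name text_after is made up for illustration and is not part of ElementTree):

```python
import xml.etree.ElementTree as ET


def text_after(parent, tag):
    """Return the stripped text following the first <tag> child, or None.

    Hypothetical helper: returns None if the child is missing
    or if no text follows it.
    """
    child = parent.find(tag)
    if child is None or child.tail is None:
        return None
    return child.tail.strip()


item = ET.fromstring('<item><img src="cat.jpg" /> Picture of a cat</item>')
caption = text_after(item, 'img')    # text after <img ... />
missing = text_after(item, 'video')  # no such child, so None
print(caption)
```

Whether to strip the whitespace depends on the use case; drop the .strip() call if the leading space matters.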
