I have the following XML which I want to parse using Python\'s ElementTree
:
Note: This is an answer useful for Python's ElementTree standard library without using hardcoded namespaces.
To extract namespace's prefixes and URI from XML data you can use ElementTree.iterparse
function, parsing only namespace start events (start-ns):
>>> from io import StringIO
>>> from xml.etree import ElementTree
>>> my_schema = u'''
...
...
... basketball league
...
... a group of sports teams that compete against each other
... in Basketball
...
...
...
... '''
>>> my_namespaces = dict([
... node for _, node in ElementTree.iterparse(
... StringIO(my_schema), events=['start-ns']
... )
... ])
>>> from pprint import pprint
>>> pprint(my_namespaces)
{'': 'http://dbpedia.org/ontology/',
'owl': 'http://www.w3.org/2002/07/owl#',
'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
'rdfs': 'http://www.w3.org/2000/01/rdf-schema#',
'xsd': 'http://www.w3.org/2001/XMLSchema#'}
Then the dictionary can be passed as argument to the search functions:
root.findall('owl:Class', my_namespaces)