minidom

MemoryError with minidom in Python

你离开我真会死。 submitted on 2019-12-06 05:58:30
I'm getting a MemoryError with the minidom parser in Python. I'm reading 8000 small files (most under 50 KB), and the error occurs after reading 2500 of them:

    Traceback (most recent call last):
      File "C:\eclipse\plugins\org.python.pydev.debug_2.4.0.2012020116\pysrc\pydevd.py", line 1307, in <module>
        debugger.run(setup['file'], None, None)
      File "C:\eclipse\plugins\org.python.pydev.debug_2.4.0.2012020116\pysrc\pydevd.py", line 1060, in run
        pydev_imports.execfile(file, globals, locals) #execute the script
      File "C:\Users\calculator_2012.py", line 81, in <module>
        file_content, economicFlow, elementaryFlow =
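The excerpt is cut off before any answer, so the cause here is not confirmed. One mitigation often suggested for minidom in long-running loops is to call `unlink()` on each document once the needed data has been copied out, which breaks the reference cycles inside the tree. The sketch below assumes that pattern; `root_tag` and the sample documents are hypothetical names and data, not taken from the question.

```python
from xml.dom import minidom


def root_tag(xml_text):
    """Parse one document, copy out plain data, then release the DOM tree."""
    dom = minidom.parseString(xml_text)
    try:
        # Copy what is needed into plain Python objects (here: a string).
        return dom.documentElement.tagName
    finally:
        # unlink() breaks the parent/child reference cycles inside the tree.
        dom.unlink()


# Hypothetical stand-ins for the many small files in the question.
documents = ["<flow id='%d'><name>f%d</name></flow>" % (i, i) for i in range(3)]
tags = [root_tag(text) for text in documents]
print(tags)
```

The point of the design is that nothing returned from `root_tag` keeps a reference to a DOM node, so each tree can be freed before the next file is parsed.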

Walk through all XML nodes in an element-nested structure

二次信任 submitted on 2019-12-06 04:00:57
Question: I have this kind of XML structure (output from the Esprima ASL converted from JSON); it can get even more nested than this (ASL.xml): <?xml version="1.0" encoding="UTF-8" ?> <program> <type>Program</type> <body> <type>VariableDeclaration</type> <declarations> <type>VariableDeclarator</type> <id> <type>Identifier</type> <name>answer</name> </id> <init> <type>BinaryExpression</type> <operator>*</operator> <left> <type>Literal</type> <value>6</value> </left> <right> <type>Literal</type> <value
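The excerpt ends before any answer. A minimal sketch of one way to visit every element in a structure like this is a recursive generator over `childNodes`; the `walk` helper and the shortened `SAMPLE` document are illustrative assumptions, not the asker's code.

```python
from xml.dom import minidom


def walk(node, depth=0):
    """Yield (depth, element) for node and every element nested below it."""
    if node.nodeType == node.ELEMENT_NODE:
        yield depth, node
    # Text nodes have an empty childNodes list, so the recursion stops there.
    for child in node.childNodes:
        yield from walk(child, depth + 1)


# Shortened, hypothetical version of the structure in the question.
SAMPLE = (
    "<program><type>Program</type>"
    "<body><type>VariableDeclaration</type></body></program>"
)
dom = minidom.parseString(SAMPLE)
names = [(depth, el.tagName) for depth, el in walk(dom.documentElement)]
for depth, name in names:
    print("  " * depth + name)
```

Because the function recurses on whatever children a node has, it does not need to know the nesting depth in advance.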

Get node name with minidom

感情迁移 submitted on 2019-12-05 03:05:52
Is it possible to get the name of a node using minidom? For example, I have a node: <heading><![CDATA[5 year]]></heading> What I'm trying to do is store the value heading so that I can use it as a key in a dictionary. The closest I can get is something like: [<DOM Element: heading at 0x11e6d28>] I'm sure I'm overlooking something very simple here, thanks.

Is this what you mean?

    tag = node.tagName
    d[tag] = node

tagName is defined in DOM Level 1 Core, the basic standard that minidom (mostly) implements.

Source: https://stackoverflow.com/questions/2795462/get-node-name-with-minidom
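To make the answer concrete, here is a small self-contained sketch that keys a dictionary by `tagName`. The wrapping `<item>` element and the name `by_tag` are assumptions added for illustration; only the `<heading>` node comes from the question.

```python
from xml.dom import minidom

# <item> is a hypothetical wrapper; the question only shows the <heading> node.
dom = minidom.parseString("<item><heading><![CDATA[5 year]]></heading></item>")

by_tag = {}
for node in dom.documentElement.childNodes:
    # Only element nodes have a tagName; skip text and comments.
    if node.nodeType == node.ELEMENT_NODE:
        by_tag[node.tagName] = node

# The CDATA section is the element's first child and carries the text.
heading_text = by_tag["heading"].firstChild.data
print(list(by_tag), heading_text)
```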

How to find elements by 'id' field in SVG file using Python

大憨熊 submitted on 2019-12-05 02:32:45
Question: Below is an excerpt from an .svg file (which is XML): <text xml:space="preserve" style="font-size:14.19380379px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:DejaVu Sans Mono;-inkscape-font-specification:DejaVu Sans Mono" x="109.38555" y="407.02847" id="libcode-00" sodipodi:linespacing="125%" inkscape:label="#text4638"><tspan sodipodi:role=

Ignoring XML errors in Python

て烟熏妆下的殇ゞ submitted on 2019-12-04 23:45:07
Question: I am using XML minidom (xml.dom.minidom) in Python, but any error in the XML will kill the parser. Is it possible to ignore them, like a browser does, for example? I am trying to write a browser in Python, but it just throws an exception if the tags aren't fully compatible. Answer 1: There is a library called BeautifulSoup; I think it's what you're looking for. Since you're trying to parse invalid XML, a normal XML parser won't work. BeautifulSoup is more fault-tolerant; it can still extract information

All nodeValue fields are None when parsing XML

◇◆丶佛笑我妖孽 submitted on 2019-12-04 21:56:50
Question: I'm building a simple web-based RSS reader in Python, but I'm having trouble parsing the XML. I started out by trying some stuff in the Python command line.

    >>> from xml.dom import minidom
    >>> import urllib2
    >>> url ='http://www.digg.com/rss/index.xml'
    >>> xmldoc = minidom.parse(urllib2.urlopen(url))
    >>> channelnode = xmldoc.getElementsByTagName("channel")
    >>> titlenode = channelnode[0].getElementsByTagName("title")
    >>> print titlenode
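The excerpt stops before any answer, so the following is an assumption about the cause rather than the accepted explanation. In minidom, an element node's `nodeValue` is always `None`; the text lives in child text nodes. A minimal Python 3 sketch using an inline, hypothetical feed instead of the URL from the question:

```python
from xml.dom import minidom

# Hypothetical stand-in for the RSS document fetched in the question.
RSS = "<rss><channel><title>Example feed</title></channel></rss>"
dom = minidom.parseString(RSS)

channel = dom.getElementsByTagName("channel")[0]
title = channel.getElementsByTagName("title")[0]

element_value = title.nodeValue          # None: elements carry no value
text_value = title.firstChild.nodeValue  # the text node holds the string


def text_of(element):
    """Join the data of all direct text and CDATA children of an element."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


print(element_value, text_value, text_of(title))
```

A helper like `text_of` is useful because an element's text may be split across several adjacent text nodes.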

How to set element's id in Python's xml.dom.minidom?

旧时模样 submitted on 2019-12-04 14:40:56
Question: How do I do this? I created a document and an element:

    import xml.dom.minidom as d
    a=d.Document()
    b=a.createElement('test')

setIdAttribute doesn't work :(

    b.setIdAttribute('something')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.6/xml/dom/minidom.py", line 835, in setIdAttribute
        self.setIdAttributeNode(idAttr)
      File "/usr/lib/python2.6/xml/dom/minidom.py", line 843, in setIdAttributeNode
        raise xml.dom.NotFoundErr()
    xml.dom.NotFoundErr

And if I set this by
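The excerpt is truncated before any answer. The traceback shows `NotFoundErr` being raised from `setIdAttributeNode`, which is consistent with `setIdAttribute` expecting the name of an attribute that already exists on the element rather than the id value itself. A minimal sketch under that reading, with hypothetical variable names:

```python
import xml.dom.minidom as d

doc = d.Document()
el = doc.createElement("test")
doc.appendChild(el)

# First create the attribute, then mark that attribute as the element's ID.
el.setAttribute("id", "something")
el.setIdAttribute("id")

# With the attribute flagged as an ID, lookup by id works on the document.
found = doc.getElementById("something")
print(found is el, doc.toxml())
```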

How to add an xml-stylesheet processing instruction node with Python 2.6 and minidom?

ⅰ亾dé卋堺 submitted on 2019-12-04 01:32:51
Question: I'm creating an XML document using minidom. How do I ensure that the resulting XML document contains a stylesheet reference like this: <?xml-stylesheet type="text/xsl" href="mystyle.xslt"?> Thanks! Answer 1: Use something like this:

    from xml.dom import minidom
    xml = """ <root> <x>text</x> </root>"""
    dom = minidom.parseString(xml)
    pi = dom.createProcessingInstruction('xml-stylesheet', 'type="text/xsl" href="mystyle.xslt"')
    root = dom.firstChild
    dom.insertBefore(pi, root)
    print dom.toprettyxml()

=> <

How to find elements by 'id' field in SVG file using Python

故事扮演 submitted on 2019-12-03 20:42:45
Below is an excerpt from an .svg file (which is XML): <text xml:space="preserve" style="font-size:14.19380379px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:DejaVu Sans Mono;-inkscape-font-specification:DejaVu Sans Mono" x="109.38555" y="407.02847" id="libcode-00" sodipodi:linespacing="125%" inkscape:label="#text4638"><tspan sodipodi:role="line" id="tspan4640" x="109.38555" y="407.02847">12345678</tspan></text> I'm learning Python and have
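The question text is cut off, so this is only a sketch of one plausible approach: scan all elements and compare their plain `id` attribute. The `find_by_id` helper, the simplified `SVG` string, and the second `<text>` element are assumptions for illustration; the Inkscape-specific attributes from the excerpt are omitted to keep the sample well-formed without namespace declarations.

```python
from xml.dom import minidom

# Simplified, hypothetical SVG; only the ids and text of the first element
# come from the excerpt above.
SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <text id="libcode-00" x="109.38555" y="407.02847"><tspan id="tspan4640">12345678</tspan></text>
  <text id="libcode-01"><tspan id="tspan4641">other</tspan></text>
</svg>"""


def find_by_id(dom, wanted_id):
    """Return the first element whose 'id' attribute equals wanted_id."""
    for element in dom.getElementsByTagName("*"):
        # getAttribute returns "" when the attribute is absent.
        if element.getAttribute("id") == wanted_id:
            return element
    return None


dom = minidom.parseString(SVG)
text_el = find_by_id(dom, "libcode-00")
value = text_el.getElementsByTagName("tspan")[0].firstChild.data
print(text_el.tagName, value)
```

A linear scan is used because minidom's `getElementById` only recognizes attributes that have been declared or flagged as IDs, which a plain SVG `id` attribute is not by default.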

Ignoring XML errors in Python

对着背影说爱祢 submitted on 2019-12-03 14:15:42
I am using XML minidom (xml.dom.minidom) in Python, but any error in the XML will kill the parser. Is it possible to ignore them, like a browser does, for example? I am trying to write a browser in Python, but it just throws an exception if the tags aren't fully compatible. There is a library called BeautifulSoup; I think it's what you're looking for. Since you're trying to parse invalid XML, a normal XML parser won't work. BeautifulSoup is more fault-tolerant; it can still extract information from invalid XML. Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like
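To illustrate the difference between a strict and a tolerant parser without a third-party dependency, the sketch below swaps in the standard library's `html.parser` for BeautifulSoup. This is a substitution for demonstration only, not the library the answer recommends; `TextCollector`, `strict_parse_fails`, and `lenient_text` are hypothetical names.

```python
from html.parser import HTMLParser
from xml.dom import minidom
from xml.parsers.expat import ExpatError

BROKEN = "<p>Hello <b>world</p>"  # the <b> tag is never closed


class TextCollector(HTMLParser):
    """Tolerant tokenizer that just accumulates character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strict_parse_fails(markup):
    """Return True if minidom rejects the markup as not well-formed."""
    try:
        minidom.parseString(markup)
    except ExpatError:
        return True
    return False


def lenient_text(markup):
    """Extract text even when tags are mismatched or unclosed."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


print(strict_parse_fails(BROKEN), lenient_text(BROKEN))
```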