ElementTree in Python 2.6.2 Processing Instructions support?

隐瞒了意图╮ 2020-12-11 06:12

I'm trying to create XML using the ElementTree object structure in Python. It all works very well except when it comes to processing instructions. I can create a PI easil

5 answers
  • 2020-12-11 06:14

    With the lxml API it couldn't be easier, though it is a bit "underdocumented":

    If you need a top-level processing instruction, create it like this:

    from lxml import etree
    
    root = etree.Element("anytagname")
    root.addprevious(etree.ProcessingInstruction("anypi", "anypicontent"))
    

    The resulting document will look like this:

    <?anypi anypicontent?>
    <anytagname />
    

    They certainly should add this to their FAQ because IMO it is another feature that sets this fine API apart.

  • 2020-12-11 06:22

    Try the lxml library: it follows the ElementTree API and adds a lot of extras. From the compatibility overview:

    ElementTree ignores comments and processing instructions when parsing XML, while etree will read them in and treat them as Comment or ProcessingInstruction elements respectively. This is especially visible where comments are found inside text content, which is then split by the Comment element.

    You can disable this behaviour by passing the boolean remove_comments and/or remove_pis keyword arguments to the parser you use. For convenience and to support portable code, you can also use the etree.ETCompatXMLParser instead of the default etree.XMLParser. It tries to provide a default setup that is as close to the ElementTree parser as possible.

    Not in the stdlib, I know, but in my experience the best bet when you need stuff that the standard ElementTree doesn't provide.
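
    The parsing difference quoted above can also be checked against the standard library. This is a minimal sketch, assuming Python 3.8 or newer, where xml.etree.ElementTree.TreeBuilder accepts an insert_pis flag (far newer than the 2.6.2 in the question). The sample document and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input with a processing instruction inside the root element.
SAMPLE = "<root><?anypi anypicontent?><child /></root>"

# Default parser: the processing instruction is dropped at parse time.
default_root = ET.fromstring(SAMPLE)

# Python 3.8+ only: ask the tree builder to keep processing instructions.
pi_parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
kept_root = ET.fromstring(SAMPLE, parser=pi_parser)

print(len(default_root))  # number of children without the PI
print(len(kept_root))     # number of children with the PI kept
```

    On older interpreters this flag is not available, which is where lxml remains the more convenient option.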

  • 2020-12-11 06:23

    I don't know much about ElementTree. But it is possible that you might be able to solve your problem using a library I wrote called "xe".

    xe is a set of Python classes designed to make it easy to create structured XML. I haven't worked on it in a long time, for various reasons, but I'd be willing to help you if you have questions about it, or need bugs fixed.

    It has the bare bones of support for things like processing instructions, and with a little bit of work I think it could do what you need. (When I started adding processing instructions, I didn't really understand them, and I didn't have any need for them, so the code is sort of half-baked.)

    Take a look and see if it seems useful.

    http://home.avvanta.com/~steveha/xe.html

    Here's an example of using it:

    import xe
    doc = xe.XMLDoc()
    
    prefs = xe.NestElement("prefs")
    prefs.user_name = xe.TextElement("user_name")
    prefs.paper = xe.NestElement("paper")
    prefs.paper.width = xe.IntElement("width")
    prefs.paper.height = xe.IntElement("height")
    
    doc.root_element = prefs
    
    
    prefs.user_name = "John Doe"
    prefs.paper.width = 8
    prefs.paper.height = 10
    
    c = xe.Comment("this is a comment")
    doc.top.append(c)
    

    If you ran the above code and then ran print doc here is what you would get:

    <?xml version="1.0" encoding="utf-8"?>
    <!-- this is a comment -->
    <prefs>
        <user_name>John Doe</user_name>
        <paper>
            <width>8</width>
            <height>10</height>
        </paper>
    </prefs>
    

    If you are interested in this but need some help, just let me know.

    Good luck with your project.

  • 2020-12-11 06:38
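    # Crude workaround: splice the PI in right after the XML declaration.
    # Raw strings keep the backslashes literal ('\t' would otherwise become a tab).
    with open(r'D:\Python\XML\test.xml', 'r+') as f:
        old = f.read()
        f.seek(44, 0)  # offset just past the XML declaration in this particular file
        f.write(r'<?xml-stylesheet type="text/xsl" href="C:\Stylesheets\expand.xsl"?>' + old[44:])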
    

    I was facing the same problem and came up with this crude solution. I had failed to insert the PI into the .xml file correctly using one of the Element methods (in my case root.insert(0, PI)), and my attempts to cut and paste the inserted PI to the correct location only deleted data from unexpected locations.
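
    The hard-coded offset of 44 only fits one particular declaration. A sketch of the same splice done on the text itself, locating the end of the XML declaration with a regular expression; the helper name insert_pi_after_declaration and the relative href "expand.xsl" are hypothetical, not part of the original answer:

```python
import re

# Matches an XML declaration at the very start of the text.
# The \s keeps it from matching other PIs such as <?xml-stylesheet ...?>.
_DECLARATION = re.compile(r"^<\?xml\s[^>]*\?>")


def insert_pi_after_declaration(xml_text, pi):
    """Return xml_text with pi placed right after the XML declaration, if any."""
    match = _DECLARATION.match(xml_text)
    end = match.end() if match else 0
    prefix = xml_text[:end]
    separator = "\n" if prefix else ""
    # Whitespace between the declaration and the root is insignificant.
    return prefix + separator + pi + "\n" + xml_text[end:].lstrip()


document = '<?xml version="1.0" encoding="utf-8"?>\n<root/>'
stylesheet_pi = '<?xml-stylesheet type="text/xsl" href="expand.xsl"?>'
result = insert_pi_after_declaration(document, stylesheet_pi)
print(result)
```

    To apply it to a file, read the whole file into a string, call the helper, and write the result back, rather than seeking and overwriting in place.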

  • 2020-12-11 06:39

    Yeah, I don't believe it's possible, sorry. ElementTree provides a simpler interface to (non-namespaced) element-centric XML processing than DOM, but the price for that is that it doesn't support the whole XML infoset.

    There is no apparent way to represent the content that lives outside the root element (comments, PIs, the doctype and the XML declaration), and these are also discarded at parse time. (Aside: this appears to include any default attributes specified in the DTD internal subset, which makes ElementTree strictly-speaking a non-compliant XML processor.)

    You can probably work around it by subclassing or monkey-patching the write() method of Python's native ElementTree implementation so that it calls _write on your extra PIs before writing the _root, but it could be a bit fragile.
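
    The idea of emitting the extra PIs before the root can also be sketched without touching private methods, by serializing each piece with the public ET.tostring and joining them. This assumes Python 3, where tostring accepts encoding="unicode"; the helper name tostring_with_pis and the stylesheet href are hypothetical.

```python
import xml.etree.ElementTree as ET


def tostring_with_pis(root, pis, encoding="utf-8"):
    """Serialize root with the given PI elements written before it."""
    parts = ['<?xml version="1.0" encoding="%s"?>' % encoding]
    # Each PI is serialized on its own, ahead of the root element.
    parts.extend(ET.tostring(pi, encoding="unicode") for pi in pis)
    parts.append(ET.tostring(root, encoding="unicode"))
    return "\n".join(parts).encode(encoding)


root = ET.Element("anytagname")
stylesheet = ET.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="expand.xsl"'
)
output = tostring_with_pis(root, [stylesheet])
print(output.decode("utf-8"))
```

    The tree itself still has no place to hold the PI, so it has to be passed in again on every write.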

    If you need support for the full XML infoset, it is probably best to stick with DOM.
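
    For comparison, a minimal DOM sketch using the standard library's xml.dom.minidom, where a processing instruction is an ordinary node that can sit before the root element. The element name and the stylesheet href are placeholders.

```python
from xml.dom import minidom

doc = minidom.Document()
root = doc.createElement("anytagname")
doc.appendChild(root)

# A PI is a regular child node of the document, so it can precede the root.
pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="expand.xsl"'
)
doc.insertBefore(pi, root)

xml_text = doc.toxml()
print(xml_text)
```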
