How to read and change the <!DOCTYPE> tag and <?xml version="1.0"?> in XML::Twig?

巧了我就是萌 submitted on 2019-12-01 14:07:19

You can try the following (the code is commented). The key idea is to create a new twig, copy over the elements you want to keep, and create the parts that change:

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

## Create a twig from the input XML file given on the command line.
my $twig = XML::Twig->new;
$twig->parsefile(shift);

## Create a new twig that will be the output.
my $new_twig = XML::Twig->new( pretty_print => 'indented' );

## Create a root tag.
$new_twig->set_root( XML::Twig::Elt->new( 'root' ) );

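## Create the XML declaration as a processing instruction.
## The 'k' => 'v' element is only a placeholder that set_pi() turns into the instruction.
my $e = XML::Twig::Elt->new( 'k' => 'v' );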
$e->set_pi( 'xml', 'version="1.0" encoding="UTF-8" standalone="yes"' );
$e->move( before => $new_twig->root );

## Attach the whole tree from the old twig under the new root.
my $r = $twig->root;
$r->paste( first_child => $new_twig->root );

## Store the doctype of the old twig as an attribute of a <Contents> element inside a <DTD> element.
my $contents_elt = XML::Twig::Elt->new( Contents  => { type => $twig->doctype } );
my $dtd_elt = XML::Twig::Elt->new( DTD => '#EMPTY' );
$contents_elt->move( last_child => $dtd_elt );
$dtd_elt->move( first_child => $new_twig->root );

## Print the whole twig created.
$new_twig->print;

Run it like:

perl script.pl xmlfile

That yields:

  <?xml version="1.0" encoding="UTF-8" standalone="yes"?><root>
  <DTD>
    <Contents type="&lt;!DOCTYPE art SYSTEM &quot;loose.dtd&quot;>&#x0a;"/>
  </DTD>
  <art>
    <fr>
      <p>Text</p>
      <p>Text</p>
    </fr>
    <fr>
      <p>Text</p>
      <p>Text</p>
    </fr>
  </art>
</root>
Sobrique

I found this question while trying to do something similar (Assembling XML in Perl).

You probably don't want to use set_pi to write the XML declaration. Use the dedicated setters instead:

$twig->set_xml_version("1.0");
$twig->set_encoding('utf-8');
$twig->set_standalone('yes');
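
To show the setters in context, here is a minimal, self-contained sketch. The inline sample document and the variable names are assumptions made for illustration, not part of the original question; it reads the existing declaration with xmldecl, changes it, and serializes the result with sprint.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document, inlined so the sketch runs on its own.
my $xml = <<'XML';
<?xml version="1.0"?>
<art><fr><p>Text</p></fr></art>
XML

my $twig = XML::Twig->new( pretty_print => 'indented' );
$twig->parse($xml);

# Read the XML declaration as it was parsed.
my $before = $twig->xmldecl;
print "before: $before";

# Change the declaration with the dedicated setters.
$twig->set_xml_version('1.0');
$twig->set_encoding('utf-8');
$twig->set_standalone('yes');

# sprint returns the whole document, declaration included.
my $output = $twig->sprint;
print $output;
```

Because the declaration is stored on the twig itself, there is no need to build a second twig or a placeholder element just to change it.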

The XML::Twig doc mentions DTD handling though:

> DTD handling
>
> The DTD handling methods are quite bugged. No one uses them and it seems very difficult to get them to work in all cases, including with several slightly incompatible versions of XML::Parser and of libexpat.
>
> Basically you can read the DTD, output it back properly, and update entities, but not much more.
>
> So use XML::Twig with standalone documents, or with documents referring to an external DTD, but don't expect it to properly parse and even output back the DTD.

With that in mind, the solution you've got above from Birei will probably be the best way of handling it.
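
For the simple case the documentation says does work (reading the doctype of a document that refers to an external DTD), a rough sketch might look like the following. The inline sample and the replacement file name strict.dtd are hypothetical. The set_doctype call is part of XML::Twig, but given the warning quoted above, check what it actually writes for your XML::Twig and XML::Parser versions before relying on it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input, modelled on the document from the first answer.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE art SYSTEM "loose.dtd">
<art><fr><p>Text</p></fr></art>
XML

# The load_DTD option is not set, so XML::Twig is not asked to load loose.dtd.
my $twig = XML::Twig->new( pretty_print => 'indented' );
$twig->parse($xml);

# Read the doctype as a whole and in parts.
my $doctype   = $twig->doctype;
my $name      = $twig->doctype_name;
my $system_id = $twig->system_id;
print "doctype:   $doctype";
print "name:      $name\n";
print "system id: $system_id\n";

# Try to point the document at a different (hypothetical) external DTD.
# Arguments are name, system id, public id, internal subset; only the first two are used here.
$twig->set_doctype( $name, 'strict.dtd' );

# Inspect the serialized document to confirm what was actually written.
print $twig->sprint;
```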
