How to convert all tags in XML to lowercase without changing the case of attribute values?

面向向阳花 2021-01-25 08:10

I've inherited some XML files that have all tags in uppercase. I would like to convert them to lowercase using either a regular expression or XSLT. It would be handy to be

4 answers
  • 2021-01-25 08:14

    Using PHP, you can do it like this:

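    <?php
    // Match only the "<NAME" or "</NAME" prefix of each tag,
    // so attribute names and values are left untouched.
    $pattern = '/<\/?\w+/';
    $fp = fopen("/Applications/XAMPP/htdocs/test/test.xml", "r") or die("can't open file");
    // fgets() returns false at end of file, which ends the loop cleanly.
    while (($line = fgets($fp)) !== false) {
        $line = preg_replace_callback(
            $pattern,
            function ($matches) {
                return strtolower($matches[0]);
            },
            $line
        );
        // Escape the markup so it shows up as text when viewed in a browser.
        echo htmlentities($line);
    }
    fclose($fp);
    ?>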
    

    It works fine ;)

  • Try this regex:

    <(\/?[a-zA-Z]*)\b.*?>
    

    online tester: http://regex101.com/#PCRE

    Enjoy your code
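
    The pattern above only finds tags; something still has to lowercase the captured name. A minimal sketch of that step in Python, assuming the pattern is used unchanged and that tag names contain letters only (the function name `lowercase_tag_names` and the sample string are made up for illustration):

```python
import re

# The pattern from the answer above. Group 1 is the tag name,
# including the leading "/" on closing tags.
TAG = re.compile(r'<(\/?[a-zA-Z]*)\b.*?>')

def lowercase_tag_names(xml_text: str) -> str:
    """Lowercase only the captured tag name; keep the rest of each tag as-is."""
    def fix(match: re.Match) -> str:
        whole = match.group(0)
        # Offsets of group 1 relative to the start of the whole match.
        start = match.start(1) - match.start(0)
        end = match.end(1) - match.start(0)
        return whole[:start] + whole[start:end].lower() + whole[end:]
    return TAG.sub(fix, xml_text)

sample = '<ITEM ID="AbC"><P>Some TEXT</P></ITEM>'
print(lowercase_tag_names(sample))
# prints: <item ID="AbC"><p>Some TEXT</p></item>
```

    Attribute names and values are left alone because only group 1 is rewritten. Since `.` does not match a newline by default, a tag that spans several lines would not be matched by this sketch.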

  • 2021-01-25 08:25

    In my opinion, you might need two regexes: one to convert the tag name, and another to convert the attribute names in the variable number of attribute-value pairs.

    Here is how I would do it:

    blah:tmp shreyas$ cat old.xml | perl -pe "s|(</?)([^> ]+)(.*?>)|\1\L\2\E\3|g" | perl -pe "s|(\w+)( ?= ?\".*?\")|\L\1\E\2|g" > processed.xml
    blah:tmp shreyas$ diff new.xml processed.xml 
    4c4
    <     <P>It would be remiss of me to neglect to thank the bottle.</P>
    ---
    >     <p>It would be remiss of me to neglect to thank the bottle.</p>
    9,10c9,10
    <       <P>It seems a violent betrayal, me divulging how...</P>
    <       <P>The years had not been kind Felix Lake. His constant...</P>
    ---
    >       <p>It seems a violent betrayal, me divulging how...</p>
    >       <p>The years had not been kind Felix Lake. His constant...</p>
    15c15
    <     <P>As luck would not have it, he did.</P>
    ---
    >     <p>As luck would not have it, he did.</p>
    

    old.xml is your "before" XML and new.xml is your "after" XML. processed.xml is the file generated by the command.

    As you can see, the P tags in your "after" XML are still uppercase. I am not sure whether they were typos or exceptions. I treated them as typos, since you mentioned changing all tags to lowercase.

    With a small modification, you could run these commands on your whole set of inherited XML files and get them converted quickly.
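
    As an illustration of running the same two substitutions over a whole set of files, here is a rough Python sketch. The two patterns are direct translations of the Perl ones above; the function names, the UTF-8 encoding, and the idea of writing to a separate output directory are assumptions, not part of the original commands.

```python
import re
from pathlib import Path

# Python translations of the two Perl substitutions above.
TAG_NAME = re.compile(r'(</?)([^> ]+)(.*?>)')    # pass 1: tag names
ATTR_NAME = re.compile(r'(\w+)( ?= ?".*?")')     # pass 2: attribute names

def lowercase_line(line: str) -> str:
    """Lowercase tag names, then attribute names; attribute values are kept."""
    line = TAG_NAME.sub(lambda m: m.group(1) + m.group(2).lower() + m.group(3), line)
    return ATTR_NAME.sub(lambda m: m.group(1).lower() + m.group(2), line)

def convert_file(src: Path, dst: Path) -> None:
    """Convert one file line by line, like `perl -pe` does (UTF-8 is assumed)."""
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(lowercase_line(line))

def convert_directory(in_dir: Path, out_dir: Path) -> None:
    """Convert every *.xml file in in_dir into out_dir (hypothetical layout)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.glob("*.xml")):
        convert_file(src, out_dir / src.name)

print(lowercase_line('<BOOK TITLE="The Bottle"><P>Felix Lake</P></BOOK>'))
# prints: <book title="The Bottle"><p>Felix Lake</p></book>
```

    Like the Perl original, the first pattern treats anything after `<` as a tag name, so a `<!DOCTYPE ...>` line would be lowercased too. It is worth diffing a few converted files before trusting the whole batch.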

  • 2021-01-25 08:32

    Try (untested):

    XSLT 2.0:

    <xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    
    <xsl:template match="*">
        <xsl:element name="{lower-case(local-name())}" namespace="{namespace-uri()}">
            <xsl:apply-templates select="@*|node()"/>
        </xsl:element>
    </xsl:template>
    
    <xsl:template match="@*">
        <xsl:attribute name="{lower-case(local-name())}" namespace="{namespace-uri()}">
            <xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
    
    <xsl:template match="comment() | text() | processing-instruction()">
        <xsl:copy/>
    </xsl:template>
    
    </xsl:stylesheet>
    

    The XSLT 1.0 version of the above would go like this:

    <xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    
    <xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
    <xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz'" />
    
    <xsl:template match="*">
        <xsl:element name="{translate(local-name(), $uppercase, $lowercase)}" namespace="{namespace-uri()}">
            <xsl:apply-templates select="@*|node()"/>
        </xsl:element>
    </xsl:template>
    
    <xsl:template match="@*">
        <xsl:attribute name="{translate(local-name(), $uppercase, $lowercase)}" namespace="{namespace-uri()}">
            <xsl:value-of select="."/>
        </xsl:attribute>
    </xsl:template>
    
    <xsl:template match="comment() | text() | processing-instruction()">
        <xsl:copy/>
    </xsl:template>
    
    </xsl:stylesheet>
    

    However, this assumes that your element and attribute names contain no upper-case characters other than the 26 explicitly listed (i.e. no Russian, Greek, diacritics, etc.).
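
    To make that limitation concrete, here is a small Python analogy (not XSLT): a fixed 26-letter mapping, like the `translate()` call in the 1.0 stylesheet, next to a Unicode-aware lowercasing. Python's `str.lower()` merely stands in for `lower-case()` here, and the sample names are invented.

```python
# Same 26-letter mapping as the $uppercase / $lowercase variables above.
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
ASCII_ONLY = str.maketrans(UPPER, LOWER)

def translate_lower(name: str) -> str:
    """Lowercase only the 26 listed letters, like translate() in XSLT 1.0."""
    return name.translate(ASCII_ONLY)

names = [
    "TITLE",
    "\u00c9TAGE",          # starts with E-acute (U+00C9)
    "\u0391\u0392\u0393",  # Greek capital Alpha, Beta, Gamma
]
for name in names:
    # Columns: original, 26-letter mapping, Unicode-aware lowercasing.
    print(name, translate_lower(name), name.lower())
```

    With the 26-letter mapping the accented and Greek capitals pass through unchanged, while the Unicode-aware call lowercases them.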
