Transform XML to HTML in XSLT with string length condition

橙三吉。 Submitted on 2019-12-08 02:27:23

Question


I have a TEI XML file structured like this:

<div type="chapter" n="1">
        <p>
          <s xml:id="e_1">sentence e1.</s>
          <s xml:id="f_1">sentence f1</s>
        </p>
        <p>
            <s xml:id="e_2"> sentence e2</s>
            <s xml:id="f_2"> sentence f2</s>
        </p>
</div>

<div type="chapter" n="2">
        <!-- -->
</div>

I need to transform it to this HTML structure:

<div>
<h1>Chapter 1</h1>
<div class="book-content">
 <p>
    <span class='source-language-sent' data-source-id='1'>sentence e1.</span>
    <span id='1' style='display:none'>sentence f1</span>
 </p>
 <p>
    <span class='source-language-sent' data-source-id='2'>sentence e2</span>
    <span id='2' style='display:none'>sentence f2</span>
 </p>
</div>
</div>
<div>
<h1>Chapter 2</h1>
<div class="book-content">
  <!-- -->
</div>
</div>

For now I use this XSLT file:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" version="1.0">
   <xsl:output method="html" encoding="UTF-8" indent="yes" />

   <xsl:template match="tei:body">
      <xsl:apply-templates />
   </xsl:template>

   <xsl:template match="tei:teiHeader">
      <xsl:comment>
         <xsl:apply-templates select="node()" />
      </xsl:comment>
   </xsl:template>

   <!--create chapter-->
   <xsl:template match="tei:div">
      <xsl:element name="div">
         <xsl:element name="div">
            <xsl:attribute name="class">
               <xsl:text>book-content</xsl:text>
            </xsl:attribute>
            <xsl:element name="h1">
               <xsl:text>Chapter </xsl:text>
               <xsl:value-of select="@n" />
            </xsl:element>
            <xsl:apply-templates select="node()" />
         </xsl:element>
      </xsl:element>
   </xsl:template>

   <!-- create p-->
   <xsl:template match="tei:p">
      <xsl:element name="p">
         <xsl:apply-templates />
      </xsl:element>
   </xsl:template>

   <!-- create s-->
   <xsl:template match="tei:s">
      <xsl:variable name="xmlid" select="@xml:id" />
      <xsl:if test="starts-with($xmlid, 'e')">
         <xsl:element name="span">
            <xsl:attribute name="class">
               <xsl:text>source-language-sent</xsl:text>
            </xsl:attribute>
            <xsl:attribute name="data-source-id">
               <xsl:value-of select="substring($xmlid, 3, 4)" />
            </xsl:attribute>
            <xsl:apply-templates select="node()" />
         </xsl:element>
      </xsl:if>
      <xsl:if test="starts-with($xmlid, 'f')">
         <xsl:element name="span">
            <xsl:attribute name="style">
               <xsl:text>display:none</xsl:text>
            </xsl:attribute>
            <xsl:attribute name="id">
               <xsl:value-of select="substring($xmlid, 3, 4)" />
            </xsl:attribute>
            <xsl:apply-templates select="node()" />
         </xsl:element>
      </xsl:if>
   </xsl:template>

</xsl:stylesheet>

My problem is that I need to start a new <div class="book-content"> roughly every 900 characters. I don't want to split my s elements, though, so I need to work out how many s elements to include in each <div class="book-content"> to end up with about 900 characters.


Answer 1:


This is an interesting problem, but your example has too many other things going on. I prefer to solve this in isolation, using my own example.

Consider the following input:

XML

<book>
    <chapter id="A">
        <para>
            <sentence id="1" length="23">Mary had a little lamb,</sentence>
            <sentence id="2" length="29">His fleece was white as snow,</sentence>
            <sentence id="3" length="30">And everywhere that Mary went,</sentence>
        </para>
        <para>
            <sentence id="4" length="24">The lamb was sure to go.</sentence>
            <sentence id="5" length="34">He followed her to school one day,</sentence>
        </para>
        <para>
            <sentence id="6" length="27">Which was against the rule,</sentence>
            <sentence id="7" length="35">It made the children laugh and play</sentence>
            <sentence id="8" length="24">To see a lamb at school.</sentence>
        </para>
        <para>
            <sentence id="9" length="34">And so the teacher turned it out, </sentence>
            <sentence id="10" length="27">But still it lingered near.</sentence>
        </para>
    </chapter>
    <chapter id="B">
        <para>
            <sentence id="11" length="35">Summertime, and the livin' is easy.</sentence>
            <sentence id="12" length="40">Fish are jumpin' and the cotton is high.</sentence>
            <sentence id="13" length="52">Oh, Your daddy's rich and your mamma's good lookin'.</sentence>
            <sentence id="14" length="35">So hush little baby, don't you cry.</sentence>
            <sentence id="15" length="54">One of these mornings you're going to rise up singing.</sentence>
        </para>
        <para>
            <sentence id="16" length="57">Then you'll spread your wings and you'll take to the sky.</sentence>
            <sentence id="17" length="35">So hush little baby, don't you cry.</sentence>
        </para>
    </chapter>
</book>

Note: the length values are given for illustration only; we will not be using them in the solution.

Our task is to split each chapter whose total length exceeds 200 characters into several chapters, by moving whole sentences only, while preserving the original para boundaries between groups of sentences.

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:exsl="http://exslt.org/common"
xmlns:set="http://exslt.org/sets"
extension-element-prefixes="exsl set">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="chapter">
    <xsl:call-template name="split-chapter">
        <xsl:with-param name="nodes" select="para/sentence"/>
    </xsl:call-template>
</xsl:template>

<xsl:template name="split-chapter">
    <xsl:param name="nodes"/>
    <xsl:param name="limit" select="200"/>
    <xsl:param name="remaining-nodes" select="dummy-node"/>
    <!-- 1. Calculate the total length of nodes -->
    <xsl:variable name="lengths">
        <xsl:for-each select="$nodes">
            <length>
                <xsl:value-of select="string-length()" />
            </length>
        </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="total-length" select="sum(exsl:node-set($lengths)/length)" />
    <!-- 2. Process the chapter: -->
    <xsl:choose>
        <!-- If chapter is too long and can be shortened ... -->
        <xsl:when test="$total-length > $limit and count($nodes) > 1">
            <!-- ... try again with one node less. -->
            <xsl:call-template name="split-chapter">
                <xsl:with-param name="nodes" select="$nodes[not(position()=last())]"/>
                <xsl:with-param name="remaining-nodes" select="$remaining-nodes | $nodes[last()]"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <!-- Otherwise create a chapter with the current nodes ... -->
            <chapter id="{@id}" length="{$total-length}" >
                <!-- ... list the paras participating in this chapter ... -->
                <xsl:for-each select="$nodes/parent::para">
                    <para>
                        <!-- ... and process the nodes still left in each para. -->
                        <xsl:apply-templates select="set:intersection(sentence, $nodes)"/>
                    </para>
                </xsl:for-each>
            </chapter>
            <!-- Then process any remaining nodes. -->
            <xsl:if test="$remaining-nodes">
                <xsl:call-template name="split-chapter">
                    <xsl:with-param name="nodes" select="$remaining-nodes"/>
                </xsl:call-template>
            </xsl:if>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="utf-8"?>
<book>
   <chapter id="A" length="167">
      <para>
         <sentence id="1" length="23">Mary had a little lamb,</sentence>
         <sentence id="2" length="29">His fleece was white as snow,</sentence>
         <sentence id="3" length="30">And everywhere that Mary went,</sentence>
      </para>
      <para>
         <sentence id="4" length="24">The lamb was sure to go.</sentence>
         <sentence id="5" length="34">He followed her to school one day,</sentence>
      </para>
      <para>
         <sentence id="6" length="27">Which was against the rule,</sentence>
      </para>
   </chapter>
   <chapter id="A" length="120">
      <para>
         <sentence id="7" length="35">It made the children laugh and play</sentence>
         <sentence id="8" length="24">To see a lamb at school.</sentence>
      </para>
      <para>
         <sentence id="9" length="34">And so the teacher turned it out, </sentence>
         <sentence id="10" length="27">But still it lingered near.</sentence>
      </para>
   </chapter>
   <chapter id="B" length="162">
      <para>
         <sentence id="11" length="35">Summertime, and the livin' is easy.</sentence>
         <sentence id="12" length="40">Fish are jumpin' and the cotton is high.</sentence>
         <sentence id="13" length="52">Oh, Your daddy's rich and your mamma's good lookin'.</sentence>
         <sentence id="14" length="35">So hush little baby, don't you cry.</sentence>
      </para>
   </chapter>
   <chapter id="B" length="146">
      <para>
         <sentence id="15" length="54">One of these mornings you're going to rise up singing.</sentence>
      </para>
      <para>
         <sentence id="16" length="57">Then you'll spread your wings and you'll take to the sky.</sentence>
         <sentence id="17" length="35">So hush little baby, don't you cry.</sentence>
      </para>
   </chapter>
</book>
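
The chapter lengths in the result can be cross-checked outside XSLT. The following is a minimal Python sketch, not part of the original answer, that applies the same rule to the sentence lengths listed in the input: pack consecutive sentences into a group while the running total stays within the limit, and let a single over-long sentence form a group of its own (mirroring the `count($nodes) > 1` guard). The function name `split_by_length` and the variable names are illustrative assumptions.

```python
def split_by_length(lengths, limit):
    """Greedily pack consecutive items into groups whose total stays within limit.

    An item longer than limit still gets a group of its own, which
    mirrors the count($nodes) > 1 guard in the stylesheet.
    """
    groups = []
    current = []
    for length in lengths:
        # Close the current group if adding this item would exceed the limit.
        if current and sum(current) + length > limit:
            groups.append(current)
            current = []
        current.append(length)
    if current:
        groups.append(current)
    return groups


# Sentence lengths taken from the length attributes in the sample input.
chapter_a = [23, 29, 30, 24, 34, 27, 35, 24, 34, 27]
chapter_b = [35, 40, 52, 35, 54, 57, 35]

totals_a = [sum(group) for group in split_by_length(chapter_a, 200)]
totals_b = [sum(group) for group in split_by_length(chapter_b, 200)]
print(totals_a)  # [167, 120]
print(totals_b)  # [162, 146]
```

The stylesheet reaches the same groups from the other direction: it starts with all sentences and drops the last one until the total fits, which yields the longest prefix that stays within the limit.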


Source: https://stackoverflow.com/questions/29866437/transform-xml-to-html-in-xslt-with-string-length-condition
