I have a plain text file structured like this:
!ITEM_NAME
Item value
!ANOTHER_ITEM
Its value
...
Is it possible to get with XSLT a file sim
This XSLT 2.0 transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <!-- Read the text file as a string and drop carriage returns (CRLF becomes LF) -->
  <xsl:variable name="vText" select=
    "replace(unparsed-text('file:///c:/temp/delete/text.txt'),'\r','')"/>
  <xsl:template match="/">
    <document>
      <!-- Group 2 is the name after "!", group 3 is the value on the following line -->
      <xsl:analyze-string select="$vText" regex="(!(.+?)\n([^\n]+))+">
        <xsl:matching-substring>
          <xsl:element name="{regex-group(2)}">
            <xsl:sequence select="regex-group(3)"/>
          </xsl:element>
        </xsl:matching-substring>
        <!-- Text that is not part of a name/value pair is copied through unchanged -->
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </document>
  </xsl:template>
</xsl:stylesheet>
when applied to any XML document (not used), with the provided text residing in the local file C:\temp\delete\Text.txt:
!ITEM_NAME
Item value
!ANOTHER_ITEM
Its value
...
produces the wanted, correct result:
<document>
  <ITEM_NAME>Item value</ITEM_NAME>
  <ANOTHER_ITEM>Its value</ANOTHER_ITEM>
  ...
</document>
To test more completely, we put this text in the file:
As is text
!ITEM_NAME
Item value
!ANOTHER_ITEM
Its value
As is text2
!TEST_BANG
Here's a value with !bangs!!!
!TEST2_BANG
 !!!Here's a value with !more~ !bangs!!!
As is text3
The transformation again produces the wanted, correct result:
<document>As is text
<ITEM_NAME>Item value</ITEM_NAME>
<ANOTHER_ITEM>Its value</ANOTHER_ITEM>
As is text2
<TEST_BANG>Here's a value with !bangs!!!</TEST_BANG>
<TEST2_BANG> !!!Here's a value with !more~ !bangs!!!</TEST2_BANG>
As is text3
</document>
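One caveat with computed element names: xsl:element raises an error when the name is not a valid XML name, for example if a line read "!ITEM NAME" with a space in it. The following is only a sketch of one possible guard, not part of the answer above. It assumes that such pairs should simply be copied through as text, and that a plain ASCII name check is good enough; the variable vNamePattern is a name introduced here for illustration. The outer repeating group of the regex is dropped, so the name and the value are groups 1 and 2.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Same file location as in the answer above; carriage returns are removed -->
  <xsl:variable name="vText" select=
    "replace(unparsed-text('file:///c:/temp/delete/text.txt'),'\r','')"/>

  <!-- Assumption: only plain ASCII names are accepted as element names -->
  <xsl:variable name="vNamePattern" select="'^[A-Za-z_][A-Za-z0-9_.\-]*$'"/>

  <xsl:template match="/">
    <document>
      <xsl:analyze-string select="$vText" regex="!(.+?)\n([^\n]+)">
        <xsl:matching-substring>
          <xsl:choose>
            <!-- Usable name: build the element as before -->
            <xsl:when test="matches(regex-group(1), $vNamePattern)">
              <xsl:element name="{regex-group(1)}">
                <xsl:sequence select="regex-group(2)"/>
              </xsl:element>
            </xsl:when>
            <!-- Unusable name: keep the matched text as it is -->
            <xsl:otherwise>
              <xsl:sequence select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

The ASCII pattern is deliberately stricter than the XML name rules, so a legal non-ASCII name would also be passed through as text. Loosen the pattern if your item names need it.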
If you can use XSLT 2.0, you could use unparsed-text()...
Text File (Do not use the text file as direct input to the XSLT.)
!ITEM_NAME
Item value
!ANOTHER_ITEM
Its value
!TEST_BANG
Here's a value with !bangs!!!
XSLT 2.0 (Apply this XSLT to itself, i.e. use the stylesheet as the XML input. You'll also have to change the path to your text file, and you might have to change the encoding too.)
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Change these two parameters to match your text file -->
  <xsl:param name="text-encoding" as="xs:string" select="'iso-8859-1'"/>
  <xsl:param name="text-uri" as="xs:string" select="'file:///C:/Users/dhaley/Desktop/test.txt'"/>
  <!-- Turn each "!NAME" line and the line after it into <NAME>value</NAME> -->
  <xsl:template name="text2xml">
    <xsl:variable name="text" select="unparsed-text($text-uri, $text-encoding)"/>
    <xsl:analyze-string select="$text" regex="!(.*)\n(.*)">
      <xsl:matching-substring>
        <!-- normalize-space() also trims a trailing carriage return -->
        <xsl:element name="{normalize-space(regex-group(1))}">
          <xsl:value-of select="normalize-space(regex-group(2))"/>
        </xsl:element>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
  <xsl:template match="/">
    <document>
      <xsl:choose>
        <xsl:when test="unparsed-text-available($text-uri, $text-encoding)">
          <xsl:call-template name="text2xml"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- The file could not be read: report it as a message and in the output -->
          <xsl:variable name="error">
            <xsl:text>Error reading "</xsl:text>
            <xsl:value-of select="$text-uri"/>
            <xsl:text>" (encoding "</xsl:text>
            <xsl:value-of select="$text-encoding"/>
            <xsl:text>").</xsl:text>
          </xsl:variable>
          <xsl:message><xsl:value-of select="$error"/></xsl:message>
          <xsl:value-of select="$error"/>
        </xsl:otherwise>
      </xsl:choose>
    </document>
  </xsl:template>
</xsl:stylesheet>
XML Output
<document>
  <ITEM_NAME>Item value</ITEM_NAME>
  <ANOTHER_ITEM>Its value</ANOTHER_ITEM>
  <TEST_BANG>Here's a value with !bangs!!!</TEST_BANG>
</document>
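Since text-uri and text-encoding are declared as global xsl:param elements, they can normally be overridden when the transformation is invoked, so the stylesheet itself does not need editing. A named entry template also removes the need for a dummy XML input. The following is only a sketch under assumptions: the stylesheet above is saved as text2xml.xsl (a hypothetical file name) next to this wrapper, the wrapper is saved as main.xsl, and the entry template is called main. It skips the unparsed-text-available() check for brevity.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: the stylesheet shown above is saved as text2xml.xsl in the same folder -->
  <xsl:import href="text2xml.xsl"/>

  <!-- Hypothetical entry point, started by name instead of by matching an input document -->
  <xsl:template name="main">
    <document>
      <!-- Reuses the named template and the two parameters of the imported stylesheet -->
      <xsl:call-template name="text2xml"/>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

With Saxon, for example, this could be run as something like java -jar saxon9he.jar -xsl:main.xsl -it:main text-uri=file:///C:/path/to/test.txt text-encoding=utf-8, where the jar name and the path are placeholders. Other processors have their own ways of naming the initial template and passing parameters.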