I need to extract the value that follows the last colon on each line of data. For example, I have this file:
:21:Not used
I wrote this answer to your original post but didn't post it because it was essentially similar to the one posted by zx485.
However, I still recommend using a key to retrieve the corresponding element name (and I also think the regex can be simpler and more robust).
I have added a tokenizing step that splits the data into separate <Data> wrappers wherever two line feeds occur in a row.
XSLT 2.0
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<xsl:variable name="map">
    <name key="20">ABCD</name>
    <name key="21">EFGH</name>
    <name key="25">IJKL</name>
    <name key="28C">MNOP</name>
</xsl:variable>
<xsl:key name="nm" match="name" use="@key" />
<xsl:template match="Data">
    <xsl:for-each select="tokenize(., '\n\n')">
        <Data>
            <xsl:analyze-string select="." regex="^:([^:]*):(.*)$" flags="m">
                <xsl:matching-substring>
                    <xsl:element name="{key('nm', upper-case(regex-group(1)), $map)}">
                        <xsl:value-of select="regex-group(2)" />
                    </xsl:element>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </Data>
    </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Demo: http://xsltransform.net/ehVYZNm
Is there an easy way to get the value after each last colon, because there are so many keys[...]
Yes. You can use RegEx matching.
In the following template, regex-group(2) contains the string after the second (last) colon, and regex-group(1) contains the key.
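<xsl:template match="Data">
    <xsl:analyze-string select="." regex=":([0-9A-Za-z]+):(.*)\n">
        <!-- group 1 is the key, group 2 is the value; one output line per match -->
        <xsl:matching-substring>(<xsl:value-of select="regex-group(1)" /> --- <xsl:value-of select="regex-group(2)" />)<xsl:text>&#10;</xsl:text></xsl:matching-substring>
    </xsl:analyze-string>
</xsl:template>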
Partial output:
(20 --- PmtReferenceID000012)
(21 --- Not used)
(25 --- PHMNLBICXXX/Account00010203)
(28c --- 00001/0001 (The 'c' in :28 can be either in upper or lower case))
With that, you can create a key/value dictionary that maps each key to the tag wrapped around its text.
For example, you can store the mapping in a variable inside the XSL file:
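<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="Mapping">
    <Map key="20">ABCD</Map>
    <Map key="21">EFGH</Map>
    <Map key="25">IJKL</Map>
    <Map key="28c">MNOP</Map>
</xsl:variable>
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<xsl:template match="Data">
    <Data>
        <xsl:analyze-string select="." regex=":([0-9A-Za-z]+):(.*)\n">
            <xsl:matching-substring>
                <xsl:element name="{$Mapping/Map[@key = lower-case(regex-group(1))]/text()}"><xsl:value-of select="regex-group(2)" /></xsl:element>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </Data>
</xsl:template>
</xsl:stylesheet>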
Full output:
<?xml version="1.0" encoding="UTF-8"?>
<Data xmlns:xs="http://www.w3.org/2001/XMLSchema"
<EFGH>Not used</EFGH>
<IJKL> PHMNLBICXXX/Account00010203</IJKL>
<MNOP>00001/0001 (The 'c' in :28 can be either in upper or lower case)</MNOP>
Or you could literally outsource the mapping and create a separate file for them...
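A minimal sketch of that idea, assuming the mapping lives in a file called mapping.xml next to the stylesheet (the file name and its <Mapping> root element are made up for illustration). A relative URI passed to document() as a string is resolved against the stylesheet's base URI. The keys in the file are assumed to be lower case, so the captured key is lower-cased before the lookup.

```xml
<!-- Assumed content of mapping.xml (hypothetical file):
     <Mapping>
         <Map key="20">ABCD</Map>
         <Map key="21">EFGH</Map>
         <Map key="25">IJKL</Map>
         <Map key="28c">MNOP</Map>
     </Mapping>
-->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <!-- Load the external mapping once; the path is relative to this stylesheet -->
    <xsl:variable name="Mapping" select="document('mapping.xml')/Mapping"/>

    <xsl:template match="Data">
        <Data>
            <xsl:analyze-string select="." regex=":([0-9A-Za-z]+):(.*)\n">
                <xsl:matching-substring>
                    <!-- Normalize the key so that 28C and 28c find the same entry -->
                    <xsl:variable name="k" select="lower-case(regex-group(1))"/>
                    <xsl:element name="{$Mapping/Map[@key = $k]}">
                        <xsl:value-of select="regex-group(2)"/>
                    </xsl:element>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </Data>
    </xsl:template>
</xsl:stylesheet>
```

Keeping the mapping in its own file means new keys can be added without touching the stylesheet. A key that is missing from the file still produces an empty element name, which is an error, so the file has to cover every key that occurs in the data.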
In XSLT 3.0 you could write a template for each kind of string, e.g.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math" exclude-result-prefixes="xs math"
    version="3.0">
    <xsl:output indent="yes"/>
    <!-- split the text into non-blank lines and dispatch each string to the template that matches it -->
    <xsl:template match="Data">
        <xsl:apply-templates select="tokenize(., '\r?\n')[normalize-space()]"/>
    </xsl:template>
    <xsl:template match=".[. instance of xs:string and matches(., '^:20:')]">
        <ABCD><xsl:value-of select="replace(., '^:20:', '')"/></ABCD>
    </xsl:template>
    <xsl:template match=".[. instance of xs:string and matches(., '^:21:')]">
        <EFGH><xsl:value-of select="replace(., '^:21:', '')"/></EFGH>
    </xsl:template>
    <xsl:template match=".[. instance of xs:string and matches(., '^:25:')]">
        <IJKL><xsl:value-of select="replace(., '^:25:', '')"/></IJKL>
    </xsl:template>
    <xsl:template match=".[. instance of xs:string and matches(., '^:28c:', 'i')]">
        <MNOP><xsl:value-of select="replace(., '^:28c:', '', 'i')"/></MNOP>
    </xsl:template>
</xsl:stylesheet>
With Saxon 9.8 or Altova XMLSpy/Raptor that does the job and outputs
<EFGH>Not used</EFGH>
<IJKL> PHMNLBICXXX/Account00010203</IJKL>
for the input
:21:Not used
:25: PHMNLBICXXX/Account00010203
As an alternative, instead of tokenizing and processing strings, you could use the analyze-string function and match on the fn:match elements it returns:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs fn math"
    version="3.0">
    <xsl:output indent="yes"/>
    <xsl:template match="Data">
        <xsl:apply-templates select="analyze-string(., '^(:[0-9]+[a-z]*:)(.*)\r?\n', 'im')//fn:match"/>
    </xsl:template>
    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':20:']]">
        <ABCD><xsl:value-of select="fn:group[@nr = 2]"/></ABCD>
    </xsl:template>
    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':21:']]">
        <EFGH><xsl:value-of select="fn:group[@nr = 2]"/></EFGH>
    </xsl:template>
    <xsl:template match="fn:match[fn:group[@nr = 1][. = ':25:']]">
        <IJKL><xsl:value-of select="fn:group[@nr = 2]"/></IJKL>
    </xsl:template>
    <xsl:template match="fn:match[fn:group[@nr = 1][matches(., '^:28c:', 'i')]]">
        <MNOP><xsl:value-of select="fn:group[@nr = 2]"/></MNOP>
    </xsl:template>
</xsl:stylesheet>
Finally, taking up the idea of a map parameter to define the element names, the second solution can be shortened to
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    exclude-result-prefixes="xs fn math"
    version="3.0">
    <xsl:param name="map" as="map(xs:string, xs:string)"
        select="map {
            '20' : 'ABCD',
            '21' : 'EFGH',
            '25' : 'IJKL',
            '28c' : 'MNOP'
        }"/>
    <xsl:output indent="yes"/>
    <xsl:template match="Data">
        <xsl:apply-templates select="analyze-string(., '^(:([0-9]+[a-z]*):)(.*)\r?\n', 'im')//fn:match" mode="wrap"/>
    </xsl:template>
    <!-- group 2 is the bare key (nested inside group 1), group 3 is the value -->
    <xsl:template match="match" mode="wrap" xpath-default-namespace="http://www.w3.org/2005/xpath-functions">
        <xsl:element name="{$map(lower-case(.//group[@nr = 2]))}">
            <xsl:value-of select="group[@nr = 3]"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
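One thing to watch with the map lookup: for a key that is not in the map, $map(...) returns an empty sequence, so the name attribute of xsl:element evaluates to an empty string and the transformation fails. A possible guard, sketched here with a hypothetical fallback-name parameter that is not part of the original solution, is to fall back to a default element name:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="xs fn"
    version="3.0">

    <xsl:param name="map" as="map(xs:string, xs:string)"
        select="map { '20' : 'ABCD', '21' : 'EFGH', '25' : 'IJKL', '28c' : 'MNOP' }"/>

    <!-- Hypothetical parameter: element name used for keys missing from $map -->
    <xsl:param name="fallback-name" as="xs:string" select="'Unknown'"/>

    <xsl:output indent="yes"/>

    <xsl:template match="Data">
        <xsl:apply-templates
            select="analyze-string(., '^(:([0-9]+[a-z]*):)(.*)\r?\n', 'im')//fn:match"
            mode="wrap"/>
    </xsl:template>

    <xsl:template match="match" mode="wrap"
        xpath-default-namespace="http://www.w3.org/2005/xpath-functions">
        <xsl:variable name="code" select="lower-case(.//group[@nr = 2])"/>
        <!-- Take the mapped name if there is one, otherwise the fallback -->
        <xsl:element name="{($map($code), $fallback-name)[1]}">
            <xsl:value-of select="group[@nr = 3]"/>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
```

Whether a fallback element or a hard failure is preferable depends on the data: a failure surfaces unexpected keys immediately, while the fallback keeps the transformation running.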