In a given XML file, I'm trying to search for the presence of a string using XPath
in Java. However, even though the string is there, my output is always comi
I tested this...
expr = xpath.compile("/article/body/section/region[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'perfect')]");
Against
<article>
    <body>
        <section>
            <h1>intro1</h1>
            <region>perfect</region>
            <region>Perfect</region>
        </section>
        <section>
            <h1 class="pass">1 task objectives</h1>
            <region>pErFeCt</region>
            <region>Not Perfect</region>
        </section>
        <section>
            <h1 class="pass">1 task objectives</h1>
            <region>object1</region>
            <region>This is the Perfect Word I am looking for</region>
        </section>
    </body>
</article>
Using...
import java.io.File;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class TestXML05 {

    public static void main(String[] args) {
        try {
            // Parse the sample document from the working directory
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder().parse(new File("Sample.xml"));

            XPathFactory xFactory = XPathFactory.newInstance();
            XPath xPath = xFactory.newXPath();
            // translate() lower-cases A-Z so that contains() effectively ignores case
            XPathExpression exp = xPath.compile("/article/body/section/region[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'perfect')]");

            // The path is absolute, so it is resolved from the document root
            NodeList nl = (NodeList) exp.evaluate(doc.getFirstChild(), XPathConstants.NODESET);
            for (int index = 0; index < nl.getLength(); index++) {
                Node node = nl.item(index);
                System.out.println(node.getTextContent());
            }
        } catch (Exception ex) {
            Logger.getLogger(TestXML05.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
Which outputted...
perfect
Perfect
pErFeCt
Not Perfect
This is the Perfect Word I am looking for
XML and XPath are case-sensitive, so to match the capitalised word your XPath should be:
//article//body//section//region[contains(., 'Perfect')]
To make the match case-insensitive, lower-case the node text with translate() before comparing:
//article//body//section//region[
contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),
'perfect')
]
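If the search term comes from user input rather than being hard-coded, something like the following might be a reasonable starting point. It is only a sketch: the class name CaseInsensitiveXPath, the method findRegions and the XPath variable $needle are made up for illustration and are not part of the original code. It parses the XML from a string instead of Sample.xml so it can be run on its own, and it passes the lower-cased search term in through an XPathVariableResolver rather than concatenating it into the expression, so quotes in the term cannot break the XPath. Note that translate() as written only folds A-Z, so non-ASCII letters are still compared case-sensitively.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
class CaseInsensitiveXPath {

    // Same translate()-based expression as above, with the literal replaced by $needle.
    private static final String EXPRESSION =
            "/article/body/section/region[contains("
            + "translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), "
            + "$needle)]";

    // Returns the text of every <region> whose content contains needle, ignoring ASCII case.
    static List<String> findRegions(String xml, String needle) {
        // The node text is lower-cased by translate(), so the search term must be too.
        final String lowered = needle.toLowerCase(Locale.ROOT);

        XPath xPath = XPathFactory.newInstance().newXPath();
        // Resolve $needle to the lower-cased search term; any other variable is unknown.
        xPath.setXPathVariableResolver(
                name -> "needle".equals(name.getLocalPart()) ? lowered : null);

        try {
            NodeList nodes = (NodeList) xPath.evaluate(
                    EXPRESSION, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException ex) {
            throw new IllegalStateException("XPath evaluation failed", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<article><body><section>"
                + "<region>pErFeCt</region>"
                + "<region>object1</region>"
                + "<region>Not Perfect</region>"
                + "</section></body></article>";
        // Prints each matching region's text on its own line.
        for (String text : findRegions(xml, "PERFECT")) {
            System.out.println(text);
        }
    }
}
```

Because contains() is a substring test, "Not Perfect" matches as well; if only exact matches are wanted, compare the translated text with = instead of using contains().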