Does anyone know of an existing means of creating an XML hierarchy programmatically from an XPath expression?
For example if I have an XML fragment such as:
I know this is a really old thread, but I have just been trying the same thing and came up with the following regex. It is not perfect, but I find it more generic:
/+([\w]+)(\[@([\w]+)='([^']*)'\])?|/@([\w]+)
The string /configuration/appSettings/add[@key='name']/@value
should be parsed to
Found 4 match(es):
start=0, end=14 Group(0) = /configuration Group(1) = configuration Group(2) = null Group(3) = null Group(4) = null Group(5) = null
start=14, end=26 Group(0) = /appSettings Group(1) = appSettings Group(2) = null Group(3) = null Group(4) = null Group(5) = null
start=26, end=43 Group(0) = /add[@key='name'] Group(1) = add Group(2) = [@key='name'] Group(3) = key Group(4) = name Group(5) = null
start=43, end=50 Group(0) = /@value Group(1) = null Group(2) = null Group(3) = null Group(4) = null Group(5) = value
Which means we have
Group(0) = Ignored Group(1) = The element name Group(2) = Ignored Group(3) = Filter attribute name Group(4) = Filter attribute value Group(5) = The attribute name
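To make the group mapping concrete, here is a small sketch that runs the same pattern over the example string and turns each match into a readable step. The class name XPathStepDemo and the describe method are made up for illustration; only the regex itself comes from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XPathStepDemo {
    // Same pattern as above, escaped for a Java string literal.
    static final Pattern STEP =
        Pattern.compile("/+([\\w]+)(\\[@([\\w]+)='([^']*)'\\])?|/@([\\w]+)");

    // Returns one human-readable description per matched step (hypothetical helper).
    static List<String> describe(String xpath) {
        List<String> steps = new ArrayList<>();
        Matcher m = STEP.matcher(xpath);
        while (m.find()) {
            if (m.group(5) != null) {
                // Second alternative of the regex: an attribute step such as /@value
                steps.add("attribute " + m.group(5));
            } else if (m.group(3) != null) {
                // Element step with an [@name='value'] filter
                steps.add("element " + m.group(1)
                        + " where @" + m.group(3) + "='" + m.group(4) + "'");
            } else {
                // Plain element step
                steps.add("element " + m.group(1));
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        describe("/configuration/appSettings/add[@key='name']/@value")
            .forEach(System.out::println);
    }
}
```

Anything the pattern does not recognise (other predicates, axes, functions) is simply skipped by find(), so this only covers the small XPath subset discussed here.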
Here is a Java method that uses the pattern:
public static Node createNodeFromXPath(Document doc, String expression) throws XPathExpressionException {
    StringBuilder currentPath = new StringBuilder();
    Matcher matcher = xpathParserPattern.matcher(expression);
    // Start at the root element; fall back to the document itself while it is still empty,
    // so that the first appendChild below is not called on null.
    Node currentNode = doc.getDocumentElement() != null ? doc.getDocumentElement() : doc;
    while (matcher.find()) {
        String currentXPath = matcher.group(0);
        String elementName = matcher.group(1);
        String filterName = matcher.group(3);
        String filterValue = matcher.group(4);
        String attributeName = matcher.group(5);
        // The path matched so far, e.g. /configuration/appSettings
        StringBuilder builder = currentPath.append(currentXPath);
        String relativePath = builder.toString();
        Node newNode = selectSingleNode(doc, relativePath);
        if (newNode == null) {
            if (attributeName != null) {
                // Attribute step: create it empty on the current element
                ((Element) currentNode).setAttribute(attributeName, "");
                newNode = selectSingleNode(doc, relativePath);
            } else if (elementName != null) {
                // Element step: create it, including the filter attribute if there is one
                Element element = doc.createElement(elementName);
                if (filterName != null) {
                    element.setAttribute(filterName, filterValue);
                }
                currentNode.appendChild(element);
                newNode = element;
            } else {
                throw new UnsupportedOperationException("The given xPath is not supported " + relativePath);
            }
        }
        currentNode = newNode;
    }
    // Make sure the full expression now resolves
    if (selectSingleNode(doc, expression) == null) {
        throw new IllegalArgumentException("The given xPath cannot be created " + expression);
    }
    return currentNode;
}
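The method refers to xpathParserPattern and selectSingleNode, neither of which is shown in the post. A minimal sketch of what they might look like, assuming the standard javax.xml.xpath API and assuming selectSingleNode is meant to return the first match or null (the class name XPathSupport and the newDocument helper are my own additions):

```java
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class XPathSupport {
    // The regex from the answer, escaped for a Java string literal.
    static final Pattern xpathParserPattern =
        Pattern.compile("/+([\\w]+)(\\[@([\\w]+)='([^']*)'\\])?|/@([\\w]+)");

    // Assumed behaviour: the first node matching the expression, or null if there is none.
    static Node selectSingleNode(Node context, String expression) throws XPathExpressionException {
        return (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, context, XPathConstants.NODE);
    }

    // Convenience for trying things out: a new, empty DOM document.
    static Document newDocument() throws ParserConfigurationException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    }

    public static void main(String[] args) throws Exception {
        Document doc = newDocument();
        // appendChild returns the node it added, so the calls can be chained.
        doc.appendChild(doc.createElement("configuration"))
           .appendChild(doc.createElement("appSettings"));
        System.out.println(selectSingleNode(doc, "/configuration/appSettings").getNodeName());
        System.out.println(selectSingleNode(doc, "/configuration/missing"));
    }
}
```

Creating a new XPath object on every call keeps the sketch short; real code would probably cache it.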
In the example you present the only thing being created is the attribute ...
XmlElement element = (XmlElement)doc.SelectSingleNode("/feed/entry/content");
if (element != null)
    element.SetAttribute("source", "");
If what you really want is to be able to create the hierarchy where it doesn't exist, then you could write your own simple xpath parser. I don't know about keeping the attribute in the xpath though. I'd rather cast the node as an element and tack on a .SetAttribute as I've done here:
static private XmlNode makeXPath(XmlDocument doc, string xpath)
{
    return makeXPath(doc, doc as XmlNode, xpath);
}

static private XmlNode makeXPath(XmlDocument doc, XmlNode parent, string xpath)
{
    // grab the next node name in the xpath; or return parent if empty
    string[] partsOfXPath = xpath.Trim('/').Split('/');
    string nextNodeInXPath = partsOfXPath.First();
    if (string.IsNullOrEmpty(nextNodeInXPath))
        return parent;

    // get or create the node from the name
    XmlNode node = parent.SelectSingleNode(nextNodeInXPath);
    if (node == null)
        node = parent.AppendChild(doc.CreateElement(nextNodeInXPath));

    // rejoin the remainder of the array as an xpath expression and recurse
    string rest = String.Join("/", partsOfXPath.Skip(1).ToArray());
    return makeXPath(doc, node, rest);
}

static void Main(string[] args)
{
    XmlDocument doc = new XmlDocument();
    doc.LoadXml("<feed />");

    makeXPath(doc, "/feed/entry/data");
    XmlElement contentElement = (XmlElement)makeXPath(doc, "/feed/entry/content");
    contentElement.SetAttribute("source", "");

    Console.WriteLine(doc.OuterXml);
}
One problem with this idea is that xpath "destroys" information.
There are an infinite number of xml trees that can match many xpaths. Now in some cases, like the example you give, there is an obvious minimal xml tree which matches your xpath, where you have a predicate that uses "=".
But for example if the predicate uses not equal, or any other arithmetic operator other than equal, an infinite number of possibilities exist. You could try to choose a "canonical" xml tree which requires, say, the fewest bits to represent.
Suppose, for example, you had the xpath /feed/entry/content[@source > 0]. Now any xml tree of the appropriate structure in which the node content had an attribute source whose value was > 0 would match, but there are an infinite number of numbers greater than zero. By choosing the "minimal" value, presumably 1, you could attempt to canonicalize your xml.
Xpath predicates can contain pretty arbitrary arithmetic expressions, so the general solution to this is quite difficult, if not impossible. You could imagine a huge equation in there, and it would have to be solved in reverse to come up with values that would match the equation; but since there can be an infinite number of matching values (as long as it's really an inequality not an equation), a canonical solution would need to be found.
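For the very simplest case, a single comparison of an attribute against an integer constant, picking a canonical value can be sketched as below. This is only an illustration under strong assumptions: the attribute is integer-valued, "minimal" is taken to mean "closest to the constant", and the class and method names are invented. It says nothing about general arithmetic expressions.

```java
final class CanonicalPredicateValue {
    // Hypothetical helper: picks the integer closest to `bound` that satisfies
    // "@attr <op> bound", assuming the attribute holds an integer.
    static long pick(String op, long bound) {
        switch (op) {
            case "=":
            case ">=":
            case "<=":
                return bound;                      // the bound itself already matches
            case ">":
                return Math.addExact(bound, 1);    // smallest integer above the bound
            case "<":
                return Math.subtractExact(bound, 1); // largest integer below the bound
            case "!=":
                // bound - 1 and bound + 1 are equally close; choosing + 1 is arbitrary
                return Math.addExact(bound, 1);
            default:
                throw new IllegalArgumentException("Unsupported operator: " + op);
        }
    }

    public static void main(String[] args) {
        // For /feed/entry/content[@source > 0] this yields the value suggested above.
        System.out.println(pick(">", 0));
    }
}
```

The "!=" case already shows the problem: there is no single obvious answer, so the choice has to be made by convention.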
Many expressions of other forms also destroy information. For example, an operator like "or" always destroys information. If you know that (X or Y) == 1, you don't know whether X is 1, Y is 1, or both are 1; all you know for sure is that at least one of them is 1! Therefore, if you have an expression using OR, you cannot tell which of the nodes or values that are inputs to the OR should be 1 (you can make an arbitrary choice and set both to 1, as that will satisfy the expression for sure, as will the two choices in which only one of them is 1).
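The ambiguity is easy to see by brute force. The sketch below (names are illustrative only) enumerates every boolean assignment that satisfies "X or Y"; any one of them would be a valid choice when building a matching tree.

```java
import java.util.ArrayList;
import java.util.List;

final class OrAmbiguity {
    // Lists every assignment of (X, Y) for which "X or Y" is true.
    static List<String> satisfyingAssignments() {
        List<String> result = new ArrayList<>();
        boolean[] values = {false, true};
        for (boolean x : values) {
            for (boolean y : values) {
                if (x || y) {
                    result.add("X=" + x + ", Y=" + y);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Three of the four possible assignments satisfy the expression.
        satisfyingAssignments().forEach(System.out::println);
    }
}
```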
Now suppose there are several expressions in the xpath which refer to the same set of values. You then end up with a system of simultaneous equations or inequalities that can be virtually impossible to solve. Again, if you restrict the allowable xpath to a small subset of its full power, you can solve this problem. I suspect the fully general case is similar to the Turing halting problem, however; in this case, given an arbitrary program (the xpath), figure out a set of consistent data that matches the program, and is in some sense minimal.
If the XPath string is processed from back to front, it's easier to process non-rooted XPaths, e.g. //a/b/c... It should support Gordon's XPath syntax too, although I have not tried...
static private XmlNode makeXPath(XmlDocument doc, string xpath)
{
    string[] partsOfXPath = xpath.Split('/');
    XmlNode node = null;
    // try the longest prefix first, then shorter ones, until one exists
    for (int xpathPos = partsOfXPath.Length; xpathPos > 0; xpathPos--)
    {
        string subXpath = string.Join("/", partsOfXPath, 0, xpathPos);
        // a rooted path with nothing matched leaves an empty prefix; treat it as the document node
        if (subXpath.Length == 0)
            subXpath = "/";
        node = doc.SelectSingleNode(subXpath);
        if (node != null)
        {
            // append new descendants
            for (int newXpathPos = xpathPos; newXpathPos < partsOfXPath.Length; newXpathPos++)
            {
                node = node.AppendChild(doc.CreateElement(partsOfXPath[newXpathPos]));
            }
            break;
        }
    }
    return node;
}
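For Java readers, the same back-to-front idea can be sketched with the standard javax.xml.xpath API. This is my own rough port, not a tested equivalent of the C# above: it assumes the path consists of plain element names with at most a leading "//", and it maps an empty prefix to "/" so that a rooted path can be created in an empty document. The class name BackToFrontXPath is made up.

```java
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class BackToFrontXPath {
    // Finds the longest prefix of the path that already exists and appends the rest below it.
    static Node makeXPath(Document doc, String xpath) throws XPathExpressionException {
        String[] parts = xpath.split("/");
        XPath xp = XPathFactory.newInstance().newXPath();
        for (int end = parts.length; end > 0; end--) {
            String prefix = String.join("/", Arrays.copyOfRange(parts, 0, end));
            if (prefix.isEmpty()) {
                prefix = "/"; // only the leading "" is left: start from the document node
            }
            Node node = (Node) xp.evaluate(prefix, doc, XPathConstants.NODE);
            if (node != null) {
                // Append the steps that were not found, in order.
                for (int i = end; i < parts.length; i++) {
                    node = node.appendChild(doc.createElement(parts[i]));
                }
                return node;
            }
        }
        return null; // e.g. a relative path whose first step does not exist
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        makeXPath(doc, "/feed/entry/content");
        Node data = makeXPath(doc, "//entry/data"); // reuses the existing entry element
        System.out.println(data.getParentNode().getNodeName());
    }
}
```

As with the C# version, steps containing predicates or attributes would be passed straight to createElement and fail, so this only works for simple element paths.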
Here is my version. I hope it helps someone.
public static void Main(string[] args)
{
    XmlDocument doc = new XmlDocument();
    XmlNode rootNode = GenerateXPathXmlElements(doc, "/RootNode/FirstChild/SecondChild/ThirdChild");
    Console.Write(rootNode.OuterXml);
}

private static XmlDocument GenerateXPathXmlElements(XmlDocument xmlDocument, string xpath)
{
    XmlNode parentNode = xmlDocument;
    if (xmlDocument != null && !string.IsNullOrEmpty(xpath))
    {
        string[] partsOfXPath = xpath.Split('/');
        string xPathSoFar = string.Empty;
        foreach (string xPathElement in partsOfXPath)
        {
            if (string.IsNullOrEmpty(xPathElement))
                continue;
            xPathSoFar += "/" + xPathElement.Trim();
            XmlNode childNode = xmlDocument.SelectSingleNode(xPathSoFar);
            if (childNode == null)
            {
                // Only create and attach the element when it does not exist yet;
                // re-appending an existing node would move it to the end of its parent.
                childNode = parentNode.AppendChild(xmlDocument.CreateElement(xPathElement.Trim()));
            }
            parentNode = childNode;
        }
    }
    return xmlDocument;
}
The C# version of Mark Miller's Java solution
// Requires: using System; using System.Text; using System.Text.RegularExpressions; using System.Xml;
/// <summary>
/// Creates the node described by the XPath if it does not exist yet.
/// Use a format like //configuration/appSettings/add[@key='name']/@value
/// </summary>
/// <param name="doc">The document to create the node in.</param>
/// <param name="xpath">The XPath of the node to find or create.</param>
/// <returns>The node that was found or created.</returns>
public static XmlNode createNodeFromXPath(XmlDocument doc, string xpath)
{
    // Create a new Regex object
    Regex r = new Regex(@"/+([\w]+)(\[@([\w]+)='([^']*)'\])?|/@([\w]+)");

    // Find matches
    Match m = r.Match(xpath);

    // Start at the root element; fall back to the document itself while it is still empty
    XmlNode currentNode = (XmlNode)doc.DocumentElement ?? doc;
    StringBuilder currentPath = new StringBuilder();

    while (m.Success)
    {
        String currentXPath = m.Groups[0].Value;    // "/configuration" or "/appSettings" or "/add[@key='name']"
        String elementName = m.Groups[1].Value;     // "configuration" or "appSettings" or "add"
        String filterName = m.Groups[3].Value;      // "" or "key"
        String filterValue = m.Groups[4].Value;     // "" or "name"
        String attributeName = m.Groups[5].Value;   // "" or "value"

        StringBuilder builder = currentPath.Append(currentXPath);
        String relativePath = builder.ToString();
        XmlNode newNode = doc.SelectSingleNode(relativePath);

        if (newNode == null)
        {
            if (!string.IsNullOrEmpty(attributeName))
            {
                ((XmlElement)currentNode).SetAttribute(attributeName, "");
                newNode = doc.SelectSingleNode(relativePath);
            }
            else if (!string.IsNullOrEmpty(elementName))
            {
                XmlElement element = doc.CreateElement(elementName);
                if (!string.IsNullOrEmpty(filterName))
                {
                    element.SetAttribute(filterName, filterValue);
                }

                currentNode.AppendChild(element);
                newNode = element;
            }
            else
            {
                throw new FormatException("The given xPath is not supported " + relativePath);
            }
        }

        currentNode = newNode;
        m = m.NextMatch();
    }

    // Assure that the node is found or created
    if (doc.SelectSingleNode(xpath) == null)
    {
        throw new FormatException("The given xPath cannot be created " + xpath);
    }

    return currentNode;
}