Create XML Nodes based on XPath?

Solution 1

In the example you present, the only thing being created is the attribute:

XmlElement element = (XmlElement)doc.SelectSingleNode("/feed/entry/content");
if (element != null)
    element.SetAttribute("source", "");

If what you really want is to be able to create the hierarchy where it doesn't exist, then you could write your own simple xpath parser. I don't know about keeping the attribute in the xpath, though. I'd rather cast the node as an element and tack on a .SetAttribute, as I've done here:


// requires: using System; using System.Linq; using System.Xml;
static private XmlNode makeXPath(XmlDocument doc, string xpath)
{
    return makeXPath(doc, doc as XmlNode, xpath);
}

static private XmlNode makeXPath(XmlDocument doc, XmlNode parent, string xpath)
{
    // grab the next node name in the xpath; or return parent if empty
    string[] partsOfXPath = xpath.Trim('/').Split('/');
    string nextNodeInXPath = partsOfXPath.First();
    if (string.IsNullOrEmpty(nextNodeInXPath))
        return parent;

    // get or create the node from the name
    XmlNode node = parent.SelectSingleNode(nextNodeInXPath);
    if (node == null)
        node = parent.AppendChild(doc.CreateElement(nextNodeInXPath));

    // rejoin the remainder of the array as an xpath expression and recurse
    string rest = String.Join("/", partsOfXPath.Skip(1).ToArray());
    return makeXPath(doc, node, rest);
}

static void Main(string[] args)
{
    XmlDocument doc = new XmlDocument();
    doc.LoadXml("<feed />");

    makeXPath(doc, "/feed/entry/data");
    XmlElement contentElement = (XmlElement)makeXPath(doc, "/feed/entry/content");
    contentElement.SetAttribute("source", "");

    Console.WriteLine(doc.OuterXml);
}

Solution 2

Here's my quick hack that can also create attributes as long as you use a format like /configuration/appSettings/add[@key='name']/@value.

static XmlNode createXPath(XmlDocument doc, string xpath)
{
  XmlNode node=doc;
  foreach (string part in xpath.Substring(1).Split('/'))
  {
    XmlNodeList nodes=node.SelectNodes(part);
    if (nodes.Count>1) throw new ComponentException("Xpath '"+xpath+"' was found multiple times!");
    else if (nodes.Count==1) { node=nodes[0]; continue; }

    if (part.StartsWith("@"))
    {
      var anode=doc.CreateAttribute(part.Substring(1));
      node.Attributes.Append(anode);
      node=anode;
    }
    else
    {
      string elName, attrib=null;
      if (part.Contains("["))
      {
        part.SplitOnce("[", out elName, out attrib);
        if (!attrib.EndsWith("]")) throw new ComponentException("Unsupported XPath (missing ]): "+part);
        attrib=attrib.Substring(0, attrib.Length-1);
      }
      else elName=part;

      XmlNode next=doc.CreateElement(elName);
      node.AppendChild(next);
      node=next;

      if (attrib!=null)
      {
        if (!attrib.StartsWith("@")) throw new ComponentException("Unsupported XPath attrib (missing @): "+part);
        string name, value;
        attrib.Substring(1).SplitOnce("='", out name, out value);
        if (string.IsNullOrEmpty(value) || !value.EndsWith("'")) throw new ComponentException("Unsupported XPath attrib: "+part);
        value=value.Substring(0, value.Length-1);
        var anode=doc.CreateAttribute(name);
        anode.Value=value;
        node.Attributes.Append(anode);
      }
    }
  }
  return node;
}

SplitOnce is an extension method:

public static void SplitOnce(this string value, string separator, out string part1, out string part2)
{
  if (value!=null)
  {
    int idx=value.IndexOf(separator);
    if (idx>=0)
    {
      part1=value.Substring(0, idx);
      part2=value.Substring(idx+separator.Length);
    }
    else
    {
      part1=value;
      part2=null;
    }
  }
  else
  {
    part1="";
    part2=null;
  }
}

Sample:

public static void Set(XmlDocument doc, string xpath, string value)
{
  if (doc==null) throw new ArgumentNullException("doc");
  if (string.IsNullOrEmpty(xpath)) throw new ArgumentNullException("xpath");

  XmlNodeList nodes=doc.SelectNodes(xpath);
  if (nodes.Count>1) throw new ComponentException("Xpath '"+xpath+"' was found multiple times!");
  else if (nodes.Count==0) createXPath(doc, xpath).InnerText=value;
  else nodes[0].InnerText=value;
}

e.g.

Set(doc, "/configuration/appSettings/add[@key='Server']/@value", "foobar");

Solution 3

One problem with this idea is that xpath "destroys" information.

For many xpaths there are an infinite number of xml trees that match. In some cases, like the example you give or a predicate that uses "=", there is an obvious minimal xml tree which matches your xpath.

But if, for example, the predicate uses "not equal", or any comparison operator other than "equal", an infinite number of possibilities exist. You could try to choose a "canonical" xml tree which requires, say, the fewest bits to represent.

Suppose for example you had xpath /feed/entry/content[@source > 0]. Now any xml tree of the appropriate structure in which node content had an attribute source whose value was > 0 would match, but there are an infinite number of numbers greater than zero. By choosing the "minimal" value, presumably 1, you could attempt to canonicalize your xml.
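
To make the ambiguity concrete, here is a minimal sketch that checks a few hand-written trees against that expression using XmlDocument.SelectSingleNode. The names XPathAmbiguityDemo and Matches are made up for illustration. The first three trees all match, so nothing in the xpath tells you which one to build:

```csharp
using System;
using System.Xml;

public static class XPathAmbiguityDemo
{
    // Hypothetical helper: true when the xpath selects at least one node in the xml.
    public static bool Matches(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode(xpath) != null;
    }

    public static void Main()
    {
        string xpath = "/feed/entry/content[@source > 0]";

        // Three different trees, all selected by the same expression.
        Console.WriteLine(Matches("<feed><entry><content source='1' /></entry></feed>", xpath));
        Console.WriteLine(Matches("<feed><entry><content source='42' /></entry></feed>", xpath));
        Console.WriteLine(Matches("<feed><entry><data /><content source='0.5' extra='x' /></entry></feed>", xpath));

        // A tree whose attribute value does not satisfy the predicate.
        Console.WriteLine(Matches("<feed><entry><content source='0' /></entry></feed>", xpath));
    }
}
```

Picking source='1' as the "minimal" tree is a convention you would have to impose yourself; the expression does not imply it.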

Xpath predicates can contain pretty arbitrary arithmetic expressions, so the general solution to this is quite difficult, if not impossible. You could imagine a huge equation in there, and it would have to be solved in reverse to come up with values that would match the equation; but since there can be an infinite number of matching values (as long as it's really an inequality not an equation), a canonical solution would need to be found.

Many expressions of other forms also destroy information. For example, an operator like "or" always destroys information. If you know that (X or Y) == 1, you don't know if X is 1, Y is 1, or both of them are 1; all you know for sure is that at least one of them is 1! Therefore if you have an expression using OR, you cannot tell which of the nodes or values that are inputs to the OR should be 1 (you can make an arbitrary choice and set both to 1, as that will satisfy the expression for sure, as will the two choices in which only one of them is 1).

Now suppose there are several expressions in the xpath which refer to the same set of values. You then end up with a system of simultaneous equations or inequalities that can be virtually impossible to solve. Again, if you restrict the allowable xpath to a small subset of its full power, you can solve this problem. I suspect the fully general case is similar to the Turing halting problem, however; in this case, given an arbitrary program (the xpath), figure out a set of consistent data that matches the program, and is in some sense minimal.
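
A rough sketch of what such a restriction could look like, assuming you only want to support plain element steps, an optional [@name='value'] predicate per step, and an optional trailing /@attribute. The XPathSubset class and IsCreatable method are hypothetical names, and the regex is just one possible choice of subset:

```csharp
using System;
using System.Text.RegularExpressions;

public static class XPathSubset
{
    // Assumed subset: /a/b[@k='v']/@attr style paths only.
    // Each step is an element name with an optional equality predicate;
    // the path may end with a single attribute step.
    private static readonly Regex Supported = new Regex(
        @"^(/\w+(\[@\w+='[^']*'\])?)+(/@\w+)?$");

    // True when the xpath is simple enough that a unique minimal tree exists.
    public static bool IsCreatable(string xpath)
    {
        return !string.IsNullOrEmpty(xpath) && Supported.IsMatch(xpath);
    }

    public static void Main()
    {
        // Equality predicates only: accepted.
        Console.WriteLine(IsCreatable("/configuration/appSettings/add[@key='Server']/@value"));

        // Inequality and "or" leave the result ambiguous: rejected.
        Console.WriteLine(IsCreatable("/feed/entry/content[@source > 0]"));
        Console.WriteLine(IsCreatable("/feed/entry[@a='1' or @b='1']"));
    }
}
```

Anything outside that shape (inequalities, "or", functions, namespace prefixes, element names containing hyphens) is rejected up front rather than guessed at.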

Solution 4

Here is my version. Hope this also would help someone.

    public static void Main(string[] args)
    {

        XmlDocument doc = new XmlDocument();
        XmlNode rootNode = GenerateXPathXmlElements(doc, "/RootNode/FirstChild/SecondChild/ThirdChild");

        Console.Write(rootNode.OuterXml);

    }

    private static XmlDocument GenerateXPathXmlElements(XmlDocument xmlDocument, string xpath)
    {
        XmlNode parentNode = xmlDocument;

        if (xmlDocument != null && !string.IsNullOrEmpty(xpath))
        {
            string[] partsOfXPath = xpath.Split('/');


            string xPathSoFar = string.Empty;

            foreach (string xPathElement in partsOfXPath)
            {
                if(string.IsNullOrEmpty(xPathElement))
                    continue;

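                string elementName = xPathElement.Trim();
                xPathSoFar += "/" + elementName;

                XmlNode childNode = xmlDocument.SelectSingleNode(xPathSoFar);
                if(childNode == null)
                {
                    // only append nodes that were just created; re-appending an
                    // existing node would move it to the end of its parent
                    childNode = parentNode.AppendChild(xmlDocument.CreateElement(elementName));
                }

                parentNode = childNode;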
            }
        }

        return xmlDocument;
    }

Solution 5

The C# version of Mark Miller's Java solution

    /// <summary>
    /// Makes the X path. Use a format like //configuration/appSettings/add[@key='name']/@value
    /// </summary>
    /// <param name="doc">The doc.</param>
    /// <param name="xpath">The xpath.</param>
    /// <returns></returns>
    public static XmlNode createNodeFromXPath(XmlDocument doc, string xpath)
    {
        // Create a new Regex object
        Regex r = new Regex(@"/+([\w]+)(\[@([\w]+)='([^']*)'\])?|/@([\w]+)");

        // Find matches
        Match m = r.Match(xpath);

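        // Start at the document itself so a missing root element can be created
        // (doc.FirstChild is null for an empty document).
        XmlNode currentNode = doc;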
        StringBuilder currentPath = new StringBuilder();

        while (m.Success)
        {
            String currentXPath = m.Groups[0].Value;    // "/configuration" or "/appSettings" or "/add"
            String elementName = m.Groups[1].Value;     // "configuration" or "appSettings" or "add"
            String filterName = m.Groups[3].Value;      // "" or "key"
            String filterValue = m.Groups[4].Value;     // "" or "name"
            String attributeName = m.Groups[5].Value;   // "" or "value"

            StringBuilder builder = currentPath.Append(currentXPath);
            String relativePath = builder.ToString();
            XmlNode newNode = doc.SelectSingleNode(relativePath);

            if (newNode == null)
            {
                if (!string.IsNullOrEmpty(attributeName))
                {
                    ((XmlElement)currentNode).SetAttribute(attributeName, "");
                    newNode = doc.SelectSingleNode(relativePath);
                }
                else if (!string.IsNullOrEmpty(elementName))
                {
                    XmlElement element = doc.CreateElement(elementName);
                    if (!string.IsNullOrEmpty(filterName))
                    {
                        element.SetAttribute(filterName, filterValue);
                    }

                    currentNode.AppendChild(element);
                    newNode = element;
                }
                else
                {
                    throw new FormatException("The given xPath is not supported " + relativePath);
                }
            }

            currentNode = newNode;

            m = m.NextMatch();
        }

        // Assure that the node is found or created
        if (doc.SelectSingleNode(xpath) == null)
        {
            throw new FormatException("The given xPath cannot be created " + xpath);
        }

        return currentNode;
    }
Author by

Fred Strauss

Updated on July 09, 2022

Comments

  • Fred Strauss
    Fred Strauss almost 2 years

    Does anyone know of an existing means of creating an XML hierarchy programmatically from an XPath expression?

    For example if I have an XML fragment such as:

    <feed>
        <entry>
            <data></data>
            <content></content>
        </entry>
    </feed>
    

    Given the XPath expression /feed/entry/content/@source I would have:

    <feed>
        <entry>
            <data></data>
            <content source=""></content>
        </entry>
    </feed>
    

    I realize this is possible using XSLT but due to the dynamic nature of what I'm trying to accomplish a fixed transformation won't work.

    I am working in C# but if someone has a solution using some other language please chime in.

    Thanks for the help!