Modify XML file with xPath


Solution 1

If you want a solution without dependencies, you can do it with just DOM and without XPath/XSLT.

Node.getChildNodes, Node.getNodeName and the NodeList methods can be used to find the nodes, and Document.createElement, Document.createTextNode and Node.appendChild to create new ones.

Then you can write your own simple "XPath" interpreter that creates missing nodes in the path, like this:

public static void update(Document doc, String path, String def) {
  String[] p = path.split("/");
  // walk the path; search each node or create it if it does not exist
  Node n = doc;
  for (int i = 0; i < p.length; i++) {
    if (p[i].isEmpty()) continue; // skip the empty segment produced by a leading "/"
    NodeList kids = n.getChildNodes();
    Node nfound = null;
    for (int j = 0; j < kids.getLength(); j++) {
      if (kids.item(j).getNodeName().equals(p[i])) {
        nfound = kids.item(j);
        break;
      }
    }
    if (nfound == null) {
      nfound = doc.createElement(p[i]);
      n.appendChild(nfound);
      // add whitespace so the result looks nicer; not really needed
      // (skipped at document level, where text nodes are not allowed)
      if (n != doc) n.appendChild(doc.createTextNode("\n"));
    }
    n = nfound;
  }
  // if the target element already has a text node, overwrite the first one
  NodeList kids = n.getChildNodes();
  for (int i = 0; i < kids.getLength(); i++) {
    if (kids.item(i).getNodeType() == Node.TEXT_NODE) {
      kids.item(i).setNodeValue(def);
      return;
    }
  }
  // otherwise append a new text node
  n.appendChild(doc.createTextNode(def));
}

Then, if you only want to update text() nodes, you can use it as:

DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
Document doc = domFactory.newDocumentBuilder().parse(file.getAbsolutePath());

update(doc, "configuration/param1", "4.0");
update(doc, "configuration/param2", "asdf");
update(doc, "configuration/test/param3", "true");
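
As a rough end-to-end sketch of the same idea, the class below parses the sample XML from a string, applies the three updates and serializes the document again. The class name DomUpdateDemo and the helper run are made up for illustration, and its update is a shortened variant of the method above that uses setTextContent (which replaces all children of the target element) instead of overwriting the first text node.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomUpdateDemo {

    // Shortened variant of update(): walk the path, creating missing elements.
    static void update(Document doc, String path, String value) {
        Node n = doc;
        for (String name : path.split("/")) {
            if (name.isEmpty()) continue; // tolerate a leading "/"
            Node found = null;
            NodeList kids = n.getChildNodes();
            for (int j = 0; j < kids.getLength(); j++) {
                if (kids.item(j).getNodeName().equals(name)) {
                    found = kids.item(j);
                    break;
                }
            }
            if (found == null) {
                found = doc.createElement(name);
                n.appendChild(found);
            }
            n = found;
        }
        // Simplification: replaces ALL children of the target with one text node.
        n.setTextContent(value);
    }

    // Parses the XML string, applies the three updates, returns the serialized result.
    static String run(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        update(doc, "configuration/param1", "4.0");
        update(doc, "configuration/param2", "asdf");
        update(doc, "configuration/test/param3", "true");

        // A Transformer without a stylesheet performs an identity copy.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(
            "<configuration><param0>true</param0><param1>1.0</param1></configuration>"));
    }
}
```

To write back to a file instead, pass new StreamResult(file) to transform, as the code in the question does.")) throw new AssertionError(out);
</test>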

Solution 2

Here is a simple XSLT solution:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="param1/text()">4.0</xsl:template>

 <xsl:template match="/*">
  <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
     <param2>asdf</param2>
     <test><param3>true</param3></test>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<configuration>
    <param0>true</param0>
    <param1>1.0</param1>
</configuration>

the wanted, correct result is produced:

<configuration>
   <param0>true</param0>
   <param1>4.0</param1>
   <param2>asdf</param2>
   <test><param3>true</param3></test>
</configuration>
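
For readers who want to run the stylesheet from Java, here is a minimal sketch using the javax.xml.transform API that ships with the JDK. The class name XsltDemo and the helper transform are illustrative only; the stylesheet is the one above, written as a Java string with single-quoted attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltDemo {

    // The stylesheet from this answer, embedded as a string.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes' indent='yes'/>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='node()|@*'>"
        + "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='param1/text()'>4.0</xsl:template>"
        + "<xsl:template match='/*'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/>"
        + "<param2>asdf</param2><test><param3>true</param3></test>"
        + "</xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the new result tree as text.
    static String transform(String xml, String xslt) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration><param0>true</param0><param1>1.0</param1></configuration>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The exact indentation of the output depends on the XSLT processor in use.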

Do Note:

An XSLT transformation never "updates in-place". It always creates a new result tree. Therefore, if one wants to modify the same file, typically the result of the transformation is saved under another name, then the original file is deleted and the result is renamed to have the original name.
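
The save-then-rename step could be sketched in Java roughly as follows. This is only one way to do it: the class TransformInPlace, the method transformInPlace and the demo helper are hypothetical names, and the small stylesheet in the demo only changes param1.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformInPlace {

    // Writes the result to a temporary file next to the original,
    // then moves it over the original file.
    static void transformInPlace(Path xmlFile, Source stylesheet) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer(stylesheet);
        Path dir = xmlFile.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "xslt-", ".tmp");
        try {
            try (InputStream in = Files.newInputStream(xmlFile);
                 OutputStream out = Files.newOutputStream(tmp)) {
                t.transform(new StreamSource(in), new StreamResult(out));
            }
            // Both streams are closed here, so the move also works on Windows.
            Files.move(tmp, xmlFile, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    // Demo: create a sample file, transform it in place, return its new content.
    static String demo() throws Exception {
        String xslt =
              "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='node()|@*'>"
            + "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='param1/text()'>4.0</xsl:template>"
            + "</xsl:stylesheet>";
        Path dir = Files.createTempDirectory("xslt-demo");
        Path xml = dir.resolve("config.xml");
        Files.write(xml,
            "<configuration><param0>true</param0><param1>1.0</param1></configuration>"
                .getBytes(StandardCharsets.UTF_8));
        transformInPlace(xml, new StreamSource(new StringReader(xslt)));
        return new String(Files.readAllBytes(xml), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

If the transformation fails, the original file is left untouched because the move only happens after the result has been written completely.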

Solution 3

I've created a small project for using XPath to create/update XML: https://github.com/shenghai/xmodifier. The code to change your XML looks like this:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlfile);

XModifier modifier = new XModifier(document);
modifier.addModify("/configuration/param1", "4.0");
modifier.addModify("/configuration/param2", "asdf");
modifier.addModify("/configuration/test/param3", "true");
modifier.modify();

Solution 4

I would point you to a novel way of doing what you described: using VTD-XML. There are numerous reasons why VTD-XML is far better than the other solutions provided for this question. Here are a few links ...


import com.ximpleware.*;
import java.io.*;

public class modifyXML {
    public static void main(String[] s) throws VTDException, IOException {
        VTDGen vg = new VTDGen();
        if (!vg.parseFile("input.xml", false)) // false: parse without namespace awareness
            return;
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("/configuration/param1/text()");
        XMLModifier xm = new XMLModifier(vn);
        // using XPath: evalXPath returns -1 when nothing matches
        int i = ap.evalXPath();
        if (i != -1) {
            xm.updateToken(i, "4.0");
        }
        // fragment inserted after the current element
        String s1 = "<param2>asdf</param2>\n<test>\n<param3>true</param3>\n</test>";
        xm.insertAfterElement(s1);
        xm.output("output.xml");
    }
}

Author: brimborium

Updated on June 18, 2022

Comments

  • brimborium
    brimborium almost 2 years

    I want to modify an existing XML file using xPath. If the node doesn't exist, it should be created (along with its parents if necessary). An example:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <param0>true</param0>
      <param1>1.0</param1>
    </configuration>
    

    And here are a couple of xPaths I want to insert/modify:

    /configuration/param1/text()         -> 4.0
    /configuration/param2/text()         -> "asdf"
    /configuration/test/param3/text()    -> true
    

    The XML file should look like this afterwards:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
      <param0>true</param0>
      <param1>4.0</param1>
      <param2>asdf</param2>
      <test>
        <param3>true</param3>
      </test>
    </configuration>
    

    I tried this:

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    
    try {
      DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
      Document doc = domFactory.newDocumentBuilder().parse(file.getAbsolutePath());
      XPath xpath = XPathFactory.newInstance().newXPath();
    
      String xPathStr = "/configuration/param1/text()";
      Node node = ((NodeList) xpath.compile(xPathStr).evaluate(doc, XPathConstants.NODESET)).item(0);
      System.out.printf("node value: %s\n", node.getNodeValue());
      node.setNodeValue("4.0");
    
      TransformerFactory transformerFactory = TransformerFactory.newInstance();
      Transformer transformer = transformerFactory.newTransformer();
      transformer.transform(new DOMSource(doc), new StreamResult(file));
    } catch (Exception e) {
      e.printStackTrace();
    }
    

    The node is changed in the file after running this code. Exactly what I wanted. But if I use one of the below paths, node is null (and therefore a NullPointerException is thrown):

    /configuration/param2/text()
    /configuration/test/param3/text()
    

    How can I change this code so that the node (and non existing parent nodes as well) are created?

    EDIT: Ok, to clarify: I have a set of parameters that I want to save to XML. During development, this set can change (some parameters get added, some get moved, some get removed). So I basically want to have a function to write the current set of parameters to an already existing file. It should overwrite the parameters that already exist in the file, add new parameters and leave old parameters in there.

    The same for reading, I could just have the xPath or some other coordinates and get the value from the XML. If it doesn't exist, it returns the empty string.

    I don't have any constraints on how to implement it, xPath, DOM, SAX, XSLT... It should just be easy to use once the functionality is written (like BeniBela's solution).

    So if I have the following parameters to set:

    /configuration/param1/text()         -> 4.0
    /configuration/param2/text()         -> "asdf"
    /configuration/test/param3/text()    -> true
    

    the result should be the starting XML + those parameters. If they already exist at that xPath, they get replaced, otherwise they get inserted at that point.
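
The reading side described in the question (return the value at an xPath, or the empty string if nothing is there) can be sketched with the JDK's javax.xml.xpath API. XPath.evaluate(String, Object) returns the string value of the result, and the string value of an empty node set is the empty string. The class name XPathReadDemo and the helpers parse and read are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathReadDemo {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the string value at the given XPath, or "" if no node matches.
    static String read(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<configuration><param0>true</param0><param1>1.0</param1></configuration>");
        System.out.println("[" + read(doc, "/configuration/param1/text()") + "]"); // prints [1.0]
        System.out.println("[" + read(doc, "/configuration/param2/text()") + "]"); // prints []
    }
}
```

Because no NodeList is dereferenced, a missing node does not cause a NullPointerException here.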

  • brimborium
    brimborium over 11 years
    Thanks for the example. As far as I understand, all existing nodes are copied, then param1's value is changed to 4.0 and then param2/3 are inserted. But I still have to check if the node already exists and then either update or insert it, right?
  • brimborium
    brimborium over 11 years
    Hm, this just shifts my problem. I still have to create a file (it's just an XSLT file now) and have to check for every node whether it's already available in the given XML file. Maybe I don't get what makes this a simpler solution than using DOM?
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @brimborium, The XSLT code is just a few lines (maybe half the length of your current code). For more complex problems the XSLT code will be tens or even hundreds of times shorter. The XSLT code is also simpler (in most cases no explicit conditional instructions are necessary), extensible (due to templates and template matching), and maintainable (due to all of the previous facts). So, to summarize, one can clearly see the advantage of using XSLT when the problem is just a little bit more complex. Look at the problems asked in the "xslt" tag and try to solve them without XSLT :)
  • brimborium
    brimborium over 11 years
    +1 I can see that XSLT is very strong and I am very interested in using it. But how would you go about solving my problem? Because your solution will fail if the input XML is slightly changed (let's say, param2 or an empty test already exists, or param1 doesn't exist yet). Would you solve that issue within XSLT or at the creation of the XSLT?
  • brimborium
    brimborium over 11 years
    Also, for your argument about the size of the code: You would have to include the Java code to generate the XSLT code too, which makes the XSLT solution a lot bigger... (especially for all the recursive creation of nodes only when necessary).
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @brimborium, You don't have a well-defined problem. Please, edit the question and give us one or more XML documents and the exact wanted result each of them should be transformed to. If you don't miss a significant case, people will give you a solution that covers all the significant cases. If you fail to provide some significant case, then don't complain that people haven't been able to guess what was (not) in your head at that moment.
  • bdoughan
    bdoughan over 11 years
    The JDK/JRE contains XPath (javax.xml.xpath) and XSLT (javax.xml.transform) APIs, so those approaches don't introduce any dependencies.
  • BeniBela
    BeniBela over 11 years
    @BlaiseDoughan: but judging from his comments brimborium does not seem to like the APIs so much
  • brimborium
    brimborium over 11 years
    Thanks BeniBela. For me, that would be a clean solution and it does exactly what I want. You are right, I am not entirely convinced by XSLT, but that's maybe also because I don't seem to grasp the concept of it yet. I will also look into that.
  • brimborium
    brimborium over 11 years
    Ok, I will add a summary and a little background as well as some more examples tonight. Thanks for your time, it really opened up new possibilities for me. :)
  • brimborium
    brimborium over 11 years
    I edited the question (at the end) to clarify. I hope it is clear now, otherwise I will add some more examples.
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @brimborium, Neither XSLT 1.0 nor XSLT 2.0 has the capability of dynamic evaluation of a string that happens to contain a syntactically valid XPath expression. In XSLT 3.0 there will be a new instruction xsl:evaluate (w3.org/TR/xslt-30/#element-evaluate) doing exactly that, so your general problem can be solved in pure XSLT 3.0 -- and maybe an early implementation such as Saxon 9.4 already supports that.
  • brimborium
    brimborium over 11 years
    I see. I will definitely look into XSLT as it seems to be quite strong. I think I will go with BeniBela's solution for the moment until I've upgraded my XSLT knowledge. ;)
  • brimborium
    brimborium over 11 years
    I decided to use your solution and meanwhile upgrade my knowledge about XSLT. Because Dimitre's answer impressed me as well. ;) Thanks for your help.
  • brimborium
    brimborium over 9 years
    That looks very interesting, thank you for sharing this. I will have a look at it.
  • brimborium
    brimborium about 8 years
    Thanks for this alternative. Can you go into the reasons why this is better (apart from the fact that the code looks very clean to me, which I like)? The links are nice, but usually not the best way for SO as they tend to break over time.
  • vtd-xml-author
    vtd-xml-author about 8 years
    If I just list the reasons without any detailed explanation, it will sound like cheap spamming. Therefore I think it might help if you could at least skim over the third URL in the reference list above. DOM, SAX and StAX have a lot of things against them, as will become clear to you.