How to ignore namespace when selecting XML nodes with XPath


Use local-name() to match elements by name regardless of their namespace:

/*/*/*/*/*
        [local-name()='REPORT_DATA' 
       or 
         local-name()='REPORT_HEADER'
        ]
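As a rough illustration, the expression above can be evaluated as follows. This is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than the dom4j library from the question; the class name, the helper method, and the sample namespace URI are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of the original question or answer.
public class IgnoreNamespaceExample {

    // Matches any fifth-level element whose local name is REPORT_DATA or
    // REPORT_HEADER, whatever namespace it belongs to.
    static final String EXPRESSION =
            "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']";

    // Parses the XML string and returns the local names of the matched elements.
    static List<String> selectLocalNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness lets the parser separate prefix and local name.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xPath.evaluate(EXPRESSION, document, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getLocalName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The namespace URI here is a made-up placeholder; any URI would do.
        String xml =
                "<m:OASISReport xmlns:m='urn:example:any-namespace'>"
              + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
              + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
              + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
              + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";
        System.out.println(selectLocalNames(xml));
    }
}
```

Because the predicate tests only local-name(), no prefix-to-URI map has to be registered, so the same code should keep working if the document's namespace URI changes.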
Author: lukegf

BY DAY: Software developer dabbling in C# and AngularJS BY NIGHT: Indie game developer ninja

Updated on July 08, 2022

Comments

  • lukegf
    lukegf almost 2 years

    I have to parse an XML document that looks like this:

     <?xml version="1.0" encoding="UTF-8" ?> 
     <m:OASISReport xmlns:m="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" 
                    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    xsi:schemaLocation="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd">
      <m:MessagePayload>
       <m:RTO>
        <m:name>CAISO</m:name> 
        <m:REPORT_ITEM>
         <m:REPORT_HEADER>
          <m:SYSTEM>OASIS</m:SYSTEM> 
          <m:TZ>PPT</m:TZ> 
          <m:REPORT>AS_RESULTS</m:REPORT> 
          <m:MKT_TYPE>HASP</m:MKT_TYPE> 
          <m:UOM>MW</m:UOM> 
          <m:INTERVAL>ENDING</m:INTERVAL> 
          <m:SEC_PER_INTERVAL>3600</m:SEC_PER_INTERVAL> 
         </m:REPORT_HEADER>
         <m:REPORT_DATA>
          <m:DATA_ITEM>NS_PROC_MW</m:DATA_ITEM> 
          <m:RESOURCE_NAME>AS_SP26_EXP</m:RESOURCE_NAME> 
          <m:OPR_DATE>2010-11-17</m:OPR_DATE> 
          <m:INTERVAL_NUM>1</m:INTERVAL_NUM> 
          <m:VALUE>0</m:VALUE> 
         </m:REPORT_DATA>
    

    The problem is that the namespace URI "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" can sometimes be different. I want to ignore it completely and just get my data from the MessagePayload element and its descendants.

    The code I am using so far is:

    import java.io.File;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.dom4j.Document;
    import org.dom4j.DocumentHelper;
    import org.dom4j.XPath;
    import org.dom4j.io.SAXReader;

    // The statements below run inside a method; inputFileName is defined elsewhere.
    // Parallel arrays: namespaceAliases[i] is the prefix bound to namespaces[i].
    String[] namespaces = new String[1];
    String[] namespaceAliases = new String[1];

    namespaceAliases[0] = "ns0";
    namespaces[0] = "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd";

    File inputFile = new File(inputFileName);

    Map<String, String> namespaceURIs = new HashMap<>();

    // This query will return all of the ASR records.
    // Java string literals cannot span lines, so the path is concatenated.
    String xPathExpression = "/ns0:OASISReport"
                           + "/ns0:MessagePayload"
                           + "/ns0:RTO"
                           + "/ns0:REPORT_ITEM"
                           + "/ns0:REPORT_DATA";
    xPathExpression += "|/ns0:OASISReport"
                     + "/ns0:MessagePayload"
                     + "/ns0:RTO"
                     + "/ns0:REPORT_ITEM"
                     + "/ns0:REPORT_HEADER";

    // Load the raw XML file. Stripping whitespace-only text and merging
    // adjacent text nodes reduces the size of the DOM tree.
    SAXReader reader = new SAXReader();
    reader.setStripWhitespaceText(true);
    reader.setMergeAdjacentText(true);
    Document inputDocument = reader.read(inputFile);

    // Relate the aliases with the namespaces.
    if (namespaceAliases != null && namespaces != null)
    {
        for (int i = 0; i < namespaceAliases.length; i++)
        {
            namespaceURIs.put(namespaceAliases[i], namespaces[i]);
        }
    }

    // Compile the expression and bind the supplied namespaces to it.
    XPath xPath = DocumentHelper.createXPath(xPathExpression);
    xPath.setNamespaceURIs(namespaceURIs);

    List<?> asResultsNodes = xPath.selectNodes(inputDocument.getRootElement());
    

    It works fine if the namespace never changes, but that is obviously not the case. What do I need to do to make it ignore the namespace? Or, if I know the set of all possible namespace values, how can I pass them all to the XPath instance?
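    For the second part of the question (a known set of possible namespaces), one possible approach is to keep matching on local-name() and add a namespace-uri() test to the predicate. The sketch below assumes the JDK's javax.xml.xpath API instead of dom4j, a non-empty list of allowed URIs that contain no single quotes, and hypothetical class and method names.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of the original question.
public class KnownNamespacesExample {

    // Builds "(namespace-uri()='a' or namespace-uri()='b')" from the allowed URIs.
    // Assumes a non-empty list whose URIs contain no single quotes.
    static String namespacePredicate(List<String> allowedUris) {
        return allowedUris.stream()
                .map(uri -> "namespace-uri()='" + uri + "'")
                .collect(Collectors.joining(" or ", "(", ")"));
    }

    // Counts REPORT_DATA elements whose namespace is one of the allowed URIs.
    static int countReportData(String xml, List<String> allowedUris) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        String expression = "//*[local-name()='REPORT_DATA' and "
                + namespacePredicate(allowedUris) + "]";
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, document, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder namespace URIs, invented for this example.
        String xml = "<m:Report xmlns:m='urn:example:v1'><m:REPORT_DATA/></m:Report>";
        System.out.println(
                countReportData(xml, List.of("urn:example:v1", "urn:example:v2")));
    }
}
```

    Compared with ignoring the namespace entirely, this variant rejects elements that merely share a local name but come from an unexpected namespace.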