How to ignore namespace when selecting XML nodes with XPath
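Match on local-name(), which compares only the element name and ignores whatever namespace the element is in:

    /*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']

The five /* steps walk from the root element (OASISReport) down through MessagePayload, RTO and REPORT_ITEM to its children, without naming any of them, so no namespace prefix has to be declared.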
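As a rough illustration of how that expression behaves, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than the dom4j classes from the question. The class name LocalNameXPathDemo, the trimmed sample document and the urn:example:... namespace URIs are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LocalNameXPathDemo {

    // Trimmed-down stand-in for the report in the question.
    // The namespace URI is a placeholder; any URI should give the same result.
    static final String SAMPLE_XML =
        "<m:OASISReport xmlns:m=\"urn:example:any-namespace\">"
      + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
      + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
      + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
      + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";

    // Selects by local name only, so no prefix-to-URI mapping is needed.
    static final String EXPRESSION =
        "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']";

    static NodeList select(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Parse with namespace support so local names and URIs are tracked separately.
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(EXPRESSION, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList nodes = select(SAMPLE_XML);
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getLocalName());
        }
    }
}
```

With dom4j the same expression string can presumably be passed to DocumentHelper.createXPath(...) as in the question, without calling setNamespaceURIs, since the expression contains no prefixes.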
Author by
lukegf
Updated on July 08, 2022
I have to parse an XML document that looks like this:
    <?xml version="1.0" encoding="UTF-8" ?>
    <m:OASISReport xmlns:m="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd">
      <m:MessagePayload>
        <m:RTO>
          <m:name>CAISO</m:name>
          <m:REPORT_ITEM>
            <m:REPORT_HEADER>
              <m:SYSTEM>OASIS</m:SYSTEM>
              <m:TZ>PPT</m:TZ>
              <m:REPORT>AS_RESULTS</m:REPORT>
              <m:MKT_TYPE>HASP</m:MKT_TYPE>
              <m:UOM>MW</m:UOM>
              <m:INTERVAL>ENDING</m:INTERVAL>
              <m:SEC_PER_INTERVAL>3600</m:SEC_PER_INTERVAL>
            </m:REPORT_HEADER>
            <m:REPORT_DATA>
              <m:DATA_ITEM>NS_PROC_MW</m:DATA_ITEM>
              <m:RESOURCE_NAME>AS_SP26_EXP</m:RESOURCE_NAME>
              <m:OPR_DATE>2010-11-17</m:OPR_DATE>
              <m:INTERVAL_NUM>1</m:INTERVAL_NUM>
              <m:VALUE>0</m:VALUE>
            </m:REPORT_DATA>
The problem is that the namespace URI "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" can sometimes be different. I want to ignore it completely and just get my data from the MessagePayload element downward.
The code I am using so far is:
    String[] namespaces = new String[1];
    String[] namespaceAliases = new String[1];

    namespaceAliases[0] = "ns0";
    namespaces[0] = "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd";

    File inputFile = new File(inputFileName);

    Map namespaceURIs = new HashMap();

    // This query will return all of the ASR records.
    String xPathExpression = "/ns0:OASISReport /ns0:MessagePayload /ns0:RTO /ns0:REPORT_ITEM /ns0:REPORT_DATA";
    xPathExpression += "|/ns0:OASISReport /ns0:MessagePayload /ns0:RTO /ns0:REPORT_ITEM /ns0:REPORT_HEADER";

    // Load up the raw XML file. The parameters ignore whitespace and other
    // nonsense, which reduces DOM tree size.
    SAXReader reader = new SAXReader();
    reader.setStripWhitespaceText(true);
    reader.setMergeAdjacentText(true);
    Document inputDocument = reader.read(inputFile);

    // Relate the aliases with the namespaces
    if (namespaceAliases != null && namespaces != null) {
        for (int i = 0; i < namespaceAliases.length; i++) {
            namespaceURIs.put(namespaceAliases[i], namespaces[i]);
        }
    }

    // Cache the expression using the supplied namespaces.
    XPath xPath = DocumentHelper.createXPath(xPathExpression);
    xPath.setNamespaceURIs(namespaceURIs);

    List asResultsNodes = xPath.selectNodes(inputDocument.getRootElement());
It works fine if the namespace never changes, but that is obviously not the case. What do I need to do to make it ignore the namespace? Or, if I know the set of all possible namespace values, how can I pass them all to the XPath instance?
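One possible way to keep the prefixed expression without listing every namespace value up front is to read the namespace URI from the root element at run time and bind ns0 to whatever it turns out to be. The sketch below is not taken from the question's code: it uses the JDK's javax.xml.xpath API instead of dom4j, and the class name RootNamespaceXPathDemo, the sampleXml helper and the urn:example:... URIs are hypothetical. It also assumes the whole document uses a single namespace, declared on the root element.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RootNamespaceXPathDemo {

    // Same shape as the query in the question, still written with the ns0 alias.
    static final String EXPRESSION =
        "/ns0:OASISReport/ns0:MessagePayload/ns0:RTO/ns0:REPORT_ITEM/ns0:REPORT_DATA"
      + " | /ns0:OASISReport/ns0:MessagePayload/ns0:RTO/ns0:REPORT_ITEM/ns0:REPORT_HEADER";

    // Hypothetical helper: builds a trimmed report in the given namespace.
    static String sampleXml(String namespaceUri) {
        return "<m:OASISReport xmlns:m=\"" + namespaceUri + "\">"
             + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
             + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
             + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
             + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";
    }

    static NodeList select(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Assumption: the root element carries the namespace used by the whole report.
        final String rootNamespace = doc.getDocumentElement().getNamespaceURI();

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                // Bind the ns0 alias to whatever namespace this document actually uses.
                return "ns0".equals(prefix) ? rootNamespace : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return null; // reverse lookup is not needed for this sketch
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        return (NodeList) xPath.evaluate(EXPRESSION, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(select(sampleXml("urn:example:first")).getLength());
        System.out.println(select(sampleXml("urn:example:second")).getLength());
    }
}
```

The trade-off against the local-name() approach is that this version still checks that every step is in the document's namespace, at the cost of extra setup code.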