Java: how to locate an element via an XPath string on org.w3c.dom.Document
Try this:
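import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Obtain the Document somehow; it doesn't matter how
DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document doc = b.parse(new File("page.html"));
// Evaluate the XPath against the Document itself
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate("/html/body/p/div[3]/a",
        doc, XPathConstants.NODESET);
// Each match is a DOM node; here every match is an <a> element
for (int i = 0; i < nodes.getLength(); ++i) {
    Element e = (Element) nodes.item(i);
}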
With the following page.html file:
<html>
<head>
</head>
<body>
<p>
<div></div>
<div></div>
<div><a>link</a></div>
</p>
</body>
</html>
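For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It inlines the markup from page.html as a string so no file is needed; the class name XPathLookup and the helpers parse and find are hypothetical names introduced only for this illustration. It also assumes the page is well-formed XML, because DocumentBuilder is an XML parser and will reject ordinary tag-soup HTML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration; not part of any library.
final class XPathLookup {
    // Same structure as page.html, inlined (assumption: well-formed XML).
    static final String PAGE =
        "<html><head></head><body><p>"
        + "<div></div><div></div><div><a>link</a></div>"
        + "</p></body></html>";

    // Parse an XML string into an org.w3c.dom.Document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return b.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Evaluate an XPath expression against the Document and return all matches.
    static NodeList find(Document doc, String expression) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        return (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList nodes = find(parse(PAGE), "/html/body/p/div[3]/a");
        for (int i = 0; i < nodes.getLength(); ++i) {
            Element e = (Element) nodes.item(i);
            System.out.println(e.getTagName() + ": " + e.getTextContent()); // prints "a: link"
        }
    }
}
```

If you already hold a Document, only the find step is needed; the parsing is there just to make the sketch runnable on its own.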
Author: KJW
Updated on August 16, 2022
Comments
-
KJW almost 2 years
How do you quickly locate an element or elements via an XPath string on a given org.w3c.dom.Document? There seems to be no FindElementsByXpath() method. For example: /html/body/p/div[3]/a
I found that recursively iterating through all the child node levels is quite slow when there are a lot of elements with the same name. Any suggestions?
I cannot use any parser or library; it must work with the W3C DOM document only.
-
Tomasz Nurkiewicz about 13 years
In my code example, doc is of type org.w3c.dom.Document. If you already have an instance of Document, just use the last two lines of my code and that's it! P.S.: Why the downvote?
-
KJW about 13 years
This returns text. I need a DOM element or DOM elements.
-
Tomasz Nurkiewicz about 13 years
See my edit (introduction of the XPathConstants.NODESET parameter): it now returns a NodeList. Have a look at the other constants as well.
-
KJW about 13 years
Thank you, this is a great answer.
-
Sudip7 over 9 years
@Tomasz Nukiewicz, can you please look into my implementation? I know I am not the questioner and it's a different question, but I took the reference from your answer, so I hope you can help me: stackoverflow.com/questions/26389376/…
-
burcakulug about 9 years
I think you don't need to call doc.getDocumentElement(); you should be able to run the XPath on the org.w3c.dom.Document type directly.
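
Two points from the comments can be checked with a short sketch: what the other XPathConstants return, and whether it matters if the context is the Document or its root element. The class name XPathReturnTypes and the helpers parse and eval are hypothetical, and the markup is the same page inlined as well-formed XML. The two-argument evaluate(expression, item) overload returns a String, which is presumably (an assumption, since the first version of the answer is not shown here) why an earlier comment saw text instead of DOM nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration; not part of any library.
final class XPathReturnTypes {
    // Same structure as page.html, inlined (assumption: well-formed XML).
    static final String PAGE =
        "<html><head></head><body><p>"
        + "<div></div><div></div><div><a>link</a></div>"
        + "</p></body></html>";
    static final String LINK = "/html/body/p/div[3]/a";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Evaluate with an explicit return type; the caller casts the result.
    static Object eval(String expression, Object context, QName returnType) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        return xPath.evaluate(expression, context, returnType);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(PAGE);

        // No return type given: the result is the string value of the first match.
        String asText = XPathFactory.newInstance().newXPath().evaluate(LINK, doc);

        // NODE yields the first matching node (or null), NODESET yields all matches.
        Node firstMatch = (Node) eval(LINK, doc, XPathConstants.NODE);
        NodeList viaDocument = (NodeList) eval(LINK, doc, XPathConstants.NODESET);

        // NUMBER and BOOLEAN suit expressions such as count(...) and boolean(...).
        Double divCount = (Double) eval("count(/html/body/p/div)", doc, XPathConstants.NUMBER);
        Boolean hasLink = (Boolean) eval("boolean(//a)", doc, XPathConstants.BOOLEAN);

        // An absolute path starts at the document root whichever node is the context,
        // so passing the root element gives the same matches as passing the Document.
        NodeList viaRoot = (NodeList) eval(LINK, doc.getDocumentElement(), XPathConstants.NODESET);

        System.out.println(asText + " " + firstMatch.getNodeName() + " " + divCount
            + " " + hasLink + " " + (viaDocument.getLength() == viaRoot.getLength()));
    }
}
```

The choice of context node only matters for relative expressions; for an absolute path like the one in the question, passing the Document directly is enough.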