xpath to get Node containing text
Solution 1
Since you only need the text nodes that contain the text Yahoo, use the following XPath:
//text()[contains(., 'Yahoo')]
This returns every text node that contains Yahoo (the match is case-sensitive).
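To see which text nodes this expression picks, here is a minimal Python sketch that emulates it with the standard library. Python's xml.etree.ElementTree does not support text() or contains() in its XPath subset, so the helper text_nodes below is a hypothetical stand-in rather than an XPath engine; a full XPath 1.0 processor should be able to evaluate the expression directly. The sample XML is the one from the question.

```python
import xml.etree.ElementTree as ET

# Sample document taken from the question.
XML = (
    '<doc> <story> <content id="201009281450332423"> '
    '<ul>MSW NYNES NYPG1 DILMA</ul> '
    '<p> <k> Yahoo, made </k> it nice </p> '
    '<p> <author>-v-</author> </p> '
    '</content> </story> </doc>'
)


def text_nodes(el):
    """Yield (parent_element, text) for every text node under el, in document order.

    Hypothetical helper: ElementTree stores text nodes as .text (before the
    first child) and as .tail on each child (text following that child).
    """
    if el.text:
        yield el, el.text
    for child in el:
        yield from text_nodes(child)
        if child.tail:
            yield el, child.tail  # a child's tail belongs to the enclosing element


root = ET.fromstring(XML)

# Emulates //text()[contains(., 'Yahoo')] (case-sensitive substring test).
matches = [(parent.tag, text) for parent, text in text_nodes(root) if "Yahoo" in text]
print(matches)  # [('k', ' Yahoo, made ')]
```

Each match is paired with its parent element, so the parent of the matching text node (here <k>) is available as well.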
Solution 2
Your XML is malformed: </content></doc></story> should be </content></story></doc>.
Apart from that, the XPath you would want is
/doc/story/content//*[contains(., 'Yahoo')]
(select any descendant of <content> which contains the text "Yahoo"; this will select the <p>)
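As a rough check of what /doc/story/content//*[contains(., 'Yahoo')] selects, the sketch below emulates it in Python with the standard library. ElementTree's XPath subset has no contains(), so the string_value helper is a hypothetical stand-in for the XPath string-value of an element (the concatenation of all of its descendant text nodes). The XML is the updated sample from the question.

```python
import xml.etree.ElementTree as ET

# Updated sample document from the question (with the nested <k> tag).
XML = (
    '<doc> <story> <content id="201009281450332423"> '
    '<ul>MSW NYNES NYPG1 DILMA</ul> '
    '<p> <k> Yahoo, made </k> it nice </p> '
    '<p> <author>-v-</author> </p> '
    '</content> </story> </doc>'
)


def string_value(el):
    """Hypothetical helper: XPath string-value of an element."""
    return "".join(el.itertext())


root = ET.fromstring(XML)               # root is <doc>
content = root.find("story/content")    # corresponds to /doc/story/content

# Emulates //*[contains(., 'Yahoo')] relative to <content>:
# every descendant element whose string-value contains 'Yahoo'.
matches = [
    el.tag
    for el in content.iter()
    if el is not content and "Yahoo" in string_value(el)
]
print(matches)  # ['p', 'k']
```

With the updated XML, both <p> and the nested <k> match, because the string-value of <p> includes the text of <k>.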
Author by
Vjy
Updated on October 09, 2020
Comments
-
Vjy over 3 years
I tried to search for nodes containing the text 'Yahoo' under '/doc/story/content'. It returns the 'content' node, but I need the exact text node that contains 'Yahoo', or its parent.
<doc>
  <story>
    <content id="201009281450332423">
      <ul>MSW NYNES NYPG1 DILMA</ul>
      <p> <k> Yahoo, made </k> it nice </p>
      <p> <author>-v-</author> </p>
    </content>
  </story>
</doc>
XPath:
"/doc/story/content[contains(., 'Yahoo')]"
-
Vjy about 13 years
This works great if it's one level down. How can I make it work for multi-nested tags?
-
Waihon Yew about 13 years
@Vjy: I'm not sure what you mean. Can you give an example?
-
Vjy about 13 years
I updated the XML above with an additional tag, <k>. It should select the <k> tag instead of the <p> tag. This is just an example; the text node can be n levels deep.
-
Emiliano Poggi about 13 years
@Vjy: this does exactly what you asked for.
-
Jason S over 10 years
text() is a node test, not a string. contains() expects strings. See stackoverflow.com/a/9493870/695671. Your solution may appear to work, but I have a case with text nodes within text nodes, and in that case it fails.
-
Waihon Yew over 10 years
@JasonS: That situation did not cross my mind (how did you manage to do it? programmatically?). I have corrected the answer accordingly. Thank you for pointing that out, I feel I learned something new.
-
Jason S over 10 years
@Jon I did it as in your updated answer. I am getting content from text nodes in odt files using PHP SimpleXMLElement. The odt often has paragraphs with tabs and spaces represented like
<text:p>Hello<text:s/><text:tab/>Jon</text:p>
in which case searching using contains(text(),"Jon") will fail, but contains(.,"Jon") will work.
-
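The difference Jason S describes can be reproduced with a short Python sketch. In XPath 1.0, passing the node-set text() to contains() converts it to the string-value of its first node only, while . uses the full string-value of the element. ElementTree cannot evaluate contains(), so the two variables below emulate the two string conversions; the namespace URI is the OpenDocument text namespace, declared here only so that the fragment parses.

```python
import xml.etree.ElementTree as ET

# Namespace declaration added so the fragment from the comment is parseable.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML = f'<text:p xmlns:text="{TEXT_NS}">Hello<text:s/><text:tab/>Jon</text:p>'

p = ET.fromstring(XML)

# contains(text(), "Jon"): XPath 1.0 uses only the FIRST text node of <text:p>.
# In this fragment that first text node is p.text, i.e. "Hello".
first_text_node = p.text or ""

# contains(., "Jon"): uses the string-value of <text:p>,
# the concatenation of all descendant text nodes.
string_value = "".join(p.itertext())

via_text = "Jon" in first_text_node
via_dot = "Jon" in string_value

print(repr(first_text_node), via_text)  # 'Hello' False
print(repr(string_value), via_dot)      # 'HelloJon' True
```

The second text node, "Jon", sits after the empty <text:s/> and <text:tab/> elements, so only the string-value test sees it.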
Nakilon over 8 years
What is the difference between this answer and @Jon's?
-
Stefan Steiger about 7 years
Case insensitive:
//text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜÉÈÊÀÁÂÒÓÔÙÚÛÇÅÏÕÑŒ', 'abcdefghijklmnopqrstuvwxyzäöüéèêàáâòóôùúûçåïõñœ'), 'yahoo')]
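The translate() trick maps each character of the second argument to the character at the same position in the third argument before the contains() test runs. A minimal Python sketch of the same idea, using str.maketrans as a stand-in for XPath's translate() (the shortened alphabets and the contains_ci name are assumptions made for illustration):

```python
import string

# Shortened alphabets for illustration; both strings must have equal length
# for str.maketrans, one character mapped per position.
UPPER = string.ascii_uppercase + "ÄÖÜ"
LOWER = string.ascii_lowercase + "äöü"
TABLE = str.maketrans(UPPER, LOWER)


def contains_ci(text, needle_lower):
    """Hypothetical helper mirroring contains(translate(., UPPER, LOWER), needle).

    The needle must already be lower case, as 'yahoo' is in the XPath above.
    """
    return needle_lower in text.translate(TABLE)


print(contains_ci(" YAHOO, made ", "yahoo"))  # True
print(contains_ci("ÜBER Yahoo", "über"))      # True
print(contains_ci("Google", "yahoo"))         # False
```

Characters that are not listed in the mapping are left unchanged, which is why the XPath version spells out the accented letters it wants to fold.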