xpath to get Node containing text


Solution 1

Since you need only the text nodes that contain the text Yahoo, use the following XPath:

//text()[contains(., 'Yahoo')]

This should return every text node that contains Yahoo. The match is case-sensitive.
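To check the expression against the XML from the question, here is a minimal sketch using the XPath 1.0 engine that ships with the JDK (javax.xml.xpath). The class name TextNodeSearch and the helper findText are made up for this illustration. For the sample document it should report a single text node whose parent is <k>.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextNodeSearch {
    // Sample document from the question, with indentation between tags dropped.
    static final String XML =
        "<doc><story><content id=\"201009281450332423\">"
        + "<ul>MSW NYNES NYPG1 DILMA</ul>"
        + "<p> <k> Yahoo, made </k> it nice </p>"
        + "<p><author>-v-</author></p>"
        + "</content></story></doc>";

    // Hypothetical helper: evaluates an XPath that selects text nodes and
    // returns one "parentName|textValue" entry per match.
    static List<String> findText(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            Node text = hits.item(i);
            result.add(text.getParentNode().getNodeName() + "|" + text.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Select only the text nodes whose own value contains "Yahoo".
        for (String hit : findText(XML, "//text()[contains(., 'Yahoo')]")) {
            System.out.println(hit);
        }
    }
}
```

Because the result is the text node itself, its parent element is available through getParentNode(), which covers the "text node or its parent" requirement from the question.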

Solution 2

Your XML is malformed. </content></doc></story> should be </content></story></doc>.

Apart from that, the XPath you would want is

/doc/story/content//*[contains(., 'Yahoo')]

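(This selects any descendant of <content> whose string value contains "Yahoo". With the XML as now shown in the question, that is the first <p> and also its nested <k>, because an element's string value includes the text of all its descendants.)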
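The following sketch runs this expression on the question's XML with the JDK's built-in XPath 1.0 engine. It also tries one possible narrowing, which is an assumption of this example rather than part of the answer: requiring the element to have a matching text node as a direct child, so that only the innermost element is selected. The names DescendantSearch and elementNames are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DescendantSearch {
    // Sample document from the question, with indentation between tags dropped.
    static final String XML =
        "<doc><story><content id=\"201009281450332423\">"
        + "<ul>MSW NYNES NYPG1 DILMA</ul>"
        + "<p> <k> Yahoo, made </k> it nice </p>"
        + "<p><author>-v-</author></p>"
        + "</content></story></doc>";

    // Hypothetical helper: returns the tag names of the elements an XPath selects.
    static List<String> elementNames(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            names.add(hits.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Every descendant whose string value contains "Yahoo": <p> and <k>.
        System.out.println(elementNames(XML,
                "/doc/story/content//*[contains(., 'Yahoo')]"));
        // Only elements with a matching text node as a direct child: <k>.
        System.out.println(elementNames(XML,
                "/doc/story/content//*[text()[contains(., 'Yahoo')]]"));
    }
}
```

The second expression tests each child text node separately, so it keeps working when the matching text sits any number of levels below <content>.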

Author: Vjy

Updated on October 09, 2020

Comments

  • Vjy
    Vjy over 3 years

    I searched for nodes containing the text 'Yahoo' under '/doc/story/content'. It returns the 'content' node, but I need the exact text node that contains 'Yahoo', or its parent.

    <doc>
        <story>
            <content id="201009281450332423">
                <ul>MSW NYNES NYPG1 DILMA</ul>
                <p> <k> Yahoo, made </k> it nice </p>
                <p>
                   <author>-v-</author>
                </p>
            </content>
        </story>
    </doc>
    

    Xpath: "/doc/story/content[contains(., 'Yahoo')]"

  • Vjy
    Vjy about 13 years
    This works great if it's one level down. How can I make it work for multi-nested tags?
  • Waihon Yew
    Waihon Yew about 13 years
    @Vjy: I'm not sure what you mean. Can you give an example?
  • Vjy
    Vjy about 13 years
    I updated the XML above with an additional tag <k>. It should select the <k> tag instead of the <p> tag. This is just an example; the text node can be n levels deep.
  • Emiliano Poggi
    Emiliano Poggi about 13 years
    @Vjy: this does exactly what you asked for.
  • Jason S
    Jason S over 10 years
    text() is a node test not a string. contains() expects strings. See stackoverflow.com/a/9493870/695671 Your solution may appear to work, but I have a case with text nodes within text nodes in which case it fails.
  • Waihon Yew
    Waihon Yew over 10 years
    @JasonS: That situation did not cross my mind (how did you manage to do it? programmatically?). I have corrected the answer accordingly. Thank you for pointing that out, I feel I learned something new.
  • Jason S
    Jason S over 10 years
    @Jon I did it as in your updated answer. I am getting content from text nodes in ODT files using PHP SimpleXMLElement. ODT often has paragraphs with tabs and spaces represented like <text:p>Hello<text:s/><text:tab/>Jon</text:p>, in which case searching with contains(text(),"Jon") fails, but contains(.,"Jon") works.
  • Nakilon
    Nakilon over 8 years
    What is the difference between this answer and @Jon's?
  • Stefan Steiger
    Stefan Steiger about 7 years
    Case insensitive: //text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜÉÈÊÀÁÂÒÓÔÙÚÛÇÅÏÕÑŒ', 'abcdefghijklmnopqrstuvwxyzäöüéèêàáâòóôùúûçåïõñœ'),'yahoo')]
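
Two points from the comments can be checked with a short sketch: in XPath 1.0, contains(text(), ...) only looks at the first text node child, and translate() can be used for a case-insensitive match. The sample documents below are invented for this illustration (the ODT namespaces are dropped, and the translate() alphabet is cut down to A-Z), as are the names CommentChecks and count. The code uses the JDK's built-in javax.xml.xpath engine, which implements XPath 1.0.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CommentChecks {
    // Hypothetical, namespace-free stand-in for the ODT paragraph in the comments.
    static final String SPLIT_TEXT = "<doc><p>Hello<s/><tab/>Jon</p></doc>";
    // Hypothetical document with mixed-case spellings.
    static final String MIXED_CASE =
        "<doc><p>YAHOO news</p><p>yahoo mail</p><p>Google</p></doc>";

    static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: number of nodes an XPath expression selects.
    static int count(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // text() yields two text nodes here; contains() converts the node-set
        // to a string using only the first one ("Hello"), so nothing matches.
        System.out.println(count(SPLIT_TEXT, "//p[contains(text(), 'Jon')]"));
        // "." is the full string value of <p> ("HelloJon"), so <p> matches.
        System.out.println(count(SPLIT_TEXT, "//p[contains(., 'Jon')]"));

        // Case-sensitive search finds neither "YAHOO" nor "yahoo".
        System.out.println(count(MIXED_CASE, "//text()[contains(., 'Yahoo')]"));
        // Lower-casing with translate() first matches both spellings.
        System.out.println(count(MIXED_CASE,
                "//text()[contains(translate(., '" + UPPER + "', '" + LOWER + "'), 'yahoo')]"));
    }
}
```

translate() maps characters one for one, so any accented letters that should be folded have to be listed in both strings, as in the longer alphabet quoted in the last comment.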