How to use XPath contains() for specific text?

Solution 1

Be careful with the contains() function.

It is a common mistake to use it to test if an element contains a value. What it really does is test if a string contains a substring. So, td[contains(.,'8')] takes the string value of td (.) and tests if it contains any '8' substrings. This might be what you want, but often it is not.
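To make the difference concrete, here is a minimal Python sketch. It assumes a table shaped like the one in the question (the markup is an assumption). The standard library's ElementTree does not support contains(), so the substring test is emulated in plain Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Assumed markup modeled on the table in the question.
TABLE = """
<table>
  <tr><td>2</td><td>1</td><td>28</td><td>9</td></tr>
  <tr><td>3</td><td>8</td><td>5</td><td>10</td></tr>
  <tr><td>18</td><td>9</td><td>8</td><td>0</td></tr>
</table>
"""


def string_value(elem):
    """Concatenate all descendant text, like the XPath string-value of an element."""
    return "".join(elem.itertext())


values = [string_value(td) for td in ET.fromstring(TABLE).iter("td")]

# Emulates td[contains(.,'8')]: a substring test on the string-value.
substring_hits = [v for v in values if "8" in v]

# Emulates td[.='8']: the whole string-value must equal '8'.
exact_hits = [v for v in values if v == "8"]

print(substring_hits)  # ['28', '8', '18', '8']
print(exact_hits)      # ['8', '8']
```

The substring test picks up 28 and 18 along with the two cells that hold only 8, which is the unwanted behavior described in the question.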

This XPath,

//td[.='8']

will select all td elements whose string-value equals 8.

Alternatively, this XPath,

//td[normalize-space()='8']

will select all td elements whose normalize-space() string-value equals 8. (The normalize-space() XPath function strips leading and trailing whitespace and replaces sequences of whitespace characters with a single space.)

Notes:

  • Both will work even if the 8 is inside another element such as an a, b, span, div, etc.
  • Neither will match <td>gr8t</td>, <td>123456789</td>, etc.
  • Using normalize-space() will ignore leading or trailing whitespace surrounding the 8.
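These notes can be checked with a short Python sketch. The sample cells are invented for illustration. The [.='8'] predicate is run through the standard library's ElementTree (Python 3.7 or later), which supports it; normalize-space() is not available there, so it is approximated by a hypothetical helper.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample cells: plain, nested markup, padded, and two near-misses.
ROW = ET.fromstring(
    "<tr>"
    "<td>8</td>"
    "<td><a><b>8</b></a></td>"
    "<td> 8\n</td>"
    "<td>gr8t</td>"
    "<td>123456789</td>"
    "</tr>"
)


def normalize_space(text):
    """Hypothetical stand-in for XPath normalize-space()."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


# ElementTree compares [.='8'] against the full text content, descendants included.
exact = ROW.findall("./td[.='8']")

# Emulates td[normalize-space()='8'] in plain Python.
normalized = [
    td for td in ROW.iter("td")
    if normalize_space("".join(td.itertext())) == "8"
]

print(len(exact))       # 2
print(len(normalized))  # 3
```

The padded cell is the only one where the two tests disagree: it fails the strict comparison but passes once whitespace is normalized.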

Solution 2

Try the following XPath, which compares against the whole text of the cell rather than matching a substring:

//table//td[text()='8']

Edit: Your example HTML has a elements inside the td elements, so the following should work:

//table//td/a[text()="8"]

See an example in PHP here: https://3v4l.org/56SBn
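One caveat with text(): it selects only the text nodes that are direct children of the element, so td[text()='8'] does not match a cell whose 8 sits inside an a element. That is why the second expression tests td/a instead. Below is a rough Python emulation, assuming invented sample cells and approximating text() with ElementTree's .text and .tail attributes.

```python
import xml.etree.ElementTree as ET

# Invented sample cells: bare text, a link holding 8, a link holding 18.
ROW = ET.fromstring("<tr><td>8</td><td><a>8</a></td><td><a>18</a></td></tr>")


def direct_text_nodes(elem):
    """Text held directly by elem: its leading text plus each child's tail."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


# Emulates //td[text()='8']: only direct text nodes are compared.
td_hits = [td for td in ROW.iter("td") if "8" in direct_text_nodes(td)]

# Emulates //td/a[text()='8']: the test moves down to the a element.
a_hits = [a for a in ROW.findall("./td/a") if "8" in direct_text_nodes(a)]

print(len(td_hits))  # 1
print(len(a_hits))   # 1
```

Only the bare cell matches the td test, and only the link holding exactly 8 matches the td/a test.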

Author by

MasterJoe

Updated on July 09, 2022

Comments

  • MasterJoe almost 2 years

    Say we have an HTML table which basically looks like this:

    2|1|28|9|
    3|8|5|10|
    18|9|8|0|
    

    I want to select the cells which contain only 8 and nothing else, that is, only the 2nd cell of row 2 and the 3rd cell of row 3.

    This is what I tried: //table//td[contains(.,'8')]. It gives me all cells which contain 8. So, I get unwanted values 28 and 18 as well.

    How do I fix this?

    EDIT: Here is a sample table if you want to try your XPath. Use the calendar on the left side: https://sfbay.craigslist.org/sfc/

  • MasterJoe over 7 years
    This does not work. Please try the xpath in the link I have now added to the question. Thanks.