XPath Query: get attribute href from a tag
Solution 1
For the following HTML document:
<html>
<body>
<a href="http://www.example.com">Example</a>
<a href="http://www.stackoverflow.com">SO</a>
</body>
</html>
The XPath query /html/body//a/@href (or simply //a/@href) will return:
http://www.example.com
http://www.stackoverflow.com
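For readers who want to try this outside an XPath console, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. This is an illustrative choice, not part of the original answer. ElementTree supports only a subset of XPath and cannot evaluate the trailing /@href step, so the sketch selects the a elements that carry an href and reads the attribute in Python. It also assumes the markup is well-formed enough for an XML parser, which holds for the sample document.

```python
import xml.etree.ElementTree as ET

# The sample document from the answer (well-formed, so an XML parser accepts it).
HTML = """<html>
<body>
<a href="http://www.example.com">Example</a>
<a href="http://www.stackoverflow.com">SO</a>
</body>
</html>"""

root = ET.fromstring(HTML)

# ElementTree has no attribute-selection step, so pick every <a> that has
# an href attribute and read the attribute value with .get().
hrefs = [a.get("href") for a in root.findall(".//a[@href]")]

print(hrefs)
# ['http://www.example.com', 'http://www.stackoverflow.com']
```

A full XPath 1.0 engine would return the attribute nodes directly from //a/@href; the list comprehension stands in for that last step.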
To select a specific instance, use /html/body//a[N]/@href, for example:
$ /html/body//a[2]/@href
http://www.stackoverflow.com
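One caveat worth checking: in XPath, //a[2] matches every a that is the second a child of its own parent. That coincides with "the second link in the document" only when all links are siblings, as they are in the sample above. The sketch below uses Python's standard-library ElementTree and a hypothetical document (the p wrappers and the URLs are made up for illustration) to show the difference.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the links are NOT all siblings.
NESTED = """<html><body>
<p><a href="http://www.example.com/1">one</a></p>
<p><a href="http://www.example.com/2">two</a><a href="http://www.example.com/3">three</a></p>
</body></html>"""

root = ET.fromstring(NESTED)

# Position predicate: the second <a> among the <a> children of the same parent.
second_in_parent = [a.get("href") for a in root.findall(".//a[2]")]

# Second <a> in document order: index the full result list instead.
# (Full XPath 1.0 engines can express this as (//a)[2].)
second_in_document = root.findall(".//a")[1].get("href")

print(second_in_parent)    # ['http://www.example.com/3']
print(second_in_document)  # http://www.example.com/2
```

If the links you care about may sit under different parents, the parenthesized form (//a)[2] is usually what you want.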
To test for strings contained in the attribute and return the attribute itself, place the check on the tag, not on the attribute:
$ /html/body//a[contains(@href,'example')]/@href
http://www.example.com
Mixing the two:
$ /html/body//a[contains(@href,'com')][2]/@href
http://www.stackoverflow.com
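The same filter-then-index logic can be sketched in Python with the standard-library ElementTree. ElementTree does not implement contains(), so the substring test is done in Python instead of in the path expression; the helper name hrefs_containing is hypothetical. Indexing the filtered list matches the XPath above only because all links in the sample share one parent.

```python
import xml.etree.ElementTree as ET

# The sample document from the answer.
HTML = """<html>
<body>
<a href="http://www.example.com">Example</a>
<a href="http://www.stackoverflow.com">SO</a>
</body>
</html>"""

root = ET.fromstring(HTML)


def hrefs_containing(tree_root, needle):
    """Return the href of every <a> whose href contains `needle`."""
    return [
        a.get("href")
        for a in tree_root.findall(".//a[@href]")
        if needle in a.get("href")
    ]


# Equivalent of //a[contains(@href,'example')]/@href
example_links = hrefs_containing(root, "example")

# Equivalent of //a[contains(@href,'com')][2]/@href for this document.
# XPath positions start at 1, Python list indexes at 0.
second_com_link = hrefs_containing(root, "com")[1]

print(example_links)    # ['http://www.example.com']
print(second_com_link)  # http://www.stackoverflow.com
```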
Solution 2
The answer shared by @mockinterface is correct, although I would like to add my 2 cents to it.
If you are using a framework like Scrapy, you will have to use /html/body//a[contains(@href,'com')][2]/@href along with get(), like this:
response.xpath("//a[contains(@href,'com')][2]/@href").get()
user3239713
Updated on January 05, 2020
Comments
-
user3239713 over 4 years
I want to use XPath to get the href attribute from an a-tag, but it has two occurrences within the same file. How should I go about this? I need to check IF there is an href attribute with value $street/object. I have got this code and it does not work:
$product_photo = $xpath->query("//a[contains(@href,'{$object_street}fotos/')][1]");
$product_360 = $xpath->query("//a[contains(@href,'{$object_street}360-fotos/')][1]");
$product_blueprint = $xpath->query("//a[contains(@href,'{$object_street}plattegrond/')][1]");
$product_video = $xpath->query("//a[contains(@href,'{$object_street}video/')][1]");
It does not return anything at all. Who can help me out?
-
user3239713 over 10 years
EDIT: How could I check for a specific href attribute? Shall I then use /html/body//a[1]/@href='{$object_street}/x'?
-
user3239713 over 10 years
Thank you a lot for the effort! Unfortunately, I am still having trouble; I suppose it is not the query that is wrong. Do you mind taking a look at the procedural code for me and putting me on the right track? Because, if so, I will post the code.
-
mockinterface over 10 years
Make sure your query evaluates the {$object_street} properly. Maybe put it in a string first, as in "string s = //a[contains(@href,'{$object_street}fotos/')][1]/@href", and check that s looks all right.
-
user3239713 over 10 years
I have put my question here, but nobody is responding to it. So maybe you could take a look at it, please?
-
user3239713 over 10 years
Oh, I am sorry for not including the link: stackoverflow.com/questions/21406694/…
-
mockinterface over 10 years
Apologies, I am not well versed enough in PHP to comment on your problem. The question looks too long to me though; maybe you could distill it to a small sample HTML (as in my example) and the essence of the PHP code that fails? It will make it easier for SO users to read and answer.
-
user3239713 over 10 years
The problem is that I am not sure about where the code fails, whether it is about a conditional, or the XPath query or something else, haha. So I find it hard to distill it.
-
Jeú Casulo over 5 years
It is returning an array, not the specific string value.
-
chovy over 2 years
For some reason I don't get the URL back; I get <link href="http://example.com" /> instead of <link>{$link}</link>