Scrapy grab div with multiple classes?
Solution 1
You should consider using a CSS selector for this part of your query.
http://doc.scrapy.org/en/latest/topics/selectors.html#when-querying-by-class-consider-using-css
from scrapy import Selector

sel = Selector(text='<div class="product product-small">I am a product!</div>')
# .product matches any element whose class list includes "product"
print(sel.css('.product').extract())
If you need to, you can chain CSS and XPath selectors, as in the example on that page.
Solution 2
This could also be solved with XPath. You just need to use contains():
//div[contains(concat(' ', normalize-space(@class), ' '), ' product ')]
Though, yes, the CSS selector option is more compact and readable.
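To see why the space-padded contains() test matches whole class names rather than substrings, here is a minimal sketch of the same logic in plain Python. The helper name has_class_token is hypothetical and is not part of Scrapy; it only approximates what the XPath expression does with the class attribute.

```python
def has_class_token(class_attr, token):
    """Approximate contains(concat(' ', normalize-space(@class), ' '), ' token ')."""
    # normalize-space(): trim the value and collapse whitespace runs to single spaces
    normalized = " ".join(class_attr.split())
    # concat(' ', ..., ' '): pad so that every class name is surrounded by spaces
    padded = " " + normalized + " "
    # contains(..., ' token '): the token only matches as a whole class name
    return (" " + token + " ") in padded


# Compare the whole-name test with a plain substring test,
# which is roughly what contains(@class, 'product') would do.
for value in ["product", "product product-small", "not-a-product"]:
    print(value, has_class_token(value, "product"), "product" in value)
```

The first two values pass both tests. The value not-a-product fails the whole-name test but passes the substring test, which is why the padding is needed.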
Author: user1835351
Updated on June 05, 2022
Comments
- user1835351 (almost 2 years): I am trying to grab divs with the class 'product'. The problem is that some of the divs with class 'product' also have the class 'product-small'. So when I use xpath("//div[@class='product']"), it only captures the divs with one class and not those with multiple classes. How can I do this with Scrapy?
  Example:
  - Catches: <div class='product'>
  - Doesn't catch: <div class='product product-small'>
- Capi Etheriel (almost 9 years): Your XPath selector would also pick up elements with the not-a-product class.
- alecxe (almost 9 years): @barraponto Yes, but the input to the current problem doesn't contain elements with the not-a-product class. Thanks.
- oschlueter (almost 8 years): The selector has been edited to perform exact matching of class names (cf. doc.scrapy.org/en/1.1/topics/…)
- sherlock (about 4 years): I have seen this doc, but I do not think it's useful, because the content in the div is dynamically loaded. What can we do in this situation?