Python Scrapy can't extract text from class

CSS selectors don't normally have syntax to extract text content.

But Scrapy extends CSS selectors with the ::text pseudo-element, so you want to use cam.css('::text').extract(), which should give you the same result as cam.xpath('.//text()').extract(). A bare 'text' selector looks for a <text> element, which is why it matches nothing here.

Note: Scrapy also adds the ::attr(attribute_name) functional pseudo-element to extract attribute values (that's also not possible with standard CSS selectors).

Author by

buly

Updated on June 18, 2022

Comments

  • buly
    buly almost 2 years

    Please look at this HTML code:

    <header class="online">
        <img src="http://static.flv.com/themes/h5/img/iconos/online.png"> <span>online</span>
        <img src="http://static.flv.com/themes/h5/img/iconos/ojo16.png"> 428
        <p>xxfantasia</p>
    </header>
    

    I want to get the text inside (428, in this case). I used this:

            def parse(self, response):
                sel = Selector(response)
                cams = sel.css('header.online')
                for cam in cams:
                    print(cam.css('text').extract())
    

    I think I have used the correct CSS selector, but I got an empty result.

    Any help?