Using BeautifulSoup to select div blocks within HTML

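The correct use is to pass the tag name and the class as separate arguments. The class is matched by name, so the trailing space from the HTML attribute is not needed:

soup.find_all('div', class_="crBlock")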

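Beautiful Soup returns each matching tag in full, including everything nested inside it up to its closing tag. find_all() gives you a list of these tags; store the result in a variable and you can search or modify each one further. If you are only looking for one div, you can use find() instead, which returns the first match (or None if there is none). For instance: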

div = soup.find('div', class_="crBlock")
print(div.find_all(string='foobar'))  # string= was called text= before Beautiful Soup 4.4
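
As a small self-contained illustration of the two calls above, here is a sketch that runs against a made-up HTML fragment. The markup, the "inner" class and the "foobar" text are invented for the example and are not taken from the site in the question. It assumes a reasonably recent Beautiful Soup 4 and the built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div class="crBlock ">
  <p>foobar</p>
  <div class="inner">nested</div>
</div>
<div class="other">ignored</div>
<div class="crBlock ">second block</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a list of every matching Tag.
blocks = soup.find_all("div", class_="crBlock")
print(len(blocks))

# find() returns only the first match, or None if nothing matches.
first = soup.find("div", class_="crBlock")
if first is not None:
    # The returned Tag carries its whole subtree, so nested
    # elements can be searched from it directly.
    print(first.find_all(string="foobar"))
    print(first.find("div", class_="inner").get_text())
```

Because each result is a complete Tag, str(first) would give the block's markup from the opening div through to its matching closing tag, nested divs included.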

Check out the documentation page for more info on all the filters you can use.
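
To give a rough idea of what those filters look like, here is a sketch using a few of them on an invented fragment: a regular expression, a list of tag names, a function, and a CSS selector through select(). The class names "crHeader" and "extra" and the helper has_extra are made up for the example.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, not from the site in the question.
html = """
<div class="crBlock ">A</div>
<div class="crBlock extra">B</div>
<div class="crHeader">C</div>
<span class="crBlock">D</span>
"""
soup = BeautifulSoup(html, "html.parser")

# Regular expression: matches divs with any class name starting with "cr".
by_regex = soup.find_all("div", class_=re.compile(r"^cr"))

# List of tag names: matches either <div> or <span> with class crBlock.
by_list = soup.find_all(["div", "span"], class_="crBlock")

# Function: receives each Tag and returns True to keep it.
def has_extra(tag):
    return tag.has_attr("class") and "extra" in tag["class"]

by_func = soup.find_all(has_extra)

# CSS selector: select() takes the same syntax as a stylesheet.
by_css = soup.select("div.crBlock")

for label, tags in [("regex", by_regex), ("list", by_list),
                    ("function", by_func), ("css", by_css)]:
    print(label, [t.get_text() for t in tags])
```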

Author by

SMNALLY

Student, trying to learn to code. Damn, this forum can be hostile if you're a subpar coder. Please be nice; everyone has to start somewhere.

Updated on June 04, 2022

Comments

  • SMNALLY almost 2 years

    I am trying to parse several div blocks from a website's HTML using Beautiful Soup. However, I cannot work out which function should be used to select these div blocks. I have tried the following:

    import urllib2
    from bs4 import BeautifulSoup
    
    def getData():
    
        html = urllib2.urlopen("http://www.racingpost.com/horses2/results/home.sd?r_date=2013-09-22", timeout=10).read().decode('UTF-8')
    
        soup = BeautifulSoup(html)
    
        print(soup.title)
        print(soup.find_all('<div class="crBlock ">'))
    
    getData()
    

    I want to be able to select everything between <div class="crBlock "> and its matching closing </div>. (There are other div tags nested inside, but I want to select the block all the way down to the closing tag that ends this section of the HTML.)