Using BeautifulSoup to select div blocks within HTML

python html python-2.7 beautifulsoup urllib2

13,253

The correct use would be:

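soup.find_all('div', class_="crBlock")

find_all() takes a tag name and attribute filters, not a string of raw markup. Beautiful Soup splits the class attribute on whitespace, so match on the bare class name without the trailing space. By default, Beautiful Soup returns the entire tag, including its contents (nested div tags too), so you can store the result in a variable and search or modify it further. If you are only looking for one div, you can use find() instead. For instance:

div = soup.find('div', class_="crBlock")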
print(div.find_all(text='foobar'))
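
As a rough illustration of the difference between the original attempt and the tag-name form, here is a minimal sketch that runs against a small inline document instead of the live page. The sample HTML, the `inner` class, and the variable names are made up for the example; it assumes Beautiful Soup 4 on Python 3 with the built-in `html.parser`, and it uses `string=`, the current name for the older `text=` argument.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the crBlock class comes from the question.
SAMPLE_HTML = """
<div class="crBlock ">
  <h3>Race one</h3>
  <div class="inner"><p>foobar</p></div>
</div>
<div class="crBlock ">
  <h3>Race two</h3>
</div>
<div class="other">unrelated</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The original attempt: raw markup is treated as a tag *name*, so nothing matches.
wrong = soup.find_all('<div class="crBlock ">')

# Tag name plus class filter. The class attribute is split on whitespace,
# so the bare name "crBlock" is what gets compared.
blocks = soup.find_all("div", class_="crBlock")

# find() returns only the first match (or None if there is none).
first = soup.find("div", class_="crBlock")

# The returned tag carries its whole subtree, including nested divs.
inner_divs = first.find_all("div")
matches = first.find_all(string="foobar")

print(len(wrong), len(blocks), len(inner_divs), len(matches))
```

Because each result is a full `Tag`, there is no need to locate the matching closing `</div>` by hand; the parser has already paired the tags.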

See the Beautiful Soup documentation for more on the filters that find() and find_all() accept.
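
To give a feel for what those filters look like, the sketch below tries a few of them on a tiny inline document. The markup and the `crHead` and `note` class names are invented for the example; it assumes Beautiful Soup 4.7 or later so that `select()` is backed by the soupsieve package.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup; only the crBlock class comes from the question.
html = (
    '<div class="crBlock ">a</div>'
    '<div class="crHead">b</div>'
    '<p class="note">c</p>'
)
soup = BeautifulSoup(html, "html.parser")

# Regular expression filter: any div with a class starting with "cr".
by_regex = soup.find_all("div", class_=re.compile(r"^cr"))

# List filter: match several tag names at once.
by_list = soup.find_all(["div", "p"])

# Function filter: full control over what counts as a match.
by_func = soup.find_all(
    lambda tag: tag.name == "div" and "crBlock" in tag.get("class", [])
)

# CSS selector, as an alternative to find_all().
by_css = soup.select("div.crBlock")

print(len(by_regex), len(by_list), len(by_func), len(by_css))
```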


Author by

SMNALLY

Student, trying to learn to code. Damn, this forum can be hostile if you're a subpar coder. Please be nice; everyone has to start somewhere.

Updated on June 04, 2022

Comments

SMNALLY almost 2 years
I am trying to parse several div blocks from a website's HTML using Beautiful Soup. However, I cannot work out which function should be used to select these div blocks. I have tried the following:
```
import urllib2
from bs4 import BeautifulSoup

def getData():

    html = urllib2.urlopen("http://www.racingpost.com/horses2/results/home.sd?r_date=2013-09-22", timeout=10).read().decode('UTF-8')

    soup = BeautifulSoup(html)

    print(soup.title)
    print(soup.find_all('<div class="crBlock ">'))

getData()
```
I want to be able to select everything between <div class="crBlock "> and its correct end </div>. (Obviously there are other div tags but I want to select the block all the way down to the one that represents the end of this section of html.)
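
One possible way to restructure the function above, sketched for Python 3 where `urllib2` has become `urllib.request`. The helper names `fetch_html` and `extract_blocks` are hypothetical, the inline sample stands in for the real page, and `html.parser` is an assumed parser choice. Fetching is kept separate from parsing so that the selection logic can be tried without network access.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and decode it as UTF-8 (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def extract_blocks(html, css_class="crBlock"):
    """Return every div carrying css_class, each with its full subtree."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", class_=css_class)


if __name__ == "__main__":
    # Made-up sample; swap in fetch_html(<page URL>) to run against a live page.
    sample = '<div class="crBlock "><div class="row">Ascot</div></div><div>skip</div>'
    for block in extract_blocks(sample):
        print(block)                    # the block including its own div tag
        print(block.decode_contents())  # only the markup between the tags
```

`str(block)` gives the whole block down to its matching closing tag, while `decode_contents()` gives just what sits between the opening and closing tags.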