Using BeautifulSoup to select div blocks within HTML
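The correct use would be:
soup.find_all('div', class_="crBlock")
The first argument is the tag name, and class_ filters on the class attribute. Beautiful Soup splits class values on whitespace, so "crBlock" matches class="crBlock " even though the markup has a trailing space. Passing "crBlock " with the trailing space in the filter can fail to match in some versions.
By default, Beautiful Soup returns the entire tag, including its contents, so each result covers everything down to the matching closing </div>. Store the result in a variable and you can search or modify it further. If you are only looking for one div, you can use find() instead. For instance:
div = soup.find('div', class_="crBlock")
print(div.find_all(text='foobar'))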
Check out the documentation page for more info on all the filters you can use.
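As a minimal sketch of the calls above, the snippet below runs them against a small made-up HTML fragment. The markup and the names html, blocks and first are illustrative assumptions, not the real page. It is meant to show that the returned tag includes its nested divs, so nothing extra is needed to find the "correct" closing tag.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page is much larger.
html = """
<div id="page">
  <div class="crBlock ">
    <h3>Race one</h3>
    <div class="inner">foobar</div>
  </div>
  <div class="crBlock ">
    <h3>Race two</h3>
  </div>
  <div class="other">ignored</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class values are split on whitespace, so "crBlock" matches class="crBlock ".
blocks = soup.find_all("div", class_="crBlock")
print(len(blocks))

# find() returns only the first match (or None if nothing matches).
first = soup.find("div", class_="crBlock")

# The nested div is part of the returned tag, so it can be searched directly.
print(first.find("div", class_="inner").get_text())

# string= is the newer spelling of the text= argument (bs4 4.4 and later).
print(first.find_all(string="foobar"))
```

Each element of blocks is a Tag object, so str(blocks[0]) or blocks[0].prettify() gives the markup of that block, and blocks[0].get_text() gives only its text.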
SMNALLY
Updated on June 04, 2022
Comments
-
SMNALLY almost 2 years
I am trying to parse several div blocks with Beautiful Soup, using HTML from a website. However, I cannot work out which function should be used to select these div blocks. I have tried the following:
import urllib2
from bs4 import BeautifulSoup

def getData():
    html = urllib2.urlopen("http://www.racingpost.com/horses2/results/home.sd?r_date=2013-09-22", timeout=10).read().decode('UTF-8')
    soup = BeautifulSoup(html)
    print(soup.title)
    print(soup.find_all('<div class="crBlock ">'))

getData()
I want to be able to select everything between <div class="crBlock "> and its matching closing </div>. (Obviously there are other div tags, but I want to select the block all the way down to the one that represents the end of this section of HTML.)
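For readers on Python 3, where urllib2 no longer exists, here is one possible rework of the attempt above. It is a sketch under stated assumptions: fetch_html and get_blocks are hypothetical helper names, the sample markup is invented, and the download is left uncalled so the example runs offline. The key change is that find_all() receives a tag name plus a class filter. A string such as '<div class="crBlock ">' is treated as a tag name, so it matches nothing.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and decode it as UTF-8 (hypothetical helper, Python 3)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def get_blocks(html):
    """Return every <div> whose class list contains crBlock (hypothetical helper)."""
    # Naming the parser explicitly avoids the "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", class_="crBlock")


# Offline demonstration on invented markup with a nested <div>.
sample = '<div class="crBlock "><div><p>a</p></div><p>b</p></div><div>c</div>'
blocks = get_blocks(sample)
print(len(blocks))
print(blocks[0].get_text())
```

To run it against a live page, pass the result of fetch_html(url) to get_blocks(). Whether the original URL still serves the same markup is not something this sketch assumes.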