Get contents of div by id with BeautifulSoup
Solution 1
Join the elements of the div tag's .contents:
from bs4 import BeautifulSoup
data = """
<div id='theDiv'>
<p>div content</p>
<p>div stuff</p>
<p>div thing</p>
</div>
"""
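# Name the parser explicitly so bs4 does not have to guess (and warn)
soup = BeautifulSoup(data, 'html.parser')
div = soup.find('div', id='theDiv')
# print() with parentheses works on both Python 2 and Python 3
print(''.join(map(str, div.contents)))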
Prints:
<p>div content</p>
<p>div stuff</p>
<p>div thing</p>
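As a rough sketch of why the join works: .contents is a list of the div's direct children, which here is a mix of Tag objects (the p elements) and NavigableString objects (the newlines between them). The variable name inner_html and the choice of 'html.parser' are my own assumptions, not part of the original answer.

```python
from bs4 import BeautifulSoup

data = """
<div id='theDiv'>
<p>div content</p>
<p>div stuff</p>
<p>div thing</p>
</div>
"""

# 'html.parser' is the parser bundled with Python; assumed here for portability.
soup = BeautifulSoup(data, "html.parser")
div = soup.find("div", id="theDiv")

# Each child is either a Tag (<p>...</p>) or a NavigableString (whitespace).
for child in div.contents:
    print(type(child).__name__, repr(str(child)))

# Joining the string form of every child rebuilds the inner HTML.
inner_html = "".join(str(child) for child in div.contents)

# The result keeps the leading and trailing newlines, so strip if unwanted.
print(inner_html.strip())
```

If the id does not exist, soup.find() returns None, so it is worth checking for that before touching .contents.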
Solution 2
Since version 4.0.1, Beautiful Soup has had a decode_contents() method that returns a tag's inner HTML as a string:
>>> soup = BeautifulSoup("""
<div id='theDiv'>
<p>div content</p>
<p>div stuff</p>
<p>div thing</p>
""")
>>> print(soup.div.decode_contents())
<p>div content</p>
<p>div stuff</p>
<p>div thing</p>
More details in a solution to this question: https://stackoverflow.com/a/18602241/237105
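For comparison, here is a small sketch of how decode_contents() relates to a couple of neighbouring calls. The sample markup and the variable names inner, outer and raw are made up for illustration; encode_contents() is the bytes counterpart and, as far as I know, defaults to UTF-8.

```python
from bs4 import BeautifulSoup

html = "<div id='theDiv'><p>div content</p><p>div stuff</p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", id="theDiv")

inner = div.decode_contents()  # str: only the children of the div
outer = str(div)               # str: the div tag itself plus its children
raw = div.encode_contents()    # bytes: the children, encoded

print(inner)
print(outer)
print(raw)
```

So decode_contents() is the one to reach for when you want the markup inside the div but not the wrapping div tag.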
Author: user8028
Updated on July 10, 2020
Comments
user8028 (almost 4 years):
I am using Python 2.7.6, urllib2, and BeautifulSoup to extract HTML from a website and store it in a variable.
How can I show just the HTML contents of a div with a given id using BeautifulSoup? For example, for
<div id='theDiv'> <p>div content</p> <p>div stuff</p> <p>div thing</p> </div>
the result would be
<p>div content</p> <p>div stuff</p> <p>div thing</p>