BeautifulSoup 4, findNext() function
Use find_all instead of findNext:
import bs4 as bs
content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
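# Name the parser explicitly so the result does not depend on what is installed
soup = bs.BeautifulSoup(content, 'html.parser')
# Find the label cell, step up to its row, then list every cell in that row
for td in soup.find('td', text='Giraffe').parent.find_all('td'):
    print(td.text)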
yields
Giraffe
14
7
Or, you could use find_next_siblings (also known as fetchNextSiblings):
# .parent is the td holding the text; its later siblings are the value cells
for td in soup.find(text='Giraffe').parent.find_next_siblings():
    print(td.text)
yields
14
7
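The sibling approach above can be wrapped in a small helper. This is only a sketch: the function name values_after_label is made up for illustration, and it uses string= (the name that newer Beautiful Soup 4 releases use for the text= argument) together with the built-in html.parser.

```python
from bs4 import BeautifulSoup

HTML = """\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>"""


def values_after_label(soup, label):
    """Return the text of every td that follows the td whose text is `label`.

    Hypothetical helper, not part of Beautiful Soup.
    """
    cell = soup.find("td", string=label)
    if cell is None:
        # No cell with that label: return an empty list instead of raising
        return []
    return [td.get_text(strip=True) for td in cell.find_next_siblings("td")]


soup = BeautifulSoup(HTML, "html.parser")
values = values_after_label(soup, "Giraffe")
print(values)
```

Returning an empty list for a missing label is a design choice made here for convenience; raising an exception would be just as reasonable.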
Explanation:
Note that soup.find(text='Giraffe') returns a NavigableString.
In [30]: soup.find(text='Giraffe')
Out[30]: u'Giraffe'
To get the associated td tag, use
In [31]: soup.find('td', text='Giraffe')
Out[31]: <td id="freistoesse">Giraffe</td>
or
In [32]: soup.find(text='Giraffe').parent
Out[32]: <td id="freistoesse">Giraffe</td>
Once you have the td tag, you could use find_next_siblings:
In [35]: soup.find(text='Giraffe').parent.find_next_siblings()
Out[35]: [<td>14</td>, <td>7</td>]
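The distinction between the text node and its enclosing tag can be checked directly. The following is a minimal sketch, assuming a recent Beautiful Soup 4 (where string= is the current name for text=) and the built-in html.parser; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

HTML = '<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>'
soup = BeautifulSoup(HTML, "html.parser")

# Searching by string alone returns the text node, not the tag around it
text_node = soup.find(string="Giraffe")
# Searching by tag name plus string returns the td itself
cell = soup.find("td", string="Giraffe")

is_string = isinstance(text_node, NavigableString)
is_tag = isinstance(cell, Tag)
# The text node's parent is the very same td object
same_cell = text_node.parent is cell

# Sibling navigation on the td yields the remaining cells of the row
siblings = [td.get_text() for td in cell.find_next_siblings("td")]
print(is_string, is_tag, same_cell, siblings)
```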
PS. BeautifulSoup has added method names that use underscores instead of CamelCase. They do the same thing, but conform to the PEP8 style guide recommendations. Thus, prefer find_next_siblings over fetchNextSiblings.
Author: nutship
Updated on March 20, 2020
Comments
nutship about 4 years
I'm playing with BeautifulSoup 4 and I have this html code:
</tr> <tr> <td id="freistoesse">Giraffe</td> <td>14</td> <td>7</td> </tr>
I want to match both values between the <td> tags, so here 14 and 7. I tried this:
giraffe = soup.find(text='Giraffe').findNext('td').text
but this only matches 14. How can I match both values with this function?
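For context on why the attempt above returns a single value: find_next (the underscore name for findNext) stops at the first match, while find_all_next collects every later match in document order. The sketch below assumes a recent Beautiful Soup 4 and html.parser, and adds a second, made-up row (Zebra) purely to show that find_all_next does not stop at the end of the row.

```python
from bs4 import BeautifulSoup

# The Zebra row is hypothetical; it is not part of the original question
HTML = """\
<table>
<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>
<tr><td>Zebra</td><td>3</td><td>9</td></tr>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
label = soup.find(string="Giraffe")

# find_next returns only the first td after the text node
first = label.find_next("td").get_text()

# find_all_next returns every later td, including those in following rows
all_later = [td.get_text() for td in label.find_all_next("td")]

# Sibling navigation from the enclosing td stays inside the row
same_row = [td.get_text() for td in label.parent.find_next_siblings("td")]

print(first, all_later, same_row)
```

When only the values from the same row are wanted, the sibling-based call is the safer of the two.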
nutship about 11 years
Thanks, dunno why the #1 method raised an error:
AttributeError: 'NavigableString' object has no attribute 'find_all'
Any ideas?
unutbu about 11 years
soup.find(text='Giraffe') returns a NavigableString. Using soup.find('td', text='Giraffe') gives you the td tag instead. From there, call fetchNextSiblings().
nutship about 11 years
Thanks a ton for this quick help!