BeautifulSoup 4, findNext() function

Use find_all on the parent row instead of findNext:

import bs4 as bs
content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
soup = bs.BeautifulSoup(content, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning

for td in soup.find('td', text='Giraffe').parent.find_all('td'):
    print(td.text)

yields

Giraffe
14
7

Or, you could use find_next_siblings (also known as fetchNextSiblings):

for td in soup.find(text='Giraffe').parent.find_next_siblings():
    print(td.text)

yields

14
7
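If you want the two values as data rather than printed lines, the same call can feed a list comprehension. This is only a minimal sketch: it assumes the built-in html.parser and uses string=, the newer spelling of the text= argument (available from bs4 4.4.0). The variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

html = """\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

# Find the label cell itself (a Tag), not just its text.
label = soup.find("td", string="Giraffe")

# Collect every later <td> in the same row and convert the text to int.
values = [int(td.get_text(strip=True)) for td in label.find_next_siblings("td")]
print(values)
```

Passing "td" to find_next_siblings keeps the result limited to table cells in case the row ever contains other tags.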

Explanation:

Note that soup.find(text='Giraffe') returns a NavigableString.

In [30]: soup.find(text='Giraffe')
Out[30]: u'Giraffe'

To get the associated td tag, use

In [31]: soup.find('td', text='Giraffe')
Out[31]: <td id="freistoesse">Giraffe</td>

or

In [32]: soup.find(text='Giraffe').parent
Out[32]: <td id="freistoesse">Giraffe</td>
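A quick way to see the difference between the two kinds of object is to check their types. The sketch below assumes html.parser and a one-line version of the row from the question.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

html = '<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>'
soup = BeautifulSoup(html, "html.parser")

# Searching by text alone returns the string node inside the cell.
text_node = soup.find(string="Giraffe")
print(isinstance(text_node, NavigableString))

# Its parent is the enclosing <td>, which is a Tag and supports find_all etc.
cell = text_node.parent
print(isinstance(cell, Tag), cell.name)
```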

Once you have the td tag, you could use find_next_siblings:

In [35]: soup.find(text='Giraffe').parent.find_next_siblings()
Out[35]: [<td>14</td>, <td>7</td>]
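This also explains why findNext only matched 14: it returns a single element, the first match after the starting point. The sketch below compares it with two plural variants. The second (Zebra) row is invented for illustration, to show that find_all_next keeps going past the end of the row while find_next_siblings does not.

```python
from bs4 import BeautifulSoup

# The Zebra row is a hypothetical addition, not part of the question's HTML.
html = """\
<table>
<tr><td>Giraffe</td><td>14</td><td>7</td></tr>
<tr><td>Zebra</td><td>3</td><td>9</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")
cell = soup.find("td", string="Giraffe")

# find_next: one element only, the first <td> after the starting cell.
first = cell.find_next("td").text

# find_next_siblings: every later <td> with the same parent <tr>.
same_row = [td.text for td in cell.find_next_siblings("td")]

# find_all_next: every later <td> in the whole document.
all_later = [td.text for td in cell.find_all_next("td")]

print(first)      # 14
print(same_row)   # ['14', '7']
print(all_later)  # ['14', '7', 'Zebra', '3', '9']
```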

PS. BeautifulSoup has added method names that use underscores instead of CamelCase. They do the same thing, but conform to the PEP 8 style guide recommendations. Thus, prefer find_next_siblings over fetchNextSiblings.
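To check that the two spellings really behave the same, you can call both on the same tag and compare the results. This is a sketch assuming html.parser. The warnings filter is a precaution, since some newer bs4 releases may emit a DeprecationWarning for the legacy names.

```python
import warnings

from bs4 import BeautifulSoup

html = "<tr><td>Giraffe</td><td>14</td><td>7</td></tr>"
soup = BeautifulSoup(html, "html.parser")
cell = soup.find("td", string="Giraffe")

with warnings.catch_warnings():
    # Silence any deprecation notice about the old method name.
    warnings.simplefilter("ignore")
    legacy = [td.text for td in cell.fetchNextSiblings("td")]

modern = [td.text for td in cell.find_next_siblings("td")]

print(legacy == modern)
```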

Author by

nutship

Updated on March 20, 2020

Comments

  • nutship
    nutship about 4 years

    I'm playing with BeautifulSoup 4 and I have this html code:

    </tr>
              <tr>
    <td id="freistoesse">Giraffe</td>
    <td>14</td>
    <td>7</td>
    </tr>
    

    I want to match both values between <td> tags so here 14 and 7.

    I tried this:

    giraffe = soup.find(text='Giraffe').findNext('td').text
    

    but this only matches 14. How can I match both values with this function?

  • nutship
    nutship about 11 years
    thanks, dunno why #1 method raised me an error: AttributeError: 'NavigableString' object has no attribute 'find_all' ideas?
  • unutbu
    unutbu about 11 years
    soup.find(text='Giraffe') returns a NavigableString. Using soup.find('td', text='Giraffe') gives you the td tag instead. From there, call find_next_siblings().
  • nutship
    nutship about 11 years
    Thanks a ton for this quick help!