Scraping from dropdown option value Python BeautifulSoup

Solution 1

You can still use find() and find_all() (findAll() is the older alias) to do this.

from bs4 import BeautifulSoup

html = """
<table style="font-size:14px">
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
</table>
"""

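# Parse the sample markup (the "lxml" parser requires the lxml package).
soup = BeautifulSoup(html, "lxml")

# Locate the options through the dropdown's name attribute...
option = soup.find("selected", {"name": "try"}).find_all("option")
# ...or through the enclosing table's style attribute.
option_ = soup.find("table", {"style": "font-size:14px"}).find_all("option")
print(option)
print(option_)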
#[<option value="G1">1</option>, <option value="G2">2</option>]
#[<option value="G1">1</option>, <option value="G2">2</option>]
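
Once the option tags are found, the value attribute and the visible text can be read from each one. The following is a minimal sketch, assuming the sample markup from the question (the <selected> tag name is kept as posted) and the built-in html.parser; the names dropdown and value_to_text are illustrative only.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; <selected> is kept as posted.
html = """
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
"""

soup = BeautifulSoup(html, "html.parser")
dropdown = soup.find("selected", {"name": "try"})

# Map each option's value attribute to its visible text.
value_to_text = {
    opt["value"]: opt.get_text(strip=True)
    for opt in dropdown.find_all("option")
}
print(value_to_text)
# {'G1': '1', 'G2': '2'}
```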

Solution 2

Try a CSS attribute selector:

soup.select('option[value]')

The [] is an attribute selector. Here it matches option elements that have a value attribute. If the page contains more than one dropdown, a class or id on the parent element can be used to narrow the match to the one you want.

items = soup.select('option[value]')
values = [item.get('value') for item in items]
textValues = [item.text for item in items]

To limit the match to one dropdown, combine the parent's name attribute with a descendant combinator. Test this against the real page, since further narrowing may be needed:

items = soup.select('[name=try] option[value]')
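
The difference between the unscoped and the scoped selector can be seen in a short sketch. It assumes the markup from the question plus a second, hypothetical dropdown named "other" that is not part of the original page, and it uses the built-in html.parser.

```python
from bs4 import BeautifulSoup

# First dropdown is from the question; the second one ("other") is a
# hypothetical addition to show why scoping matters.
html = """
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
<selected name="other">
<option value="X1">9</option>
</selected>
"""

soup = BeautifulSoup(html, "html.parser")

# Unscoped: every option with a value attribute, from both dropdowns.
all_items = soup.select("option[value]")

# Scoped: only options that descend from the element with name="try".
scoped_items = soup.select("[name=try] option[value]")

values = [item.get("value") for item in scoped_items]
texts = [item.text for item in scoped_items]
print(len(all_items), values, texts)
# 3 ['G1', 'G2'] ['1', '2']
```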
Author: Ilham Riski

Updated on June 04, 2022

Comments

  • Ilham Riski
    Ilham Riski almost 2 years

    I am trying to scrape data from a web page that has a dropdown input, using BeautifulSoup.

    These are the dropdown values:

    <selected name="try">
    <option value="G1">1</option>
    <option value="G2">2</option>
    </selected>
    

    This is what I tried:

    soup = BeautifulSoup(url, 'html.parser')
    soup['selected'] = 'G1'
    data = soup.findAll("table", {"style": "font-size:14px"})
    print(data)
    

    Submitting each dropdown option returns data in a <table> tag, but my code only gets the <table> for the main page. How do I get the data for each dropdown option?

  • Ilham Riski
    Ilham Riski over 5 years
    soup.select('option[G1]') like this?
  • QHarr
    QHarr over 5 years
    Is there a parent id/class for the drop down?
  • Ilham Riski
    Ilham Riski over 5 years
    <selected name="try"> @QHarr
  • Ilham Riski
    Ilham Riski over 5 years
    I mean, can I get the table for G1 and G2 based on user input? I already have the dropdown list: <selected name=try> <option value="G1">1</option> <option value="G2">2</option>
  • QHarr
    QHarr over 5 years
    The above will extract the text in those drop downs for each option as well as the values of the value attribute.
  • QHarr
    QHarr over 5 years
    If there is a table somewhere then you need to show more HTML / share the URL if possible.
  • Ilham Riski
    Ilham Riski over 5 years
    It only gets the option value and text, not the table tag.
  • QHarr
    QHarr over 5 years
    There is no table tag in the html you have shown. Please update to include the full relevant HTML.
  • QHarr
    QHarr over 5 years
    Still no table tag. You have a selected tag and option tag elements.
  • Ilham Riski
    Ilham Riski over 5 years
    w3schools.com/jsref/tryit.asp?filename=tryjsref_select_value2 Take this example and suppose that the fruit is the <table> tag.
  • QHarr
    QHarr over 5 years
    There is no table tag. For that page the selector would be soup.select('#mySelect option[value]'). My answer above does the same thing; since you specified no id for the parent, I used the name attribute you gave. That page has a parent select tag, as is appropriate for a selectable dropdown list.
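
The id-based selector from the last comment can be sketched as follows. Only the id mySelect comes from the thread; the markup and option values below are hypothetical stand-ins, not a copy of the linked page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a standard <select> element with id="mySelect".
html = """
<select id="mySelect">
  <option value="apple">Apple</option>
  <option value="orange">Orange</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# "#mySelect" matches the element by id; "option[value]" then matches its
# descendant options that carry a value attribute.
pairs = [
    (opt["value"], opt.get_text(strip=True))
    for opt in soup.select("#mySelect option[value]")
]
print(pairs)
# [('apple', 'Apple'), ('orange', 'Orange')]
```

An id is usually the most reliable hook when the page provides one, because ids are meant to be unique within a document.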