Get the href text of a link that has a certain class attribute using BeautifulSoup in Python

Solution 1

Use .find() or .find_all() to select the anchor element(s) that have an href attribute and the class Unique_Class_Name. Then iterate over the matches and read each href value:

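from bs4 import BeautifulSoup

# html holds the page source as a string
soup = BeautifulSoup(html, 'html.parser')
# href=True keeps only anchors that actually have an href attribute
anchors = soup.find_all('a', {'class': 'Unique_Class_Name', 'href': True})

for anchor in anchors:
    print(anchor['href'])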
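
As a self-contained sketch of the same lookup, the snippet below wraps it in a function and runs it on the anchor from the question. The function name hrefs_with_class and the two extra sample anchors are made up for illustration; class_ is the keyword form of the class filter.

```python
from bs4 import BeautifulSoup


def hrefs_with_class(html, class_name):
    """Return the href of every <a> that has class_name and an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ is matched against each individual class on the tag,
    # so anchors with several classes are found as well.
    anchors = soup.find_all("a", class_=class_name, href=True)
    return [anchor["href"] for anchor in anchors]


# Sample markup: the first anchor is from the question, the other two are
# hypothetical and only there to show what gets filtered out.
sample_html = """
<a href="Link_I_Need.html" class="Unique_Class_Name">link text</a>
<a href="other.html" class="Some_Other_Class">other link</a>
<a class="Unique_Class_Name">no href here</a>
"""

print(hrefs_with_class(sample_html, "Unique_Class_Name"))  # ['Link_I_Need.html']
```

Returning a list rather than printing inside the loop makes the result easy to reuse, for example to take only the first match.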

You could alternatively use a basic CSS selector with the .select() method:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')

for anchor in soup.select('a.Unique_Class_Name'):
    # skip anchors that have the class but no href
    if anchor.has_attr('href'):
        print(anchor['href'])
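
A possible variant, sketched on the assumption that only anchors with an href matter: the attribute selector [href] can do the filtering inside the CSS selector itself, and .select_one() returns just the first match (or None). The sample markup is the anchor from the question; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

sample_html = '<a href="Link_I_Need.html" class="Unique_Class_Name">link text</a>'
soup = BeautifulSoup(sample_html, "html.parser")

# a.Unique_Class_Name[href] matches anchors that have both the class and an href,
# so no has_attr() check is needed inside the loop.
all_hrefs = [anchor["href"] for anchor in soup.select("a.Unique_Class_Name[href]")]

# select_one() gives the first matching tag, or None when nothing matches.
first = soup.select_one("a.Unique_Class_Name[href]")
first_href = first["href"] if first is not None else None

print(all_hrefs)
print(first_href)
```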

Solution 2

<a class="blueText" href="/info/046386294000000899/?s_bid=046386294000000899&amp;s_sid=FSP-LSR-002&amp;s_fr=V01&amp;s_ck=C01" target="_blank">川村商店</a>

You can get only the link text (here, 川村商店) like this; to get the URL instead, read p['href']:

import bs4
import requests

# url_list is assumed to hold the page URLs to scrape
for url in url_list:
    res = requests.get(url)
    soup = bs4.BeautifulSoup(res.text, "html.parser")
    for p in soup.find_all('a', class_='blueText'):
        print(p.text)
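
If the URL is wanted along with the text, something like the following sketch may help. It parses the sample anchor shown above directly from a string, so no request is made; the names sample_html and links are illustrative. Note that the parser decodes &amp; in the attribute value, so the href comes back with plain & separators.

```python
from bs4 import BeautifulSoup

# The sample anchor from this answer, inlined so the sketch runs offline.
sample_html = (
    '<a class="blueText" '
    'href="/info/046386294000000899/?s_bid=046386294000000899'
    '&amp;s_sid=FSP-LSR-002&amp;s_fr=V01&amp;s_ck=C01" '
    'target="_blank">川村商店</a>'
)

soup = BeautifulSoup(sample_html, "html.parser")

# Collect (link text, href) pairs; .get() returns None if href is missing.
links = [
    (anchor.get_text(strip=True), anchor.get("href"))
    for anchor in soup.find_all("a", class_="blueText")
]

for text, href in links:
    print(text, href)
```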

Author: ddschmitz

Graduate Student at Dakota State University. Cyber Corps recipient. Most recent internship was with Los Alamos National Laboratory. I enjoy computer things.

Updated on July 14, 2022

Comments

  • ddschmitz (almost 2 years ago)

    How do I get just the text of the href in an anchor tag that matches a class? For example, if I have

    <a href="Link_I_Need.html" class="Unique_Class_Name">link text</a>
    

    how can I get the string Link_I_Need.html from only the anchor tag with the class Unique_Class_Name?