How to set session cookie while extracting contents from URLs using beautiful soup?

Have you tried requests?

It is possible to persist cookies across a session.

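import requests
from bs4 import BeautifulSoup

s = requests.Session()
# Log in once; the session keeps any cookies the server sets in its response
s.post('https://example.net/users/101', data={'username': 'sup', 'password': 'pass'})
# Later requests on the same session send those cookies automatically
r = s.get('https://example.net/users/101')
soup = BeautifulSoup(r.text, 'html.parser')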

More about requests.Session():

http://docs.python-requests.org/en/latest/user/advanced/
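If you already know the session cookie (for example, copied from your browser after logging in), one option is to set it on the session directly instead of posting the login form. This is only a sketch: the cookie name "sessionid" and the value "abc123" are placeholders I made up, so substitute whatever the site actually uses. The prepared request is built locally, which lets you check the Cookie header without touching the network.

```python
import requests

URL = "https://example.net/users/101"

session = requests.Session()
# Placeholder name and value (assumptions): use the real cookie from your browser
session.cookies.set("sessionid", "abc123", domain="example.net")

# Build the request without sending it, to inspect the headers it would carry
prepared = session.prepare_request(requests.Request("GET", URL))
print(prepared.headers["Cookie"])
```

Once the header looks right, session.get(URL).text can be passed to BeautifulSoup in the same way as in the snippet above.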

Author: shahnaz shariff

Updated on July 26, 2022

Comments

  • shahnaz shariff (almost 2 years ago)

    Consider the code:

    from bs4 import BeautifulSoup
    from urllib.request import urlopen
    content = urlopen('https://example.net/users/101')
    soup = BeautifulSoup(content, "html.parser")
    div_tags = soup.find_all("div", {"class": "classname"})
    print(div_tags)
    for div in div_tags:
        ul_tags = div.find_all("ul", {"class": "classname"})
        for ul in ul_tags:
            li_tags = ul.find_all("li")
            for li in li_tags:
                name = li.find('a')['href']
                print(name)
    

    If I use

    content = open("try.html","r")
    

    I get the required output.

    Here, example.net can be accessed only after entering a username and password. The code above prints nothing for the live URL, although the parsing itself is correct. How can I add the session cookie value to this code?

  • Jitin (over 3 years ago)
    How do you add multiple cookies?
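On the multiple-cookies question: with requests, one way is to update the session's cookie jar from a dict, so every cookie in it is sent with later requests on that session. A minimal sketch, assuming two made-up cookie names ("sessionid" and "csrftoken") and placeholder values; the request is only prepared, not sent, so you can check the header offline.

```python
import requests

URL = "https://example.net/users/101"

session = requests.Session()
# Placeholder names and values (assumptions): replace with the site's real cookies
session.cookies.update({"sessionid": "abc123", "csrftoken": "xyz789"})

# Build the request without sending it, to see the combined Cookie header
prepared = session.prepare_request(requests.Request("GET", URL))
print(prepared.headers["Cookie"])
```

For a one-off request, the same dict can be passed as requests.get(URL, cookies={...}) instead of being stored on a session.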