How to scrape Instagram account info in Python

Solution 1

Since the content you are looking for is available in the page source, you can fetch it using requests in combination with BeautifulSoup.

Give it a try:

import requests
from bs4 import BeautifulSoup

# Download the public profile page
html = requests.get('https://www.instagram.com/michaeljackson/')
soup = BeautifulSoup(html.text, 'lxml')
# The og:description meta tag holds the follower and following counts
item = soup.select_one("meta[property='og:description']")
# The display name is read from the meta tag just before it
name = item.find_previous_sibling().get("content").split("•")[0]
followers = item.get("content").split(",")[0]
following = item.get("content").split(",")[1].strip()
print(f'{name}\n{followers}\n{following}')

Results:

Name :Michael Jackson
Followers :1.6m
Following :4
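
If you would rather not install BeautifulSoup, the same meta tags can be read with the standard library alone. The following is a minimal sketch, not a drop-in replacement: `MetaPropertyParser`, `extract_meta_properties` and the sample HTML are made up for illustration, and it assumes the page still exposes `og:title` and `og:description` meta tags as described above.

```python
from html.parser import HTMLParser


class MetaPropertyParser(HTMLParser):
    """Collect <meta property="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        if prop and content is not None:
            # Keep the first occurrence if a property is repeated.
            self.properties.setdefault(prop, content)


def extract_meta_properties(html_text):
    """Return a dict mapping meta property names to their content values."""
    parser = MetaPropertyParser()
    parser.feed(html_text)
    return parser.properties


# Made-up sample standing in for a downloaded profile page (assumption).
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example User (@example_user)" />
<meta property="og:description" content="9,019 Followers, 217 Following, 12 Posts" />
</head><body></body></html>
"""

props = extract_meta_properties(SAMPLE_HTML)
print(props["og:title"])
print(props["og:description"])
```

In real use you would pass the text of the downloaded page (for example `html.text` from the snippet above) instead of `SAMPLE_HTML`.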

Solution 2

I don't know why you would like to avoid using BeautifulSoup, since it is actually quite convenient for tasks like this. Something along the following lines should do the job:

import requests
from bs4 import BeautifulSoup

html = requests.get('https://www.instagram.com/cristiano/') # input URL here
soup = BeautifulSoup(html.text, 'lxml')

data = soup.find_all('meta', attrs={'property':'og:description'})
text = data[0].get('content').split()

user = '%s %s %s' % (text[-3], text[-2], text[-1])
followers = text[0]
following = text[2]

print('User:', user)
print('Followers:', followers)
print('Following:', following)

...output:

User: Cristiano Ronaldo (@cristiano)
Followers: 111.5m
Following: 387

Of course, you would need to do some calculations to get an actual (yet truncated) number in cases where the user has more than 1m followers (or is following more than 1m users), which should not be too difficult.
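
The calculation mentioned above could look roughly like this. It is a sketch under assumptions: `parse_count` is a hypothetical helper, and only the `m` suffix appears in the outputs here; `k` and `b` are guesses about other abbreviations the page might use. The result is still only as precise as the abbreviated text.

```python
from decimal import Decimal

# Suffix multipliers; only "m" is confirmed by the outputs above,
# "k" and "b" are assumptions.
_SUFFIXES = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def parse_count(text):
    """Convert strings such as '387', '9,019' or '111.5m' to an int."""
    cleaned = text.strip().lower().replace(",", "")
    multiplier = 1
    if cleaned and cleaned[-1] in _SUFFIXES:
        multiplier = _SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    # Decimal avoids float rounding surprises such as 1.6 * 1_000_000.
    return int(Decimal(cleaned) * multiplier)


print(parse_count("111.5m"))
print(parse_count("387"))
```

For the example above, `parse_count(followers)` would give 111500000, which is the truncated value rather than the exact follower count.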

Solution 3

Alternatively, you can access the information this way (yes, I used BeautifulSoup):

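from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 replacement for urllib.urlopen

# Replace <instagramname> with the account you want to look up
external_sites_html = urlopen('https://www.instagram.com/<instagramname>/?hl=en')
soup = BeautifulSoup(external_sites_html, 'lxml')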

name = soup.find('meta', attrs={'property':'og:title'})
description = soup.find('meta', attrs={'property':'og:description'})

# name of user
nameContent = name.get('content')
# information about followers and following users
descrContent = description.get('content')

From these variables you can extract the information you need. However, the follower count will be imprecise for accounts with more than 1 million followers, because it is abbreviated. If you need the exact number, you may have to use their API.
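
One way to pull the two counts out of `descrContent` is a regular expression. This is only a sketch: it assumes the description has the shape "<followers> Followers, <following> Following, ...", which matches the outputs shown in the other answers but is not guaranteed by Instagram. `parse_description` is a hypothetical helper, and the sample string (including the post count) is made up.

```python
import re

# Assumed layout: "<followers> Followers, <following> Following, ..."
DESCRIPTION_PATTERN = re.compile(
    r"(?P<followers>[\d.,]+[kmb]?)\s+Followers,\s+"
    r"(?P<following>[\d.,]+[kmb]?)\s+Following",
    re.IGNORECASE,
)


def parse_description(descr_content):
    """Return the follower/following strings, or None if the layout differs."""
    match = DESCRIPTION_PATTERN.search(descr_content)
    if match is None:
        return None
    return {
        "followers": match.group("followers"),
        "following": match.group("following"),
    }


# Made-up sample in the assumed layout.
sample = (
    "111.5m Followers, 387 Following, 2,345 Posts - "
    "See Instagram photos and videos from Cristiano Ronaldo (@cristiano)"
)
print(parse_description(sample))
```

Returning `None` instead of raising makes it easy to notice when the page layout has changed and the pattern no longer applies.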

Solution 4

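import requests

username = "cristiano"
# Send a browser-like User-Agent, since the default requests User-Agent may be rejected
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

# The ?__a=1 query parameter asks for the profile data as JSON instead of an HTML page
user_info = requests.get('https://instagram.com/%s/?__a=1' % username, headers=headers)

print(user_info.json())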
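
To reduce the decoded JSON to the three values the question asks for, something like the sketch below might work. The key names (`graphql`, `user`, `edge_followed_by`, `edge_follow`) are an assumption based on responses this endpoint has returned in the past; they are not documented and may have changed, so inspect the real output first. `summarize_profile` and the sample payload are made up for illustration.

```python
def summarize_profile(payload):
    """Pull username and counts out of a decoded profile payload.

    The key path used here is an assumption about the response layout;
    missing keys produce None instead of raising an exception.
    """
    user = payload.get("graphql", {}).get("user", {})
    return {
        "user": user.get("username"),
        "followers": user.get("edge_followed_by", {}).get("count"),
        "following": user.get("edge_follow", {}).get("count"),
    }


# Made-up payload in the assumed layout, using the numbers from the question.
sample_payload = {
    "graphql": {
        "user": {
            "username": "example_user",
            "edge_followed_by": {"count": 9019},
            "edge_follow": {"count": 217},
        }
    }
}

print(summarize_profile(sample_payload))
```

With a live response you would call `summarize_profile(user_info.json())`. If every value comes back as `None`, the layout differs from what is assumed here.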

Author: user53558

Updated on June 04, 2022

Comments

  • user53558
    user53558 about 2 years

    I am trying to do something extremely simple in Python, yet somehow it's very difficult. All I want to do is write a Python script that records the number of people an Instagram user is following, and the number of its followers. That's it.

    Can anyone point me to a good package to do this? Preferably not Beautiful Soup, as that is overly complicated for what I want to do. I just want something like

    [user: example_user, followers:9019, following:217] 
    

    Is there an Instagram specific python library?

    The account I want to scrape is public. This is very simple to do for Twitter.

    Any help is appreciated.

    • Kaushik NP
      Kaushik NP almost 7 years
      Have you checked for Instagram APIs?
  • user53558
    user53558 almost 7 years
    Is the "external_sites_html = " supposed to assign the line under it? Or am I supposed to input something there?
  • JustOneQuestion
    JustOneQuestion almost 7 years
    I copy-pasted that :) It's only used in the line below.
  • Bogota
    Bogota about 4 years
    is there any way I can get those 4 followings?
  • Suraj
    Suraj almost 4 years
    github.com/amitupreti/Instagram-Follower-Scraper You can use the code in this repository.
  • Coder
    Coder over 3 years
    please add an explanation
  • kostek
    kostek about 2 years
    This works, but you will quickly be blocked by Instagram. There are scraping APIs that use proxies to make requests and avoid getting blocked. Here is one of them that has a good tutorial for scraping Instagram: scrapingfish.com/blog/scraping-instagram