Get the code from inspect element using Python
One way to get at the source code of a web page is to use the Beautiful Soup library. A tutorial of this is shown here. The code from the page is shown below, the comments are mine. This particular code does not work as the contents have changed on the site it uses as an example, but the concept should help you to do what you want to do. Hope it helps.
from bs4 import BeautifulSoup
from urllib2 import urlopen
BASE_URL = "http://www.chicagoreader.com"
def get_category_links(section_url):
# Put the stuff you see when using Inspect Element in a variable called html.
html = urlopen(section_url).read()
# Parse the stuff.
soup = BeautifulSoup(html, "lxml")
# The next two lines will change depending on what you're looking for. This
# line is looking for <dl class="boccat">.
boccat = soup.find("dl", "boccat")
# This line organizes what is found in the above line into a list of
# hrefs (i.e. links).
category_links = [BASE_URL + dd.a["href"] for dd in boccat.findAll("dd")]
return category_links
EDIT 1: The solution above provides a general way to web-scrape, but I agree with the comments to the question. The API is definitely the way to go for this site. Thanks to yuvi for providing it. The API is available at https://github.com/500px/PxMagic.
EDIT 2: There is an example of your question regarding getting links to popular photos. The Python code from the example is pasted below. You will need to install the API library.
import fhp.api.five_hundred_px as f
import fhp.helpers.authentication as authentication
from pprint import pprint
key = authentication.get_consumer_key()
secret = authentication.get_consumer_secret()
client = f.FiveHundredPx(key, secret)
results = client.get_photos(feature='popular')
i = 0
PHOTOS_NEEDED = 2
for photo in results:
pprint(photo)
i += 1
if i == PHOTOS_NEEDED:
break
EspenG
Updated on August 21, 2022Comments
-
EspenG almost 2 years
In the Safari browser, I can right-click and select "Inspect Element", and a lot of code appears. Is it possible to get this code using Python? The best solution would be to get a file with the code in it.
More specifically, I am trying to find the links to the images on this page: http://500px.com/popular. I can see the links from "Inspect Element" and I would like to retrieve them with Python.
-
EspenG almost 10 yearsWhen I run: html = urlopen("500px.com/popular").read(), I can't find the links.. Is not the same code as if I right-click in my browser and select "Inspect Element".
-
Sven Marnach almost 10 years@user3465589: The links are late-loaded by some JavaScript embedded in that page. It will be really difficult to get the links unless you simply use the API. Honestly.
-
gary almost 10 years@user3465589 I copied and pasted code from an example written by the author of the API.
-
EspenG almost 10 yearsOkay! But how do I install the API library..?
-
yuvi almost 10 years@user3465589 you don't install an API, you use it (in this case - you need to download and use the client)
-
gary almost 10 yearsRight, some packages can be installed with pip (i.e.
pip install some-package-name
) but this API should just be downloaded and stored to a location where Python can see it.