Unable to get page source code in Python
Solution 1
I tried it and the request works, but the content you receive says that your browser must accept cookies (in French). You could probably get around that with urllib2, but I think the easiest way would be to use the requests lib (if you don't mind having an additional dependency).
To install requests:
pip install requests
And then in your script:
import requests
url = 'http://france.meteofrance.com/france/meteo?PREVISIONS_PORTLET.path=previsionsville/750560'
response = requests.get(url)
print(response.content)
I'm pretty sure the source code of the page will be what you expect then.
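For the urllib2 route mentioned above, here is a minimal sketch in Python 3, where urllib2 became urllib.request. It assumes the cookie message appears because the default opener does not store cookies, which I have not verified against this site. The names build_cookie_opener and fetch_with_cookies and the default User-Agent string are made up for illustration.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener(user_agent="Mozilla/5.0"):
    """Return (opener, jar); the opener stores cookies in jar and sends them back.

    build_cookie_opener and the default user_agent are illustrative choices.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch_with_cookies(url, opener=None, timeout=10):
    """Fetch url and return the raw response body as bytes."""
    if opener is None:
        opener, _ = build_cookie_opener()
    # Cookies set by a redirect response are kept in the jar and sent
    # with the follow-up request made by the same opener.
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

You would call it as fetch_with_cookies(url) with the url from the script above. Reusing one opener across several calls keeps the cookies between them.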
Solution 2
The requests library worked for me, as Martin Maillard showed.
Also, in another thread I noticed this note by leoluk:
Edit: It's 2014 now, and most of the important libraries have been ported and you should definitely use Python 3 if you can. python-requests is a very nice high-level library which is easier to use than urllib2.
So I wrote this get_page procedure:
import requests

def get_page(website_url):
    # Fetch the page and return the raw response body (bytes).
    response = requests.get(website_url)
    return response.content

print(get_page('http://example.com'))
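A slightly more defensive variant might look like the sketch below. It is only an illustration: get_page_text and the session and timeout parameters are names I made up, and I have not checked whether it gets past the cookie message on the site from the question.

```python
import requests


def get_page_text(website_url, session=None, timeout=10):
    """Fetch a page and return its body as decoded text.

    get_page_text, session and timeout are illustrative names.
    """
    # A Session stores cookies; pass the same one to several calls
    # to reuse those cookies between requests.
    if session is None:
        session = requests.Session()
    response = session.get(website_url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of returning an error page.
    response.raise_for_status()
    # .text is the decoded body; .content would be the raw bytes.
    return response.text
```

Usage is the same as before, for example print(get_page_text('http://example.com')).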
Cheers!
Admin
Updated on June 07, 2022
Comments
Admin almost 2 years
I'm trying to get the source code of a page by using:
import urllib2
url = "http://france.meteofrance.com/france/meteo?PREVISIONS_PORTLET.path=previsionsville/750560"
page = urllib2.urlopen(url)
data = page.read()
print data
and also by using a user_agent (headers).
I did not succeed in getting the source code of the page! Do you have any ideas about what can be done? Thanks in advance.