Using an HTTP PROXY - Python
Solution 1
You can do it even without the HTTP_PROXY environment variable. Try this sample (Python 2, urllib2):
import urllib2
proxy_support = urllib2.ProxyHandler({"http":"http://61.233.25.166:80"})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)
html = urllib2.urlopen("http://www.google.com").read()
print html
In your case it really seems that the proxy server is refusing the connection.
Something more to try:
import urllib2
#proxy = "61.233.25.166:80"
proxy = "YOUR_PROXY_GOES_HERE"
proxies = {"http":"http://%s" % proxy}
url = "http://www.google.com/search?q=test"
headers={'User-agent' : 'Mozilla/5.0'}
proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=1))
urllib2.install_opener(opener)
req = urllib2.Request(url, None, headers)
html = urllib2.urlopen(req).read()
print html
Edit 2014:
This seems to be a popular question and answer. However, today I would use the third-party requests module instead.
For one request just do:
import requests
r = requests.get("http://www.google.com",
proxies={"http": "http://61.233.25.166:80"})
print(r.text)
For multiple requests, use a Session object so you do not have to pass the proxies parameter in every request:
import requests
s = requests.Session()
s.proxies = {"http": "http://61.233.25.166:80"}
r = s.get("http://www.google.com")
print(r.text)
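Note that requests picks the proxy by URL scheme, so a proxies dict with only an "http" key leaves https:// URLs going out directly (unless an https_proxy environment variable is set). Here is a minimal sketch that covers both schemes and also sets a timeout. The proxy address is a placeholder, and build_proxies and fetch_via_proxy are hypothetical helper names, not part of requests:

```python
def build_proxies(proxy_url):
    """Route both plain HTTP and HTTPS traffic through the same proxy."""
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch url through proxy_url, giving up after `timeout` seconds."""
    import requests  # third-party; imported lazily so build_proxies works without it

    with requests.Session() as session:
        session.proxies.update(build_proxies(proxy_url))
        response = session.get(url, timeout=timeout)
        response.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
        return response.text


# Example call (placeholder proxy address, needs network access):
# print(fetch_via_proxy("https://www.google.com", "http://61.233.25.166:80"))
```

If the proxy is unreachable, the call should fail with a requests exception rather than silently fetching the page directly, which is a quick way to confirm the proxy is really being used.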
Solution 2
I recommend you just use the requests module.
It is much easier than the built-in HTTP clients: http://docs.python-requests.org/en/latest/index.html
Sample usage:
import requests
r = requests.get('http://www.thepage.com', proxies={"http": "http://myproxy:3129"})
thedata = r.content  # raw response body as bytes
Solution 3
I just wanted to mention that you may also have to set the https_proxy OS environment variable if HTTPS URLs need to be accessed.
This was not obvious to me, and it took me hours to discover.
My use case: Win 7, jython-standalone-2.5.3.jar, setuptools installation via ez_setup.py
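To check which proxies Python will actually pick up from the environment, something like the following may help. It is a sketch for Python 3's urllib.request (not the Jython 2.5 setup above). The proxy address is a placeholder, and the variables are set inside the script only for demonstration; normally you would set them in the OS or shell.

```python
import os
import urllib.request

# Placeholder proxy address; replace with your own proxy.
PROXY = "http://127.0.0.1:8080"

# Normally set outside Python (shell or system settings); set here for the demo.
os.environ["http_proxy"] = PROXY
os.environ["https_proxy"] = PROXY  # needed for https:// URLs

# getproxies() reports the scheme-to-proxy mapping urllib uses by default.
detected = urllib.request.getproxies()
print("http  ->", detected.get("http"))
print("https ->", detected.get("https"))
```

If the "https" entry comes back empty, https:// requests will not go through the proxy, which matches the symptom described above.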
Solution 4
Python 3:
import urllib.request
url = "http://www.google.com"  # example URL; replace with the page you want
htmlsource = urllib.request.FancyURLopener({"http":"http://127.0.0.1:8080"}).open(url).read().decode("utf-8")
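Since FancyURLopener is deprecated, the same thing can probably be done with ProxyHandler and build_opener, which mirror the urllib2 calls from Solution 1. This is a sketch, assuming a local proxy at 127.0.0.1:8080 as above; make_proxy_opener and fetch are hypothetical helper names:

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, proxy_url, timeout=10):
    """Return the decoded body of url, fetched through the proxy."""
    opener = make_proxy_opener(proxy_url)
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener.open(request, timeout=timeout) as response:
        # errors="replace" avoids a crash if the page is not valid UTF-8
        return response.read().decode("utf-8", errors="replace")


# Example call (needs a proxy actually listening on that address):
# print(fetch("http://www.google.com", "http://127.0.0.1:8080"))
```

The timeout argument is passed straight to opener.open, so a dead proxy fails after a few seconds instead of hanging.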
Comments
-
RadiantHex about 4 years
I am familiar with the fact that I should set the HTTP_PROXY environment variable to the proxy address.
Generally urllib works fine, the problem is dealing with urllib2.
>>> urllib2.urlopen("http://www.google.com").read()
returns
urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>
or
urllib2.URLError: <urlopen error [Errno 11004] getaddrinfo failed>
Extra info:
urllib.urlopen(....) works fine! It is just urllib2 that is playing tricks...
I tried @Fenikso's answer but I'm getting this error now:
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
Any ideas?
-
Fenikso about 13 years
Can you post the complete sample code that gives you the error?
-
RadiantHex about 13 years
@Fenikso: this
urllib2.urlopen("http://www.google.com").read()
-
Fenikso about 13 years
So you have the proxy server set in the HTTP_PROXY environment variable? Are you sure that server accepts the connection?
-
RadiantHex about 13 years
Thanks for the reply! :) Now I'm getting
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
... urllib works perfectly though.
-
Fenikso about 13 years
@RadiantHex - Works fine on my system. Do you have any proxy you have to use for internet access?
-
Fenikso about 13 years
@RadiantHex - Also, what type of proxy do you use?
-
RadiantHex about 13 years
@Fenikso: I do have to use an HTTP proxy for internet access, and it is the same one I use for all my software. It is the same proxy I have set in the HTTP_PROXY variable.
-
Fenikso about 13 years
@RadiantHex - Try setting another user-agent and switch debug mode on. I have updated my answer.
-
RadiantHex about 13 years
@Fenikso - Works :| THANKS!!! :)
-
Fenikso about 13 years
@RadiantHex - So was it the proxy refusing connection because of user-agent?
-
User about 10 years
How do you set the timeout?
-
Heidi over 8 years
Wonderful. This works with both https and http, whereas urllib only works with http for me with python3.
-
Phillip over 7 years
I thought this was working for me, but I tried putting in random proxy information, and data was still retrieved each time (as long as https was used).
-
Hrvoje T almost 6 years
Is there a proxy for FTP?
-
bit_scientist almost 6 years
@Fenikso, what is the "http": "http://61.233.25.166:80" line in the proxies argument? Is it going to be my IP address?
-
Fenikso over 5 years
@voo_doo it is the address and port of the proxy you want to use.
-
IFink over 3 years
From the traceback: DeprecationWarning: FancyURLopener style of invoking requests is deprecated. Use newer urlopen functions/methods.
-
Safeer Abbas over 3 years
Did you IP-bind the proxy, or is the proxy allowing your access?