Download a file via a proxy in Java
Solution 1
One option is the Apache HttpClient library, which handles most proxy-related issues. To compile the code below, you can use the following Maven POM:
Maven:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>stackoverflow.test</groupId>
    <artifactId>proxyhttp</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>proxy</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.1</version>
        </dependency>
    </dependencies>
</project>
Java code:
import org.apache.http.HttpHost;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

/**
 * How to send a request via proxy.
 *
 * @since 4.0
 */
public class ClientExecuteProxy {

    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            HttpHost target = new HttpHost("www.google.com", 80, "http");
            HttpHost proxy = new HttpHost("127.0.0.1", 8889, "http");
            RequestConfig config = RequestConfig.custom()
                    .setProxy(proxy)
                    .build();
            HttpGet request = new HttpGet("/");
            request.setConfig(config);
            System.out.println("Executing request " + request.getRequestLine() + " to " + target + " via " + proxy);
            CloseableHttpResponse response = httpclient.execute(target, request);
            try {
                System.out.println("----------------------------------------");
                System.out.println(response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }
    }
}
Solution 2
The following differs from the other answers and works for me: set these system properties before opening the connection:
System.setProperty("http.proxySet", "true");
System.setProperty("http.proxyHost", "my.proxy.com");
System.setProperty("http.proxyPort", "8080"); // the port is a String, not an int
Then, open the URLConnection and try to download the file.
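The two steps above (set the properties, then open the connection) could be combined roughly as follows. This is only a sketch: the class name SystemProxyDownload and the methods useHttpProxy and download are hypothetical names chosen for illustration, and setting the https.* properties alongside the http.* ones is an assumption that the same proxy should also serve https:// URLs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
class SystemProxyDownload {

    /** Sets the JVM-wide proxy properties read by the built-in HTTP handler. */
    static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
        // Assumption: https:// URLs should go through the same proxy.
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", Integer.toString(port));
    }

    /** Streams the resource at source into target and returns the number of bytes written. */
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call might look like useHttpProxy("my.proxy.com", 8080) followed by download(new URL("http://www.example.com/example.pdf"), Paths.get("example.pdf")). Note that system properties apply to the whole JVM, so this approach does not fit a program that needs a different proxy per thread.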
Solution 3
To set a proxy programmatically:
SocketAddress addr = new InetSocketAddress("my.proxy.com", 8080);
Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
URL url = new URL("http://my.real.url.com/");
URLConnection conn = url.openConnection(proxy);
Then you can use your code above with the URLConnection returned on the last line. You can also use a SOCKS proxy, or force no proxy, if you so desire.
This was taken (and slightly edited) from this Oracle documentation.
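As a rough illustration of how the proxied URLConnection can be turned into a file on disk, here is a minimal sketch. The class ProxyDownload, its methods proxyOf and download, and the timeout values are all hypothetical choices made for this example; only the java.net and java.nio.file calls come from the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
class ProxyDownload {

    /** Describes an HTTP or SOCKS proxy; no connection is opened here. */
    static Proxy proxyOf(Proxy.Type type, String host, int port) {
        return new Proxy(type, InetSocketAddress.createUnresolved(host, port));
    }

    /** Opens source through the given proxy and copies the response body to target. */
    static long download(URL source, Proxy proxy, Path target) throws IOException {
        URLConnection conn = source.openConnection(proxy);
        conn.setConnectTimeout(10_000); // assumed value: 10 s to reach the proxy
        conn.setReadTimeout(30_000);    // assumed value: 30 s between reads
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

For example, download(url, proxyOf(Proxy.Type.HTTP, "my.proxy.com", 8080), Paths.get("example.pdf")) would save the response body to example.pdf. Because the proxy is passed per connection, each thread can use its own proxy, and Proxy.NO_PROXY forces a direct connection.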
Solution 4
Another approach is to implement the proxy "inside" each instance of HttpURLConnection. That is:
- Do not connect to the real URL you want. First, connect to the proxy IP and port, but send an HTTP GET request that refers to the full URL you want.
- Use setRequestProperty to set the Host header to your URL's host, along with any other headers you may need.
If it works, the proxy will transparently send the file to you.
I have some code that worked with Sockets.
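import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

try (Socket sock = new Socket("10.0.241.1", 3128)) { // proxy IP and port; closed automatically
    InputStream is = sock.getInputStream();
    OutputStream os = sock.getOutputStream();
    String str = "GET http://www.uol.com.br HTTP/1.1\r\n"; // GET your site (full URL)
    str += "Host: www.uol.com.br\r\n"; // again, the host of your site
    str += "Proxy-Authorization: Basic ZWR1YXJkby5wb2NvOmM1NmQyMw==\r\n"; // if a password is needed
    str += "Connection: close\r\n"; // ask for the connection to be closed so that read() eventually returns -1
    str += "\r\n";
    os.write(str.getBytes(StandardCharsets.US_ASCII));
    os.flush();
    byte[] bb = new byte[1024];
    int n;
    while ((n = is.read(bb)) != -1) {
        // write bytes to file stream...
        // note: the stream starts with the HTTP status line and headers, not with the file content
    }
} catch (Exception ex) {
    // exception handling...
}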
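The request text in the code above is assembled by hand, including a pre-encoded Proxy-Authorization value. A small sketch of how that text might be built from a URL and proxy credentials is shown below; the class ProxyRequest and its methods basicCredentials and buildGet are hypothetical names for illustration, and the Connection: close header is an assumption so that the proxy ends the stream after one response.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the original answer.
class ProxyRequest {

    /** Encodes "user:password" as the value of a Basic authorization header. */
    static String basicCredentials(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    /** Builds a GET request in the form a proxy expects: the request line carries the full URL. */
    static String buildGet(URI target, String proxyUser, String proxyPassword) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(target.toASCIIString()).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(target.getHost());
        if (target.getPort() != -1) {
            sb.append(':').append(target.getPort());
        }
        sb.append("\r\n");
        if (proxyUser != null) {
            // Only sent when the proxy requires a password.
            sb.append("Proxy-Authorization: ")
              .append(basicCredentials(proxyUser, proxyPassword))
              .append("\r\n");
        }
        sb.append("Connection: close\r\n"); // assumption: one request per connection
        sb.append("\r\n");                  // blank line ends the header section
        return sb.toString();
    }
}
```

The returned string can be written to the socket's OutputStream in place of the hand-built str; passing null as proxyUser omits the Proxy-Authorization header.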
"Why would somebody use pure sockets when one could use httpUrlConnection?", you say. Well, by that time, I didn't know about httpUrlConnection.
Exagon
Updated on June 08, 2022
Comments
-
Exagon almost 2 years
I have a problem downloading a file from a URL like www.example.com/example.pdf via a proxy and saving it to the filesystem in Java. Does anybody have an idea how this could work? If I get the InputStream, I can simply save it to the filesystem with this:
final ReadableByteChannel rbc = Channels.newChannel(httpUrlConnetion.getInputStream());
final FileOutputStream fos = new FileOutputStream(file);
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
fos.close();
But how do I get the InputStream of a URL via a proxy? If I do it like this:
SocketAddress addr = new InetSocketAddress("my.proxy.com", 8080);
Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
URL url = new URL("http://my.real.url.com/");
URLConnection conn = url.openConnection(proxy);
I get this exception:
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read1(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
    at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at app.model.mail.crawler.newimpl.FileLoader.getSourceOfSiteViaProxy(FileLoader.java:167)
    at app.model.mail.crawler.newimpl.FileLoader.process(FileLoader.java:220)
    at app.model.mail.crawler.newimpl.FileLoader.run(FileLoader.java:57)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Using this:
final HttpURLConnection httpUrlConnetion = (HttpURLConnection) website.openConnection(proxy);
httpUrlConnetion.setDoOutput(true);
httpUrlConnetion.setDoInput(true);
httpUrlConnetion.setRequestProperty("Content-type", "text/xml");
httpUrlConnetion.setRequestProperty("Accept", "text/xml, application/xml");
httpUrlConnetion.setRequestMethod("POST");
httpUrlConnetion.connect();
I am able to download the source of a site, which is HTML, but not a file. Maybe someone could help me with the properties I have to set for downloading a file.
-
Exagon over 8 years
If I do it like this, I get an exception. See my question again; I will edit it.
-
Eric Galluzzo over 8 years
Unfortunately it's difficult to tell why the connection would be reset in your case. Have you tried accessing the URL in a browser, with the same proxy settings, and ensured that it works there? Are you using the right type of proxy (SOCKS vs. HTTP)?
-
Exagon over 8 years
I am using a SOCKS proxy. Yes, I did, and it worked... I have now tried a lot of other sites, but it never worked.
-
Eric Galluzzo over 8 years
Did you change the Proxy.Type.HTTP in the code to Proxy.Type.SOCKS? You might try both just in case.
-
Eric Galluzzo over 8 years
Hmmm, I'm not sure then. I assume you've verified your proxy host and port in your code. Other than that, I'm not sure what to suggest. :(
-
Exagon over 8 years
I am using different proxies in different threads, so this won't work.
-
Exagon over 8 years
I am getting an HTTP response code: 411, a read timeout or a connect timed out ... any ideas?
-
Marco Altieri over 8 years
@Exagon I have updated the code because last time I used code that I wrote for an old version, using classes that have all since been deprecated. I retested the code using fiddler2 as a proxy. It worked fine. If you get a timeout, it is probably a "networking" issue.
-
Marco Altieri over 8 years
By the way, the example is just a "copy and paste" of: hc.apache.org/httpcomponents-client-ga/httpclient/examples/org/…
-
Exagon over 8 years
Sorry, but I am getting an error at request.setConfig(config); "The method setConfig(RequestConfig) is undefined for the type HttpGet"
-
Exagon over 8 years
Could you show how to do this with all the properties, with some code?
-
Marco Altieri over 8 years
@exagon What version of the library are you using? If you do not want to use Maven, you can download the version that I used from: central.maven.org/maven2/org/apache/httpcomponents/httpclient/…
-
Exagon over 8 years
The newest, 4.5.1. My IDE is Eclipse Mars and I am using Java 8.65.
-
Marco Altieri over 8 years
Mmm, I see... I am not on JDK 8. Let me check.
-
Marco Altieri over 8 years
It worked for me on JDK 8. Is your error at runtime or compile time?
-
Exagon over 8 years
It's a compile-time error ... I don't know why ... it also appears when I create a new project and just add the library and this class.
-
Marco Altieri over 8 years
HttpGet has had the method setConfig since the beginning. As I said, the example is from the Apache HttpClient site, so it has to work.
-
Eduardo Poço over 8 years
Edited in the answer. This implementation is from the time when I didn't know about HttpURLConnection, so I used sockets. I did the edit in a hurry; I think you can figure out the equivalent operations on an HttpURLConnection. If you need, I'll edit it again to fit an HttpURLConnection.