What is the correct URL format to send to an HTTP proxy server?

Send absolute URLs only over plain HTTP. With HTTPS they are useless because the proxy never sees them: it sees only the CONNECT request, and everything after that is encrypted.

It is not the proxy that responds with the invalid URL; it is the server itself. The proxy cannot see the request or the response, because both are encrypted.
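The rule above can be sketched as a small helper. This is only an illustration: `request_target` is a hypothetical name, not a function from wget, curl, or any library, and it assumes the proxy is used for every request when `via_proxy` is true.

```python
from urllib.parse import urlsplit, urlunsplit


def request_target(url: str, via_proxy: bool) -> str:
    """Pick what goes after the method on the request line (hypothetical helper)."""
    parts = urlsplit(url)
    if via_proxy and parts.scheme == "http":
        # Plain HTTP through a proxy: the proxy reads the request line,
        # so it needs the absolute URL (without any fragment).
        return urlunsplit(parts._replace(fragment=""))
    # Direct connection, or HTTPS tunneled through CONNECT: the origin
    # server reads the request line, so send only the path and query.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


print(request_target("http://myserver/url", via_proxy=True))   # http://myserver/url
print(request_target("https://myserver/url", via_proxy=True))  # /url
```

With this split, the HTTPS case through a proxy produces the same request line as a direct connection, which matches the behaviour of the other tools described in the question.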

Author: Graeme Perrow

I am a software developer working for SAP in Waterloo, Ontario. I am a member of the SAP HANA Cockpit engineering team. For twenty years, I was a member of the SAP SQL Anywhere engineering team. I write primarily in Javascript, C, C++, python, and perl. In my spare time, I enjoy sports, primarily lacrosse, baseball, and hockey.

Updated on June 07, 2022

Comments

  • Graeme Perrow almost 2 years

    When my application requests a particular URL from a server (over https), it gets a 301 Moved Permanently redirection. However, the Location header is malformed. I see something like this:

    > GET https://myserver/url HTTP/1.1
    < 301 Moved Permanently
    < Location: https://redirectedserverhttp://myserver/url
    

    If I send the request without the host, I get a correctly-formed URL:

    > GET /url HTTP/1.1
    < 301 Moved Permanently
    < Location: https://redirectedserver/url
    

    I am going through a proxy server, and according to RFC 2068 section 5.1.2, "The absoluteURI form is required when the request is being made to a proxy", so it looks like I'm doing it the right way and the proxy is responding incorrectly. If I try this through a browser, curl, or wget, it works fine. I looked at the wget code, and the logic looks like:

    if( proxy && !https ) {
        use absoluteURI
    } else {
        use relativeURI
    }
    

    Wget even has a comment in its source code:

    /* When using SSL over proxy, CONNECT establishes a direct
       connection to the HTTPS server.  Therefore use the same
       argument as when talking to the server directly. */
    

    Is this an actual standard defined somewhere? If the absolute URI form is supposed to be used, why do the other tools not use it, and why is it failing?
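    For reference, RFC 7230 section 5.3 names these request-target forms: origin-form (path only), absolute-form (full URL, used when the request goes to a proxy), and authority-form (host:port, used only with CONNECT). The sketch below builds the messages a client would write for each case, without opening any sockets. `proxy_messages` is a hypothetical helper for illustration, and it assumes default ports (80 and 443) and a proxy that accepts CONNECT.

```python
def proxy_messages(host: str, path: str, https: bool) -> list[str]:
    """Return the request messages a client sends via a proxy (illustrative only)."""
    if https:
        # Step 1 (readable by the proxy): ask for a tunnel using authority-form.
        connect = f"CONNECT {host}:443 HTTP/1.1\r\nHost: {host}:443\r\n\r\n"
        # Step 2 (inside the TLS tunnel, readable only by the origin server):
        # an ordinary origin-form request, exactly as on a direct connection.
        get = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
        return [connect, get]
    # Plain HTTP: a single request in absolute-form, which the proxy parses
    # to find out where to forward it.
    return [f"GET http://{host}{path} HTTP/1.1\r\nHost: {host}\r\n\r\n"]


for message in proxy_messages("myserver", "/url", https=True):
    print(message.splitlines()[0])
```

    In the HTTPS case, sending `GET https://myserver/url` as the second message hands an absolute URL to the origin server rather than to the proxy, which is consistent with the server building the malformed Location header shown above.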