Disabling URL decoding in nginx proxy
Solution 1
Quoting Valentin V. Bartenev (who should get the full credit for this answer):
A quote from the documentation:
If proxy_pass is specified with URI, when passing a request to the server, part of a normalized request URI matching the location is replaced by a URI specified in the directive.
If proxy_pass is specified without URI, a request URI is passed to the server in the same form as sent by a client when processing an original request.
The correct configuration in your case would be:
location /foo { proxy_pass http://localhost:8080; }
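To make the difference concrete, here is a minimal sketch of the two forms side by side, using the /foo location and the localhost:8080 backend from the question. The two blocks are alternatives and would not both go in the same server block. The comments restate the behaviour reported in the question and described in the quoted documentation; they are not the output of a fresh test.

```nginx
# Variant A: proxy_pass WITH a URI part (/foo).
# nginx replaces the matched part of the normalized (decoded) URI,
# so a request for /foo/%5B-%5D reaches the backend as /foo/[-].
location /foo {
    proxy_pass http://localhost:8080/foo;
}

# Variant B: proxy_pass WITHOUT a URI part (no path, not even a trailing slash).
# The request URI is passed on in the form the client sent it,
# so /foo/%5B-%5D reaches the backend as /foo/%5B-%5D.
location /foo {
    proxy_pass http://localhost:8080;
}
```

Variant B only works as-is when the backend path is the same as the public path, which is the situation in the question.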
Solution 2
Note that URL decoding, commonly known as $uri "normalisation" within the documentation of nginx, happens before the backend IFF:
either any URI is specified within proxy_pass itself, even if just the trailing slash all by itself,
or the URI is changed during the processing, e.g., through rewrite.
Both conditions are explicitly documented at http://nginx.org/r/proxy_pass (emphasis mine):
If the proxy_pass directive is specified with a URI, then when a request is passed to the server, the part of a normalized request URI matching the location is replaced by a URI specified in the directive.
If proxy_pass is specified without a URI, the request URI is passed to the server in the same form as sent by a client when the original request is processed, or the full normalized request URI is passed when processing the changed URI.
The solution is to either omit the URI, as in the OP's case, or, indeed, use a clever rewrite rule:
# map `/foo` to `/foo`:
location /foo {
    proxy_pass http://localhost:8080; # no URI -- not even just a slash
}

# map `/foo` to `/bar`:
location /foo {
    rewrite ^ $request_uri; # get original URI
    rewrite ^/foo(/.*) /bar$1 break; # drop /foo, put /bar
    return 400; # if the second rewrite won't match
    proxy_pass http://localhost:8080$uri;
}
You can see it live in a related Stack Overflow answer, including control group.
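The second condition (the URI being changed during processing) is the easier one to trip over, so here is a minimal sketch of it. The /old/ and /new/ prefixes are hypothetical and chosen only for illustration; the backend address is the one from the question.

```nginx
# Hypothetical prefixes /old/ and /new/, for illustration only.
location /old/ {
    # The rewrite changes the URI during processing ...
    rewrite ^/old/(.*)$ /new/$1 break;

    # ... so even though proxy_pass has no URI part here, the documentation
    # quoted above says the full normalized request URI is what gets passed
    # to the backend, not the form originally sent by the client.
    proxy_pass http://localhost:8080;
}
```

This is why the `/foo` to `/bar` mapping above first resets the URI to $request_uri and then passes $uri explicitly: it sidesteps the normalized form on purpose.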
Tomasz Nurkiewicz
Updated on September 18, 2022
Comments
-
Tomasz Nurkiewicz over 1 year
When I browse to this URL:
http://localhost:8080/foo/%5B-%5D
the server (nc -l 8080) receives it as-is:
GET /foo/%5B-%5D HTTP/1.1
However when I proxy this application via nginx (1.1.19):
location /foo { proxy_pass http://localhost:8080/foo; }
The same request routed through nginx port is forwarded with path decoded:
GET /foo/[-] HTTP/1.1
The decoded square brackets in the GET path cause errors in the target server (HTTP Status 400 - Illegal character in path...) because they arrive unescaped.
Is there a way to disable URL decoding or encode it back so that the target server gets the exact same path when routed through nginx? Some clever URL rewrite rule?
-
Tomasz Nurkiewicz over 11 years
Reported bug to nginx: trac.nginx.org/nginx/ticket/262
-
herrtim over 10 years
I had to change http://localhost:8080/ to http://localhost:8080 in case anyone has the same situation as I did.
-
platypus over 10 years
Why does Nginx decode the URI before passing it to the backend server? Wouldn't it make more sense if it kept the URI untouched?
-
Congmin about 6 years
@platypus, it is kept untouched until you explicitly start performing the substitutions.
-
Michael Hampton about 6 years
The documentation is confusing here. Both forms contain a URI. It is the path component that is present in one and missing in the other.
-
Congmin about 6 years
@MichaelHampton, I disagree: the PATH is generally called the URI, so the one without the path doesn't contain the URI.
-
Michael Hampton about 6 years
A relative path alone can also be a valid URL, of course. The point is, the remainder is also a valid URI (e.g. http://localhost:8080). If you disagree, you can take it up with the authors of RFC 3986.
-
Norman Xu over 5 years
@MichaelHampton Unfortunately, it seems that a scheme and a path are mandatory for a URI, while the authority, arguments, and fragment are optional.
-
Marc almost 4 years
Is it just me or is the standard behaviour wacky? We don't want URLs changed just because we happen to rewrite to a path instead of to the root!!
-
Congmin almost 4 years
@Marc just you. The standard behaviour is to preemptively address many security pitfalls, and ensure you can't blame your security issues on nginx. P.S. Did you notice the return 400 in this answer? I bet most folks don't bother to understand what it's for, or deem it unnecessary, even though it's pretty essential for security.
-
Marc almost 4 years
If I pass /foo%20bar to NGINX and it literally passes /yo/foo bar (an invalid URL containing a space) downstream, which then fails, then the behaviour is wrong/buggy. See trac.nginx.org/nginx/ticket/1930
-
Congmin almost 4 years
@Marc no, you're incorrect, and your comment is very misleading. nginx will never pass a space upstream if you use the correct configuration, as has been pointed out in the trac issue you link to. You have an incorrect use of regular expression captures that's causing your problem, and your configuration sample is not best practice even if it had worked as you expected. I agree 100% with the nginx devs in that trac issue that the defect report is invalid.
-
Marc almost 4 years
OK, I will look at the suggestions there. I still think there should be a way to get escaped URL components like foo%20bar - NGINX seems to think we only need unescaped values like foo bar.
-
Congmin almost 4 years
@Marc again, your statement is incorrect. The devs have pointed out where your problem was and what the correct solution and best practice should be, and you never explained why the proposed solution wouldn't work for your use case. So, frankly, I don't even understand what you're trying to argue here anymore, because your proposed solution (which would let you keep a configuration that you've already been told is suboptimal in the first place) would break other use cases.
-
Marc almost 4 years
That's an arrogant statement. I have taken their feedback onboard. Perhaps you can explain why you think it makes sense to decode URL elements into $1? I certainly have illustrated cases where it is a problem and BREAKS HTTP and don't see anyone offering examples where this is a good idea. Why would NGINX decode URIs into variables?? How can we re-encode them??
-
Congmin almost 4 years
Marc, your configuration is just wrong. It has been explained in Trac, as well as here, several times. The proper solution has been explained as well; again, you never once indicated why the proper solution that has been suggested wouldn't work for you. If you don't want to follow the proper solution, that doesn't mean that it's nginx that's broken. Please stop posting misleading statements about nginx. What do you think happens when nginx receives a request for GET /../../../../../../etc/passwd? Which non-regex location would catch it? What about GET /this%20is%20a%20test.txt?