Java Reading Undecoded URL from Servlet

Solution 1

There is a fundamental difference between '%2F' and '/', both for the browser and the server.

The HttpServletRequest specification says (without any logic, AFAICT):

  • getContextPath: not decoded
  • getPathInfo: decoded
  • getPathTranslated: not decoded
  • getQueryString: not decoded
  • getRequestURI: not decoded
  • getServletPath: decoded

The result of getPathInfo() should be decoded, but the result of getRequestURI() must not be decoded. If it is, your Servlet container is breaking the spec (as Wouter Coekaerts and Francois Gravel correctly pointed out). Which Tomcat version are you running?
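
As a rough sketch of why the undecoded value matters: assuming the container really does return the raw URI from getRequestURI(), you could split it on literal slashes first and decode each segment afterwards, so an encoded %2F stays inside its segment. The class and method names here (RawPathSegments, decodeSegments) are made up for illustration, the input string stands in for whatever getRequestURI() returns, and URLDecoder.decode(String, Charset) needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RawPathSegments {

    /**
     * Splits an undecoded path on literal '/' characters and decodes each
     * segment separately. Hypothetical helper, not part of the servlet API.
     */
    public static List<String> decodeSegments(String rawPath) {
        List<String> segments = new ArrayList<>();
        for (String raw : rawPath.split("/")) {
            if (raw.isEmpty()) {
                continue; // skip the empty piece in front of the leading '/'
            }
            // URLDecoder implements form decoding and would turn a literal '+'
            // into a space, so protect '+' before decoding a path segment.
            String safe = raw.replace("+", "%2B");
            segments.add(URLDecoder.decode(safe, StandardCharsets.UTF_8));
        }
        return segments;
    }

    public static void main(String[] args) {
        // Stand-in for the value of request.getRequestURI() (assumed undecoded).
        String rawUri = "/servletPath/someOtherPath/%3D%26%3F%2F%3B%23%2B%25/something.html";
        System.out.println(decodeSegments(rawUri));
    }
}
```

With the URL from the question, the third segment comes back as the single string =&?/;#+% instead of being cut in two at the slash.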

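Making matters even more confusing, current Tomcat versions reject paths that contain encodings of certain special characters, for security reasons. In Tomcat 6, for example, %2F and %5C in the path are rejected unless the system property org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH is set to true.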

Solution 2

If there's a %2F in the decoded URL, it means the encoded URL contained %252F.

Since %2F decodes to /, why not just split on "/" and not worry about URL encoding?
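
To illustrate the double-encoding point, here is a small sketch using the JDK's URLDecoder (Java 10 or later for the Charset overload). The class name DoubleEncodingDemo and the sample strings are just for illustration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingDemo {

    /** Applies exactly one round of percent-decoding. */
    public static String decodeOnce(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // %25 is an encoded '%', so one round of decoding leaves a literal %2F behind.
        System.out.println(decodeOnce("a%252Fb")); // prints a%2Fb
        // A singly encoded slash becomes a real '/' after one round.
        System.out.println(decodeOnce("a%2Fb"));   // prints a/b
    }
}
```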

Solution 3

According to the Javadoc, getRequestURI should not decode the string. On the other hand, getServletPath returns a decoded string. I tested this locally using Jetty and it behaves as described in the doc.

So there might be something else at play in your situation since the behavior you're describing doesn't match the Sun documentation.

Author: Slartibartfast

Updated on June 04, 2022

Comments

  • Slartibartfast, almost 2 years

    Let's presume I want a string like '=&?/;#+%' to be part of my URL, say like this:

    example.com/servletPath/someOtherPath/myString/something.html?a=b&c=d#asdf
    

    where myString is the above string. I've encoded the critical part, so the URL looks like

    example.com/servletPath/someOtherPath/%3D%26%3F%2F%3B%23%2B%25/something.html?a=b&c=d#asdf
    

    So far so good.

    When I'm in the servlet and read any of request.getRequestURI(), request.getRequestURL() or request.getPathInfo(), the returned value is already decoded, so I get a string like

    someOtherPath/=&?/;#+%/something.html?a=b&c=d#asdf
    

    and I can't differentiate between real special characters and encoded ones.

    I've solved this particular problem by banning the above characters altogether, which works in this situation, but I still wonder whether there is any way to get the undecoded URL in a servlet class.

    YET ANOTHER EDIT: When I hit this problem last evening I was too tired to notice what was really going on, which is even more bizarre! I have a servlet mapped on, say, /servletPath/*. After that prefix I can put whatever I want and my servlet responds depending on the rest of the path, except when there is a %2F in the path. In that case the request never hits the servlet, and I get a 404! If I put '/' instead of %2F, it works OK. I'm running Tomcat 6.0.14 on Java 1.6.0-04 on Linux.