request.getQueryString() seems to need some encoding

75,314

Solution 1

I've run into this same problem before. Not sure what Java servlet container you're using, but at least in Tomcat 5.x (not sure about 6.x) the request.setCharacterEncoding() method doesn't really have an effect on GET parameters. By the time your servlet runs, GET parameters have already been decoded by Tomcat, so setCharacterEncoding won't do anything.

Two ways to get around this:

  1. Change the URIEncoding setting for your connector to UTF-8. See http://tomcat.apache.org/tomcat-5.5-doc/config/http.html.

  2. As BalusC suggests, decode the query string yourself and parse it manually into a parameter map (as opposed to using the ServletRequest APIs).
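
For the second option, here is a minimal sketch of what manual parsing could look like. The class name QueryStringParser and its parse method are made up for illustration and are not part of the Servlet API. It assumes the client percent-encoded the query as UTF-8, and it splits on '&' and '=' before decoding so that an encoded '&' inside a value is not mistaken for a separator.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any servlet container.
public class QueryStringParser {

    // Decodes one percent-encoded component, assuming UTF-8.
    private static String decode(String component) {
        try {
            return URLDecoder.decode(component, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Splits the raw query string on '&' and '=' first, then decodes each
    // piece, so an encoded '&' (%26) or '=' (%3D) inside a value is preserved.
    public static Map<String, List<String>> parse(String rawQuery) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params; // getQueryString() returns null when there is no query
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq >= 0 ? pair.substring(0, eq) : pair);
            String value = eq >= 0 ? decode(pair.substring(eq + 1)) : "";
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = parse("param=cos%C3%AC&q=a%26b");
        System.out.println(params.get("param").get(0).equals("cos\u00EC")); // true
        System.out.println(params.get("q")); // [a&b]
    }
}
```

Inside a servlet you would presumably call it as QueryStringParser.parse(request.getQueryString()). Note that URLDecoder also turns '+' into a space, which matches form encoding but may not be what you want for every URL.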

Hope this helps!

Solution 2

From the HttpServletRequest#getQueryString() javadoc:

Returns: a String containing the query string or null if the URL contains no query string. The value is not decoded by the container.

Note the last statement. So you need to URL-decode it yourself using java.net.URLDecoder.

String queryString = URLDecoder.decode(request.getQueryString(), "UTF-8");

However, the normal way to gather parameters is just using HttpServletRequest#getParameter().

String param = request.getParameter("param"); // così

In that case the servlet container has already URL-decoded the value for you, provided you have configured it to use the correct encoding. request.setCharacterEncoding() only affects the request body (POST), not the request URI (GET). Also see Mirage's answer.
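
To see why the configured encoding matters, the following sketch decodes the same raw value from the question with two different charsets. It only uses java.net.URLDecoder. The class name DecodeCharsetDemo and its decode wrapper are made up for this example, and the ISO-8859-1 case is meant to mimic what a container does when its URI encoding is left at that charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, for illustration only.
public class DecodeCharsetDemo {

    // Wraps URLDecoder.decode so callers do not deal with the checked exception.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "cos%C3%AC"; // value as it appears in getQueryString()

        // Bytes C3 AC form one UTF-8 character: U+00EC (i with grave accent).
        String utf8 = decode(raw, "UTF-8");
        // In ISO-8859-1 every byte is its own character: U+00C3 then U+00AC.
        String latin1 = decode(raw, "ISO-8859-1");

        System.out.println(utf8.equals("cos\u00EC"));         // true
        System.out.println(latin1.equals("cos\u00C3\u00AC")); // true
        System.out.println(utf8.length() + " vs " + latin1.length()); // 4 vs 5
    }
}
```

The second result is the garbled text you typically get from getParameter() when the container decodes the URI with the wrong charset.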

Solution 3

It really took all day, but:

final String param = new String(request.getParameter("param").getBytes(
                "iso-8859-1"), "UTF-8");

See also here. Note that this is valid only if the server's decoding charset (URIEncoding in Tomcat) is ISO-8859-1; otherwise that charset must be passed to getBytes() instead. For an example of how to get the URIEncoding charset from the server.xml for Tomcat 7, see my quoted answer.
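
A small sketch of why the re-encoding trick works, and why it depends on the server charset. There is no servlet container here: the mis-decoding is simulated by hand, and the class name Latin1Repair and its repair method are made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
public class Latin1Repair {

    // Undoes a wrong ISO-8859-1 decoding of bytes that were really UTF-8.
    public static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String intended = "cos\u00EC";

        // Simulate the server: UTF-8 bytes from the client, decoded as ISO-8859-1.
        byte[] wire = intended.getBytes(StandardCharsets.UTF_8);
        String misdecoded = new String(wire, StandardCharsets.ISO_8859_1);

        System.out.println(misdecoded.equals(intended));         // false
        System.out.println(repair(misdecoded).equals(intended)); // true

        // If the server already decoded as UTF-8, the same trick damages the value.
        System.out.println(repair(intended).equals(intended));   // false
    }
}
```

ISO-8859-1 maps every byte to exactly one character, so the original bytes can be recovered losslessly. That is the reason the trick is safe only when the server really used that charset.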

Solution 4

For POST requests I resolved the problem in the following way.

  1. Set the URIEncoding="UTF-8" attribute on the Connector in server.xml (I use Tomcat 8).
  2. Call request.setCharacterEncoding("UTF-8") before retrieving any parameters.

With that in place, UTF-8 characters are delivered correctly:

e.g.

String name = request.getParameter("name");

name now contains the correct UTF-8 string.

Solution 5

Many factors affect the encoding of HTTP request parameters. You can work through the following steps in order.

1. Check your form's accepted character encoding.

<form id="edit-box" name="edit-box-name" method="post" accept-charset="UTF-8">

2. Check the HTTP server's default character encoding. For Apache HTTP Server, add the line "AddDefaultCharset UTF-8" to httpd.conf.

3. If you have a back-end server, check its character encoding as well. For a Tomcat back end, add the URIEncoding="UTF-8" attribute to your Connector, like this:

<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000"  redirectPort="8443" URIEncoding="UTF-8"/>

...

guide for http request parameter encoding problems

Author by

Gabriele

Updated on July 09, 2022

Comments

  • Gabriele
    Gabriele almost 2 years

    I have a problem with UTF-8. My client (implemented in GWT) makes a request to my servlet with some parameters in the URL, as follows:

    http://localhost:8080/servlet?param=value
    

    When I retrieve the URL in the servlet, I have a problem with UTF-8 characters. I use this code:

    protected void service(HttpServletRequest request, HttpServletResponse response) 
                        throws ServletException, IOException {
    
            request.setCharacterEncoding("UTF-8");
    
            String reqUrl = request.getRequestURL().toString(); 
            String queryString = request.getQueryString();
            System.out.println("Request: "+reqUrl + "?" + queryString);
            ...
    

    So, if I call this URL:

    http://localhost:8080/servlet?param=così
    

    the result is like this:

    Request: http://localhost:8080/servlet?param=cos%C3%AC
    

    What can I do to set up the character encoding properly?