Java Tomcat UTF-8 encoding issue

Solution 1

If you need to use UTF-8 encoding (and really, everybody should be doing this these days), then you can follow the "UTF-8 everywhere HOWTO" found in the Tomcat FAQ:

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q8

Remember that you also need to support UTF-8 in your database's text fields.

Also remember that "printing" a String with non-ASCII characters in it to a log file or the console can be affected by:

  1. The character encoding of the output stream
  2. The character encoding of the file reader (e.g. cat/less/vi)
  3. The character encoding of the terminal

You might be better off writing the values to a file and then using a hex editor to examine the contents to be sure that you are getting the byte values you are looking for.
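If a hex editor is not at hand, the same check can be sketched in Java by dumping the bytes a String encodes to. The class and method names below (HexDump, toHex) are hypothetical and not part of the original answer; the sample character is written as a Unicode escape so the encoding of the source file itself cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration: shows the raw bytes a String encodes to.
class HexDump {

    // Encodes the text with the given charset and returns the bytes as space-separated hex.
    static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "\u4e2d"; // the Chinese character U+4E2D
        System.out.println(toHex(sample, StandardCharsets.UTF_8));      // e4 b8 ad
        // ISO-8859-1 cannot represent the character, so it is replaced by '?' (0x3f).
        System.out.println(toHex(sample, StandardCharsets.ISO_8859_1)); // 3f
    }
}
```

If the bytes match the UTF-8 form, the String is fine and the garbling happens later, in one of the three places listed above.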

Solution 2

Here is a short tutorial on what you need to do to make UTF-8 work in your web application:

You have to implement a Filter for character encoding in your application and map it to your requests (in web.xml or with @WebFilter):

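import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Uses the javax.servlet API; on Tomcat 10 and later the package is jakarta.servlet.
public class CharacterEncodingFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig)
            throws ServletException {
        // No initialization needed.
    }

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain)
            throws IOException, ServletException {
        // Must be set before any request parameter is read, or it has no effect.
        servletRequest.setCharacterEncoding("UTF-8");
        // Declares the response as UTF-8 HTML before the rest of the chain writes to it.
        servletResponse.setContentType("text/html; charset=UTF-8");
        filterChain.doFilter(servletRequest, servletResponse);
    }

    @Override
    public void destroy() {
        // Nothing to clean up.
    }
}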
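What the filter guards against can be sketched without a servlet container. The example below assumes the container falls back to ISO-8859-1 when the request declares no charset; MojibakeDemo and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: what happens when UTF-8 request bytes are decoded as ISO-8859-1.
class MojibakeDemo {

    // Simulates a container that decodes UTF-8 bytes with the wrong charset.
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake. This is lossless because ISO-8859-1 maps every byte to exactly one char.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sample = "\u4e2d"; // one Chinese character, three bytes in UTF-8
        String garbled = misdecode(sample);
        System.out.println(garbled.length());               // 3
        System.out.println(repair(garbled).equals(sample)); // true
    }
}
```

One character turning into three unrelated ones is the typical "random characters" symptom. Setting the request encoding up front is cleaner than repairing Strings afterwards.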

Make sure that the Connector element in Tomcat's server.xml has the URIEncoding attribute set to UTF-8.

<Connector port="8080" 
           protocol="HTTP/1.1"
           connectionTimeout="20000"
           URIEncoding="UTF-8"
           redirectPort="8443"/>
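URIEncoding controls how percent-encoded bytes in the URL and query string are turned into characters. As a rough sketch of the difference, the standard java.net.URLDecoder can decode the same escaped value with two charsets; UriDecodingDemo is a hypothetical name, and this only imitates the decoding step, it does not call into Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo: decoding the same percent-encoded value with two charsets.
class UriDecodingDemo {

    // Wraps the checked exception so callers do not need a throws clause.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String value = "%E4%B8%AD"; // UTF-8 bytes of U+4E2D, percent-encoded
        System.out.println(decode(value, "UTF-8").equals("\u4e2d")); // true
        System.out.println(decode(value, "ISO-8859-1").length());    // 3
    }
}
```

With the wrong charset, the single character comes back as three, which is what a GET parameter looks like when the connector's encoding does not match the browser's.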

You also need to specify this in every JSP page:

<%@page contentType="text/html" pageEncoding="UTF-8"%>
Author: Evan Chu

Updated on July 09, 2022

Comments

  • Evan Chu, almost 2 years

    I am developing a simple web application using Java/JSP/Tomcat/MySQL, and most of my problems lie in character encoding, because I need to use UTF-8 instead of the default ISO-8859-1.

    First, I'd like to describe my program structure. I am using a servlet called Controller.java to handle all requests. So in web.xml, I have a Controller servlet which handles all requests matching *.do.

    This Controller then dispatches the request based on the requested URL. For example, if the client asks for register.do, Controller dispatches the request to Register.java.

    In Register.java, there is a method which takes the request as a parameter, namely:

    public String perform(HttpServletRequest request) {
        // do something with the request...
    }
    

    The problem is that if I print something in UTF-8 inside this method, it comes out as random characters. For example, I have an Enum which stores several constants, and one of its properties is its name in Traditional Chinese. If I print it in main:

    public static void main(String[] args) {
        System.out.println(MyEnum.One.getChn());
        logger.info(MyEnum.One.getChn());
    }
    

    This is printed correctly in Chinese. However, if I put the exact same code inside the method dealing with HttpServletRequest:

    public String perform(HttpServletRequest request) {
        System.out.println(MyEnum.One.getChn());
        logger.info(MyEnum.One.getChn());
    }
    

    They are printed as random characters, but I can see from the debug window (Eclipse) that the variables are holding the correct Chinese characters.

    The same thing happens when I want to store the value from request.getParameter(). In the debug window, I can see the variable is holding the correct characters, but once I print it out or try to store it in the database, it turns into random characters.

    I don't know why it behaves like this, and it is blocking me from reading submitted form values and storing them in the database. Could someone give some hints on this?

    Great thanks.