Strange character encoding issue with Eclipse / Spring / Tomcat 6

22,831

Solution 1

The quest

I had exactly the same problem as you, with a very similar configuration (Tomcat, Spring, Spring Web Flow, JSF2).

A few findings from my own investigation:

  • WAR under Tomcat on Windows: encoding problem,
  • same WAR under Tomcat on Linux: no problem → I suspect the OS default encoding, since Linux uses UTF-8,
  • same WAR under Tomcat run by Eclipse WTP on Windows: no problem → WTF?!
  • switching the properties files to UTF-8 with literal Latin characters instead of Unicode escapes: solves the problem for externalized labels,
  • doing the same in Facelets (JSF2 pages): the problem remains, the only thing that works is <f:verbatim>&amp;eacute;</f:verbatim>.

I was still getting the problem after checking all my code against the classic prerequisites and recommendations found on forums:

  • <?xml version="1.0" encoding="UTF-8" ?> at top of XML files,
  • <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> inside HTML header of same files,
  • encoding="UTF-8" in <f:view>.

Configuring Tomcat in the following ways changed nothing:

  • URIEncoding="UTF-8" on the connector in server.xml (expected, since it concerns URI encoding, not page encoding),
  • org.springframework.web.filter.CharacterEncodingFilter, both on and off,
  • and also this (I am probably missing the point here):

    <locale-encoding-mapping-list>
      <locale-encoding-mapping>
        <locale>fr</locale>
        <encoding>UTF-8</encoding>
      </locale-encoding-mapping>
    </locale-encoding-mapping-list>
    

The key

I found the solution by comparing the Tomcat command line used by WTP with the one used by a plain Tomcat launch from the Windows command prompt. The only difference was the parameter -Dfile.encoding=UTF-8. That was the key to solving the problem for me.

Set JAVA_OPTS=-Dfile.encoding="UTF-8" and it works fine.

The (attempted) explanation

The only explanation I found is that Tomcat uses the JVM encoding, which by default is the system encoding (UTF-8 on Linux, CP1252 on Windows). Eclipse WTP forces the JVM encoding to match its workspace encoding settings. Running the JVM in UTF-8 fixes the problem.
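To make that explanation concrete, here is a minimal sketch of what goes wrong when UTF-8 bytes are decoded with a single-byte Western charset. The class and method names are hypothetical, and ISO-8859-1 stands in for CP1252 (the two agree on the bytes used here). It also prints the JVM settings that -Dfile.encoding influences, so it can be run on both machines to compare.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Encodes the text as UTF-8, then decodes the bytes with another charset,
    // which is what happens when a reader relies on the wrong default.
    static String decodeAs(String text, Charset readerCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, readerCharset);
    }

    public static void main(String[] args) {
        // What the JVM was started with (changes with -Dfile.encoding).
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String eAcute = "\u00e9"; // e with acute accent, written as an escape
        // One character becomes two when the charsets do not match.
        System.out.println("wrong charset: " + decodeAs(eAcute, StandardCharsets.ISO_8859_1).length() + " chars");
        System.out.println("right charset: " + decodeAs(eAcute, StandardCharsets.UTF_8).length() + " chars");
    }
}
```

The two-character result is the familiar "Ã©" seen in place of "é" in a browser.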

I suspect this is not really the right fix and that there is a configuration problem either in my stack or in the resource filtering done by maven-resources-plugin or maven-war-plugin, but I haven't found it yet.

Solution 2

You need to configure Eclipse to save the files as UTF-8.

Go to Window > Preferences, type "encoding" in the filter box at the top, and go through every section to set everything to UTF-8. For JSP files specifically, this is under Web > JSP Files > Encoding. Choose the topmost UTF-8 option (called "ISO 10646/Unicode(UTF-8)").

Properties files are a separate story. As per the specification, they are read as ISO-8859-1 by default. You need either the native2ascii tool or a custom properties file loader that uses UTF-8. For more detail, see this article.
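Both options for properties files can be sketched in plain Java. This is an illustration only, not the loader from the linked article: the class and method names are hypothetical, and escapeNonAscii only approximates what native2ascii produces. The java.util.Properties calls themselves are standard.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Option 1: rewrite each non-ASCII character as a backslash-u escape,
    // so the default ISO-8859-1 loader can read the file unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Option 2: read the stream through a UTF-8 Reader instead of
    // Properties.load(InputStream), which assumes ISO-8859-1.
    static Properties loadUtf8(InputStream in) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // The standard loader, kept for comparison.
    static Properties loadDefault(InputStream in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String line = "greeting=caf\u00e9\n"; // value ends with e-acute

        // A UTF-8 file read with the UTF-8 loader.
        byte[] utf8File = line.getBytes(StandardCharsets.UTF_8);
        String viaReader = loadUtf8(new ByteArrayInputStream(utf8File)).getProperty("greeting");

        // An escaped file read with the default loader.
        byte[] escapedFile = escapeNonAscii(line).getBytes(StandardCharsets.ISO_8859_1);
        String viaEscapes = loadDefault(new ByteArrayInputStream(escapedFile)).getProperty("greeting");

        System.out.println("same value both ways: " + viaReader.equals(viaEscapes));
    }
}
```

Reading the UTF-8 bytes with loadDefault instead would give a garbled value, which matches the symptom described for messages.properties.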

Solution 3

I'm using Tomcat 7 with the Spring Framework, and using <jsp:include page="anyFile.html"/> in a JSP fails with a java.lang.IllegalStateException. <jsp:include> works fine when I include another JSP file, but when I try to include a static HTML file it keeps throwing this exception, which is related to character encoding.

Using <jsp:directive.include file="anyFile.html" /> or <%@include file="anyFile.html"%> works, but all the special characters ("é", "è", "ç", etc.) appear encoded as ISO-8859-1 instead of UTF-8, even if the JSP file has <%@page contentType="text/html" pageEncoding="UTF-8"%> and <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> in it.

I found the solution by using the JSTL tag library with the import tag:

  1. put this into the JSP: <%@taglib uri="http://java.sun.com/jsp/jstl/core" prefix="c"%>

  2. Then get the HTML file I want to include using this: <c:import url="anyFile.html" charEncoding="UTF-8"/>

As you can see, the import tag from the JSTL library has a charEncoding attribute that sets the HTML file to the appropriate character encoding, so its content is displayed correctly.

Author: Czar

Updated on April 14, 2020

Comments

  • Czar
    Czar about 4 years

    I have been trying things all day but can't find a proper solution. My problem is: I am developing a Spring MVC based app on my local Tomcat. My MySQL database has UTF-8 encoding set, and all content in there displays properly when using phpMyAdmin. The output in log files written by log4j to catalina.out also works fine.

    My JSP pages are configured by

    <!-- encoding -->
    <%@ page contentType="text/html; charset=UTF-8" %>
    <%@ page pageEncoding="UTF-8" %>
    

    Showing data on my JSP also works fine. I can also send data containing special characters from my controller without any DB interference, e.g.

    String str = "UTF-8 Test: Ä Ö Ü ß è é â";
    logger.debug(str);
    mav.addObject("utftest", str);
    

    That displays correctly in the log and on the JSP page in the browser.

    BUT: When I have special characters directly in my JSP file, e.g. for text in headers, this does not work. Firefox and Google Chrome display strange characters but report the page as UTF-8. When I switch the browser encoding to Latin, the characters only get stranger.

    The same problem occurs when showing text tokens from my messages.properties file, although Eclipse says, when I right-click the file, that UTF-8 will be used.

    I am a little lost and don't know where to check now.

    Summary:

    • DB storage is fine
    • DB output on JSP is fine
    • Output on JSP directly from the controller is fine
    • even reading input from forms is fine
    • .properties files and JSP text are not fine !!!

    Any ideas? I would really appreciate any tips.