UTF-8 encoding problem with servlet and Apache HttpClient

Solution 1

Have you tried

response.setCharacterEncoding("utf-8");

instead of setting the encoding via setContentType? It shouldn't make a difference according to the documentation, but who knows...

Also, make sure you didn't call response.getWriter() anywhere in your code before setting the character encoding: once the writer has been obtained, setting the encoding no longer has any effect.
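
Why the order matters can be illustrated without a servlet container. The sketch below is only an analogy built from plain JDK classes, not the servlet API itself: a writer's charset is fixed at the moment the writer is created, which is roughly what happens inside getWriter(). The names WriterCharsetDemo and encodeWith are made up for this illustration, and ISO-8859-1 stands in for the servlet default encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WriterCharsetDemo {

    // Hypothetical helper: encodes text through a PrintWriter whose charset
    // is chosen when the writer is created and cannot be changed afterwards.
    static byte[] encodeWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset));
        out.write(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00f1\u00e7"; // the characters ñ and ç

        // Writer created with the default-like charset: one byte per character.
        byte[] latin1 = encodeWith(StandardCharsets.ISO_8859_1, text);
        // Writer created after choosing UTF-8: two bytes per character.
        byte[] utf8 = encodeWith(StandardCharsets.UTF_8, text);

        System.out.println("ISO-8859-1 bytes: " + latin1.length); // 2
        System.out.println("UTF-8 bytes: " + utf8.length);        // 4
    }
}
```

A client that decodes the two-byte ISO-8859-1 output as UTF-8 would not get ñ and ç back, which matches the symptom described in the question.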

Solution 2

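Make sure the bytes written to the stream are UTF-8 encoded. Note that this requires out to be an OutputStream (for example response.getOutputStream()), since PrintWriter has no write(byte[]) method:

out.write(yourstring.getBytes("UTF-8"));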
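
As a minimal sketch of what that call does, the following JDK-only example encodes a string to UTF-8 bytes and decodes it both correctly and with the wrong charset. It uses StandardCharsets.UTF_8 instead of the "UTF-8" string, which avoids the checked UnsupportedEncodingException. The class and method names (Utf8BytesDemo, toUtf8, decode) are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Utf8BytesDemo {

    // Encodes a string as UTF-8 bytes, as in Solution 2.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes bytes with whatever charset the reader assumes.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u00f1\u00e7"; // the characters ñ and ç
        byte[] bytes = toUtf8(original);

        // Decoding with the matching charset restores the string.
        String ok = decode(bytes, StandardCharsets.UTF_8);
        System.out.println(ok.equals(original)); // true

        // Decoding with ISO-8859-1 yields the typical garbled text "Ã±Ã§".
        String garbled = decode(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("\u00c3\u00b1\u00c3\u00a7")); // true
    }
}
```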

Solution 3

StandardCharsets.UTF_8 can be passed to EntityUtils.toString so that the response body is decoded as UTF-8.

Here is a sample snippet:

HttpEntity entity = response.getEntity();
String webpage = EntityUtils.toString(entity, StandardCharsets.UTF_8);

Solution 4

I had a similar problem, which I solved by decoding the response as UTF-8, as follows:

IOUtils.toString(response.getEntity().getContent(), Charsets.UTF_8)

Import (Charsets here is the Guava class):

import com.google.common.base.Charsets;
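
If neither of those libraries is on the classpath, roughly the same thing can be done with the JDK alone. This is a sketch assuming Java 9 or later (for InputStream.readAllBytes); StreamToStringDemo and readUtf8 are hypothetical names, and a ByteArrayInputStream stands in for response.getEntity().getContent().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamToStringDemo {

    // Reads the whole stream and decodes it explicitly as UTF-8,
    // instead of relying on the platform default charset.
    static String readUtf8(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00f1\u00e7"; // the characters ñ and ç
        // Stand-in for the HTTP response body stream.
        InputStream body = new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(body).equals(original)); // true
    }
}
```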
Author by

Gabriel Llamas

Updated on June 07, 2022

Comments

  • Gabriel Llamas
    Gabriel Llamas almost 2 years

    I have a servlet that sends a string with UTF-8 encoding. I also have a client written with the Apache HttpComponents library.

    My problem is reading the response as UTF-8. Some special characters like ñ or ç are not read correctly. If I test the server with an HTML page sending a request, the string is correct and the encoding is UTF-8 without BOM.

    Some snippets: Servlet

    response.setContentType ("application/json; charset=UTF-8");
    PrintWriter out = response.getWriter ();
    out.write (string);
    

    Client

    entity = response.getEntity ();
    entity.getContentEncoding (); //returns null (this is the Content-Encoding header, e.g. gzip, not the charset)
    resultado = EntityUtils.toString (entity, HTTP.UTF_8); //Some characters are wrong
    

    Has anyone had the same problem?

    SOLVED: Sorry guys, the client and server were working correctly. I'm writing an Android app, and it seems that logcat (where I print the messages) doesn't support UTF-8 encoding.

  • Thomas
    Thomas over 13 years
    Can you identify whether the problem is on the servlet side or on the client side?
  • Hiro2k
    Hiro2k over 13 years
    Yeah you should check it out with something like Wireshark.
  • Gabriel Llamas
    Gabriel Llamas over 13 years
    I've also tested the connection with Wireshark and I've seen that the special characters are sent incorrectly, but Firefox shows them correctly :S. Tested with setCharacterEncoding and a text/plain content type, but it still fails.
  • Miklos Krivan
    Miklos Krivan over 6 years
    You are right, it made no difference whether or not I added response.setCharacterEncoding("utf-8") after setting the content type with charset information. But when I changed the position of the response.getWriter() call, it immediately worked perfectly. Many thanks.
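
Since the original problem turned out to be the log output rather than the data, one way to tell the two apart is to print the code points of the decoded string as plain ASCII instead of printing the string itself. The following is a minimal sketch assuming Java 8 or later; CodePointDump and describe are made-up names.

```java
final class CodePointDump {

    // Renders each code point as U+XXXX so the result is pure ASCII
    // and cannot be garbled by the console or log viewer.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Correctly decoded ñ and ç.
        System.out.println(describe("\u00f1\u00e7")); // U+00F1 U+00E7
        // What ñ looks like when UTF-8 bytes were decoded as ISO-8859-1.
        System.out.println(describe("\u00c3\u00b1")); // U+00C3 U+00B1
    }
}
```

If the dump shows U+00F1 and U+00E7 for ñ and ç, the string was decoded correctly and only the display is at fault.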