Issue with character encoding in POST requests sent with Firefox

Solution 1

As @Pointy pointed out in the comments, the problem was related to the Content-Type of the POST request: the Firefox request carried "charset=UTF-8" in that header, while the Chrome request had no charset at all. In my head, Data-Type and Content-Type were the same thing, so I didn't realize I had to specify UTF-8 as the character encoding in both. Once I had set both the Content-Type and the Data-Type to an explicit "text/xml; charset=UTF-8", the problem was resolved.
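
For readers who want to see the fix spelled out, here is a minimal sketch of sending XML with an explicit charset in the request header. The helper name `postXml` and the URL are made up for illustration, and the code assumes `req` behaves like an XMLHttpRequest; it is not necessarily how the original script was written.

```javascript
// Minimal sketch. `postXml` is a hypothetical helper, not part of any library.
// `req` is expected to behave like an XMLHttpRequest (open, setRequestHeader, send).
function postXml(req, url, xml) {
  req.open("POST", url, true); // asynchronous POST
  // Name the encoding explicitly instead of relying on a browser default.
  req.setRequestHeader("Content-Type", "text/xml; charset=UTF-8");
  req.send(xml);
  return req;
}

// Browser usage (the URL is a placeholder):
// postXml(new XMLHttpRequest(), "/save",
//         "<request><builder>Hem i Stan tästä</builder></request>");
```

Passing the request object in keeps the helper easy to try outside a browser with a stand-in object.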

Solution 2

I am soooo happy. Thank you guys for posting and figuring this out earlier. It took me a couple hours to get close enough to the problem to find this through googling, but because of your comments, I got this solved in less than a day; and in time for the big presentation tomorrow! :)

It was so bizarre, seeing that all browsers were sending the very same data string in an AJAX request but getting different results, depending on the browser (Firefox being different.)

I tried this, but it didn't work:

req.setRequestHeader("encoding", "utf-8"); // "encoding" is not a standard HTTP header

Then I explicitly set the same Content-Type header that, as you said, Firefox sends, and this one line works in all browsers:

req.setRequestHeader("Content-type", "application/x-www-form-urlencoded;charset=utf-8");

I've tested on Chrome, MSIE, Firefox, Safari, Opera and Opera Next. Works every time!
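
As a rough illustration of what ends up in a form-urlencoded body, the sketch below builds one with `encodeURIComponent`, which percent-encodes the UTF-8 bytes of each character. The helper `encodeForm` and the field name `builder` are assumptions made for the example (the field name is borrowed from the XML in the question), not the code actually used here.

```javascript
// Sketch only: `encodeForm` is a hypothetical helper.
function encodeForm(fields) {
  return Object.keys(fields)
    .map(function (name) {
      // encodeURIComponent percent-encodes the UTF-8 bytes of each character.
      return encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]);
    })
    .join("&");
}

var body = encodeForm({ builder: "Hem i Stan t\u00e4st\u00e4" }); // "...tästä"
console.log(body); // builder=Hem%20i%20Stan%20t%C3%A4st%C3%A4

// Browser usage, with `req` an XMLHttpRequest as in the lines above
// (the URL is a placeholder):
// req.open("POST", "/save", true);
// req.setRequestHeader("Content-type", "application/x-www-form-urlencoded;charset=utf-8");
// req.send(body);
```

Each "ä" becomes `%C3%A4`, its two UTF-8 bytes, so the charset stated in the header matches what is actually in the body.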

Author: Andrei Oniga

Updated on June 05, 2022

Comments

  • Andrei Oniga, almost 2 years ago

    Recently I've come across some very strange behavior related to character encoding for AJAX calls made using the POST method. To make a long story short, I have an HTML form with text fields that can accept diacritics (e.g. "ä"). When the form is submitted, the form data is wrapped in an XML block and sent to a server, which stores that information in a MySQL database. Subsequently, that information is retrieved from the database and displayed to regular users, as is.

    If the request is sent from Chrome or IE, everything is fine: the data, including the diacritics, is sent, stored, then retrieved and displayed correctly. However, when I use Firefox, the form data appears to be submitted correctly in the XML, but when I reload the web page, the previously sent diacritics don't appear. In other words, they seem to get lost somewhere along the way. For example, if the XML contains the word "tästä", when I load the page I see "tst".

    Why is this happening? Is Firefox encoding the POST messages differently from IE and Chrome?

    By the way, I'm not encoding the data before sending it to the server; I simply retrieve the values of the form fields as they are.

    In case it helps, here are the request and response headers from Chrome and Firefox for exactly the same form content (one example only):

    CHROME:

    The XML data block:

    <request>
    <session>{hidden by me}</session>
    <builder>Hem i Stan tästä</builder>
    </request>
    

    The request headers:

    Accept:*/*
    Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
    Accept-Encoding:gzip,deflate,sdch
    Accept-Language:en-US,en;q=0.8
    Connection:keep-alive
    Content-Length:562
    Content-Type:application/x-www-form-urlencoded
    Cookie:PHPSESSID=rlne2d787j0np52ec5rtn04dm1
    Host:83.150.87.220
    Origin:http://hidden.by.me
    Referer:http://http://hidden.by.me/?c=2094211
    User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1
    X-Requested-With:XMLHttpRequest
    

    The response headers:

    Connection:Keep-Alive
    Content-Encoding:gzip
    Content-Type:application/xml
    Date:Mon, 17 Sep 2012 16:21:58 GMT
    Keep-Alive:timeout=5, max=100
    Server:Apache/2.2.11 (Win32) PHP/5.2.9-1
    Transfer-Encoding:chunked
    Vary:Accept-Encoding
    

    FIREFOX:

    The XML data block:

    <request>
    <session>{hidden by me}</session>
    <builder>Hem i Stan tästä</builder>
    </request>
    

    The request headers:

    Accept  */*
    Accept-Encoding gzip, deflate
    Accept-Language en-us,en;q=0.5
    Connection  keep-alive
    Content-Length  562
    Content-Type    application/x-www-form-urlencoded; charset=UTF-8
    Cookie  PHPSESSID=kvfg4fp2trorllim19dmn241c7
    Host    hidden.by.me
    Referer http://hidden.by.me/?c=2094211
    User-Agent  Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1
    X-Requested-With    XMLHttpRequest
    

    The response headers:

    Connection  Keep-Alive
    Content-Encoding    gzip
    Content-Type    application/xml
    Date    Mon, 17 Sep 2012 16:21:23 GMT
    Keep-Alive  timeout=5, max=100
    Server  Apache/2.2.11 (Win32) PHP/5.2.9-1
    Transfer-Encoding   chunked
    Vary    Accept-Encoding
    
    • Pointy, over 11 years ago
      Notice that the Content-type headers are different: Firefox is sending UTF-8 to your server.
    • Andrei Oniga, over 11 years ago
      But in both cases, the character encoding is UTF-8. Isn't that just an issue of information layout in Firebug as opposed to the Chrome Inspector?
    • Pointy, over 11 years ago
      I meant the Content-Type in the request header. In Firefox, according to what you posted, it's "application/x-www-form-urlencoded; charset=UTF-8", but that "charset" clause is missing from the Chrome information. Whether it's actually being posted in UTF-8 I can't say; your server should be able to tell. The problem has to be something like that, in any case.
    • Andrei Oniga, over 11 years ago
      I understood what you meant, and it's exactly what I was saying, that I believe that to be a matter of data layout in the 2 applications (Firebug/Chrome's code inspector). In other words, the HTTP request parameters, including the char encoding, are set within the JS script. But Firebug displays it next to the content type. Anyway, I'm not sure about it.
    • Pointy, over 11 years ago
      Well, either the two browsers are sending different encodings, or they're expecting different encodings in the response. Such problems are a real pain to figure out :/ You should be able to tell at the server if the strings are arriving byte-for-byte identically, I think.
    • Jay, over 11 years ago
      @AndreiOniga: According to the DOM specs (URL-encoded form data section), if the accept-charset attribute isn't specified on the form, UTF-8 should be used when the document character set is not ASCII-compatible (single/variable byte encoding; or non UTF-x). Otherwise, the document character set is used. I can't tell which one is wrong, since the HTML code isn't provided.
    • PhistucK, over 11 years ago
      Can you share the request body, and the response body containing the malformed string shown by Firefox?
    • Andrei Oniga, over 11 years ago
      This issue has been resolved. It was indeed what you pointed out: a Content-Type issue. Data-Type and Content-Type were the same in my head, and I didn't realize I had to specify UTF-8 as the character encoding in both cases. Once I had changed the Content-Type, as well as the Data-Type, to an explicit "text/xml; charset=UTF-8", the problem was resolved. Many thanks for the help!
    • Billybonks, over 11 years ago
      Post your solution as an answer, so that the question is complete and the resolution is easy for others to find.
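
Pointy's suggestion in the comments, checking whether the strings arrive byte-for-byte identical, can be tried out with a small sketch like the one below. It prints the UTF-8 bytes of "tästä" and shows what the same bytes look like when read back with a different charset. The helper names `utf8Hex` and `decodeAs` are made up for illustration, and nothing here claims to reproduce what the server in this thread actually did; it only shows how a charset mismatch can mangle diacritics.

```javascript
// Sketch: inspect the bytes a string produces as UTF-8, and what happens when
// those bytes are read back with another charset. Helper names are hypothetical.
function utf8Hex(text) {
  var bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  return Array.from(bytes, function (b) {
    return b.toString(16).padStart(2, "0");
  }).join(" ");
}

function decodeAs(text, charset) {
  var bytes = new TextEncoder().encode(text);
  return new TextDecoder(charset).decode(bytes);
}

var word = "t\u00e4st\u00e4"; // "tästä"
console.log(utf8Hex(word));            // 74 c3 a4 73 74 c3 a4
console.log(decodeAs(word, "utf-8"));  // tästä
console.log(decodeAs(word, "latin1")); // tÃ¤stÃ¤
```

Logging the same kind of hex dump on the server for each browser's request would show whether the bodies really differ or whether only the declared charset does.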