HTML forms: issues combining charset with enctype in Firefox


Solution 1

charset is not a registered parameter for the multipart/form-data media type. It shouldn't do anything.

According to RFC 2388, the charset of each submitted field should actually be passed by the browser in a Content-Type header on that field's form-data subpart. In practice no browser does this.
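
To make the RFC 2388 idea concrete, here is a minimal sketch using Python's standard email parser on a hand-written multipart body. The boundary `XYZ`, the field names `plain` and `labelled`, and the helper `field_charsets` are all made up for illustration. The first part has no Content-Type header, which is what browsers send in practice; the second carries the per-part charset the RFC describes.

```python
from email import message_from_bytes

# Hand-written body (hypothetical boundary and field names).
# Part "plain" has no Content-Type header, as browsers send it in practice.
# Part "labelled" carries the per-part charset that RFC 2388 describes.
BODY = (
    b"Content-Type: multipart/form-data; boundary=XYZ\r\n"
    b"\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="plain"\r\n'
    b"\r\n"
    b"caf\xc3\xa9\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="labelled"\r\n'
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"caf\xc3\xa9\r\n"
    b"--XYZ--\r\n"
)


def field_charsets(raw: bytes) -> dict:
    """Map each form field name to the charset declared on its part, or None."""
    msg = message_from_bytes(raw)
    result = {}
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        result[name] = part.get_content_charset()
    return result


print(field_charsets(BODY))
```

With no per-part header the server gets no charset hint at all for that field, so it has to assume one.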

accept-charset can't be used because it's broken in IE: instead of choosing the charset for the submission it actually specifies an alternative charset to use, on a per-field basis, when characters do not fit in the primary charset (which is the charset of the current page). This effectively mangles your strings as you cannot find out which charset IE actually used.
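
A rough sketch of why that per-field fallback is unrecoverable on the server: the snippet below simulates two fields that arrived in different charsets and decodes both with a single assumed charset. The field names, the sample strings, and the choice of windows-1252 as the page charset are assumptions for the demonstration, not a capture of what IE sends.

```python
# Simulated submission: one field fit the page charset, the other did not
# and was (hypothetically) sent in the alternative charset instead.
RAW_FIELDS = {
    "city": "Zürich".encode("windows-1252"),
    "name": "Zürich 東京".encode("utf-8"),
}


def decode_fields(raw_fields: dict, assumed_charset: str) -> dict:
    """Decode every field with one charset, as a server has to when it
    cannot tell which charset the browser used for which field."""
    return {
        name: data.decode(assumed_charset, errors="replace")
        for name, data in raw_fields.items()
    }


as_cp1252 = decode_fields(RAW_FIELDS, "windows-1252")
as_utf8 = decode_fields(RAW_FIELDS, "utf-8")

# Whichever single charset is assumed, one of the two fields comes out wrong.
print(as_cp1252)
print(as_utf8)
```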

The only effective way to make browsers submit your forms as UTF-8 is to serve the page containing the form as UTF-8, by setting a Content-Type: text/html;charset=utf-8 header, including the <meta http-equiv> equivalent, or both (a good idea in case the user saves the page to disk, which loses the header information).
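
As an illustration of the "both" option, here is a small sketch built on Python's standard http.server that sends the Content-Type header and also embeds the <meta> tag in the page. The page markup, the field names, the /post action, and the names `build_form_response`, `FormPageHandler` and `serve` are all hypothetical. Note that the form's enctype stays plain multipart/form-data with no charset parameter.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical form page. The <meta> tag repeats the charset so it
# survives if the page is saved to disk and the HTTP header is lost.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Post a message</title>
</head>
<body>
<p>Accents welcome: é ü 東京</p>
<form method="post" action="/post" enctype="multipart/form-data">
<textarea name="message"></textarea>
<input type="file" name="attachment">
<input type="submit" value="Post">
</form>
</body>
</html>
"""


def build_form_response():
    """Return (headers, body) for the form page, both declaring UTF-8."""
    body = PAGE.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


class FormPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        headers, body = build_form_response()
        self.send_response(200)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve the form page on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), FormPageHandler).serve_forever()
```

Calling `serve()` and opening http://127.0.0.1:8000/ should show the form. Handling the POST itself is left out of this sketch.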

Solution 2

The problem is not the form data but the filename field, which simply does not work if you need both UTF-8 and file data. So if you need to process the filename on the server, which is common, you are stuck.

If you set enctype="multipart/form-data;charset=UTF-8" in your form, Tomcat 6 ends up seeing the request's content type as application/x-www-form-urlencoded, which is the problem.

It took me ages to track this down, but it looks like it is broken in general: I have tested this with HTTP requests from a web browser and also from .NET, with the same effect.

Author by

burton


Updated on June 04, 2022

Comments

  • burton
    burton almost 2 years

    I have a Web site with a message board. The board lets people post messages and include attachments. I had a problem where my site was hiccuping every time someone wrote a post with non-Unicode characters. In an effort to solve it, I changed my HTML form code from

    enctype="multipart/form-data"
    

    (as I'm accepting file uploads) to:

    enctype="multipart/form-data;charset=UTF-8"
    

    This solved the character problem. But it broke the file upload capability in Firefox 2 through 3.5. Firefox accepts all the text that the user submits, but not the file data. The submission otherwise behaves as it should, just as if no file had been attached. Everything works fine in Safari.

    I also tried

    enctype="multipart/form-data" accept-charset="UTF-8"
    

    ...but that had no effect on the character problem.

    Any ideas for ways around this?