Classic ASP text substitution and UTF-8 encoding
Solution 1
UTF-8 does not use BOMs; it is an annoying misfeature in some Microsoft software that puts them there. You need to find which step of your release process is putting a UTF-8-encoded BOM in your files and fix it. You should stop that even if you do move to UTF-8, which these days really is the best choice.
But I doubt it's IIS causing the display problem. More likely the browser is guessing the charset of the final displayed page, and when it sees bytes that look like they're UTF-8 encoded it guesses the whole page is UTF-8. You should be able to stop it doing that by stating a definitive charset in an HTTP header:
Content-Type: text/html;charset=iso-8859-1
and/or a meta element in the HTML
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
Now (assuming ISO-8859-1 is actually the character set your data are in) it should display OK. However, if your file really does have a UTF-8-encoded BOM at the start, you'll now see that as ‘ï»¿’ in your page, which is what those bytes (EF BB BF) look like in ISO-8859-1. So you still need to get rid of that misBOM.
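To make the last point concrete, here is a minimal Python sketch of what the BOM bytes look like when read as ISO-8859-1, and of one way a release step could drop them. The helper name strip_utf8_bom and the sample page bytes are made up for illustration; the real release tooling is not shown in the question, so treat this as a sketch of the idea rather than the actual fix.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file contents: a UTF-8 BOM followed by plain ASCII markup.
page = codecs.BOM_UTF8 + b"<p>Hello</p>"

# Read as ISO-8859-1, the three BOM bytes become three visible characters.
as_latin1 = page.decode("iso-8859-1")
print(as_latin1[:3])  # ï»¿

# With the BOM removed, the page decodes cleanly.
cleaned = strip_utf8_bom(page)
print(cleaned.decode("iso-8859-1"))  # <p>Hello</p>
```

The check is done on raw bytes on purpose, so it does not depend on guessing the encoding of the rest of the file.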
Solution 2
I was searching on the same exact issue yesterday and came across:
http://blog.inspired.no/utf-8-with-asp-71/
Important part from that page, in case it goes away...
ASP CODE:
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"
and the following HTML META tag:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
We were using the meta tag and asp CharSet property, yet the page still didn't render correctly. After adding the other three lines to the asp file everything just worked.
Hope this helps!
Phrygian Moon
Updated on April 28, 2020
Comments
-
Phrygian Moon about 4 years
We have a website that uses Classic ASP.
Part of our release process substitutes values in a file and we found a bug in it where it will write the file out as UTF-8.
This then causes our application to start spitting out garbage. Apostrophes get returned as some encoded characters.
If we then go and remove the BOM that says this file is UTF-8, the text that was previously rendered as garbage is displayed correctly.
Is there something that IIS does differently when it encounters a UTF-8 file?
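For readers unsure what "garbage" apostrophes look like, here is a small Python sketch of the usual mechanism: bytes written in one encoding and read back in another. The question does not say which apostrophe character or which pair of encodings was involved, so the curly apostrophe (U+2019) and Windows-1252 used here are assumptions chosen because they are a common combination.

```python
# Assumed sample text containing U+2019, a "curly" apostrophe.
text = "It\u2019s"

# Written out as UTF-8, the apostrophe becomes three bytes: E2 80 99.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'It\xe2\x80\x99s'

# Read back as Windows-1252, each of those bytes turns into its own character.
misread = utf8_bytes.decode("cp1252")
print(misread)  # Itâ€™s

# The opposite mismatch: Windows-1252 bytes read as UTF-8 are not valid UTF-8,
# so a lenient decoder substitutes U+FFFD, the replacement character.
cp1252_bytes = text.encode("cp1252")
replaced = cp1252_bytes.decode("utf-8", errors="replace")
print(replaced == "It\ufffds")  # True
```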
-
Phrygian Moon over 14 years
Right, this makes sense. It was actually a bug in some code that was written specifically to handle this kind of issue. Thanks.
-
AnthonyWJones over 14 years
I must admit this answer confuses me. "UTF-8 does not use BOMs": could you elaborate? In what way is this a "misfeature"? I've never come across a problem using UTF-8 files that include this zero-width space character; what problems have you encountered?
-
Amit Patil over 14 years
Any bytes-based text tool (such as shells, config file loaders etc.) will immediately fall over when presented with “ï»¿” at the start of a file; it is the explicit aim of UTF-8 to be compatible with tools that know nothing about Unicode, but UTF-8+BOM breaks this. Even some Unicode-aware tools will trip over it because a BOM is only expected to be present and automatically removed by the Unicode decoding process for UTF-16. UTF-8+BOM breaks applications and there is no justification for using it in the Unicode specs; and there isn't even any benefit to it as UTF-8 has no byte order issues.
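The failure described in this comment can be sketched in a few lines of Python. The has_shebang check below stands in for any tool that inspects the first bytes of a file without knowing about Unicode; it is an illustrative helper, not a real tool's code, and the script contents are made up.

```python
import codecs

# Hypothetical shell script that was saved with a UTF-8 BOM in front.
script = codecs.BOM_UTF8 + b"#!/bin/sh\necho hello\n"


def has_shebang(data: bytes) -> bool:
    """Bytes-level check, the way a Unicode-unaware tool would do it."""
    return data.startswith(b"#!")


print(has_shebang(script))  # False: the BOM bytes come first
print(has_shebang(script[len(codecs.BOM_UTF8):]))  # True once the BOM is gone

# Even a Unicode-aware decoder can leave the BOM in the text: Python's plain
# "utf-8" codec keeps it as U+FEFF, while "utf-8-sig" removes it.
kept = script.decode("utf-8")
dropped = script.decode("utf-8-sig")
print(kept.startswith("\ufeff"))  # True
print(dropped.startswith("#!"))  # True
```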
-
Áxel Costas Pena over 10 years
Also confused about "UTF-8 does not use BOMs". There is no clarification needed; it's simply a wrongly-built affirmation.
-
user692942 over 10 years
You don't need both the meta tag and Response.CharSet = "UTF-8", as they both serve the same purpose; personally I prefer to use Response.CharSet = "UTF-8" rather than explicitly setting it as a meta tag in the HTML. Also, Response.AddHeader "Content-Type", "text/html;charset=UTF-8" is just an explicit form of writing Response.ContentType = "text/html" and Response.CharSet = "UTF-8". What you are suggesting is pointless; stick to using Response.ContentType and Response.CharSet.
MistyDawn over 4 years
Explicitly declaring your charset and content type in a meta tag meets W3C standards of acceptable practice. Regardless of how you decide to declare the headers in your ASP, redundant or not, you should still include a meta tag that declares the content type and charset. If you run a page through the W3C validation checker at validator.w3.org/i18n-checker it will fail without the meta tag for type declaration. It's better, in this particular case, to have too many declarations than too few.