Convert UTF-8 String Classic ASP to SQL Database
Solution 1
Paul's answer isn't wrong but it is not the only part to consider:
You will need to go through each of these steps to make sure that you are getting consistent results:
IMPORTANT: These steps have to be performed on each and every page in your web application or you will have problems (emphasized by Paul's comment).
Each page needs to be saved using UTF-8 encoding. Double check this, as some IDEs will default to Windows-1252 (also often misnamed as "ANSI").
Each page will need the following line added as the very first line in the page. To make this easier, I put this along with some other values in an include file so I can include them in each page as I go.
Include File - page_encoding.asp
<%@Language="VBScript" CodePage = 65001 %>
<%
Response.CharSet = "UTF-8"
Response.CodePage = 65001
%>
Usage at the top of an ASP page (I prefer to put the include file in a config folder at the root of the web)
<!-- #include virtual="/config/page_encoding.asp" -->
Response.Charset = "UTF-8" is the equivalent of setting the ;charset in the HTTP content-type header.
Response.CodePage = 65001 tells ASP to process all dynamic strings as UTF-8.
Include files in the page will also have to be saved using UTF-8 encoding (double check these also).
Follow these steps and your page will work. Your problem at the moment is that some pages are being interpreted as Windows-1252 while others are being treated as UTF-8, so you end up with a mismatch in encoding.
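The mismatch described above can be reproduced outside ASP. The following is a minimal Python sketch, not part of the original answer: Python is used only for illustration, and the function name `misread_as_windows_1252` is hypothetical. It shows what happens when text written as UTF-8 is read back as Windows-1252:

```python
def misread_as_windows_1252(text):
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    This mimics one page writing UTF-8 while another page
    (or the database layer) interprets the same bytes as Windows-1252.
    """
    utf8_bytes = text.encode("utf-8")
    return utf8_bytes.decode("cp1252")


if __name__ == "__main__":
    # The c-cedilla becomes two unrelated characters.
    print(misread_as_windows_1252("français"))  # prints "franÃ§ais"
```

Each accented character turns into two characters because UTF-8 stores it as two bytes, and Windows-1252 treats every byte as a separate character.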
Solution 2
Normally - and that word has a veryyyyy long stretch - you do not need to convert by hand; even more, it's discouraged. At the top of your ASP page you write:
<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
that tells ASP to send and to receive (from a server point of view) UTF-8. Furthermore, it instructs the interpreter to use 2-byte strings. So when writing to a database or reading from a database everything goes auto-magically: whether your database uses 1-byte char or 2-byte nchar, conversions are taken care of. And actually that's about it. You can test if all goes well by testing with this set:
áäÇçéčëíďńóöçÖöÚü
This set contains some 'European' but also some 'Unicode' chars. Those Unicode chars will always fail if you use codepage 1252, so it's a nice test set.
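To see which characters of that test set fall outside codepage 1252, here is a small Python sketch (Python is used only for illustration; the helper name `not_in_codepage` is hypothetical). It tries to encode each character and collects the ones that cannot be represented:

```python
import unicodedata

# The test set from the answer, normalized to precomposed (NFC) characters.
TEST_SET = unicodedata.normalize("NFC", "áäÇçéčëíďńóöçÖöÚü")


def not_in_codepage(text, codepage="cp1252"):
    """Return the distinct characters of text that codepage cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    print(not_in_codepage(TEST_SET))  # prints ['č', 'ď', 'ń']
```

The three characters reported are the ones that have no slot in Windows-1252, which is why they are a useful canary for a page that is still running under the wrong codepage.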
user1744228
Updated on June 06, 2022
So I was having an issue with converting French characters correctly. Basically, I have a form which sends data to an SQL database. Then, on another page, data from this DB is retrieved and displayed to the user. But the data (strings) were being displayed with weird corrupt characters because the input in the form on the other page was in French. I overcame this problem by using the following function, which converts a string to the correct charset. HOWEVER, obviously the better solution is to convert it FIRST and then send it to the database. Now here's the code to convert a string retrieved from a DB to the appropriate charset:
Function ConvertFromUTF8(sIn)
    Dim oIn: Set oIn = CreateObject("ADODB.Stream")
    oIn.Open
    ' Write the mangled string out as Windows-1252 bytes
    oIn.CharSet = "Windows-1252"
    oIn.WriteText sIn
    ' Rewind and read the same bytes back as UTF-8
    oIn.Position = 0
    oIn.CharSet = "UTF-8"
    ConvertFromUTF8 = oIn.ReadText
    oIn.Close
End Function
I got this function from here: Classic ASP - How to convert a UTF-8 string to UCS-2?
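For readers who want to see the idea behind that function without ADODB.Stream, here is a rough Python sketch of the same round trip (Python is used only for illustration; `convert_from_utf8` is a hypothetical name, and the codec details are an assumption that may differ from what ADODB.Stream does for unusual bytes). It turns the mangled characters back into Windows-1252 bytes, then decodes those bytes as UTF-8:

```python
def convert_from_utf8(mangled):
    """Undo a UTF-8-read-as-Windows-1252 mix-up.

    Step 1: re-encode the mangled text as Windows-1252 to recover the raw bytes.
    Step 2: decode those bytes as UTF-8, which is what they were all along.
    """
    raw_bytes = mangled.encode("cp1252")
    return raw_bytes.decode("utf-8")


if __name__ == "__main__":
    print(convert_from_utf8("franÃ§ais"))  # prints "français"
```

This only repairs strings that were mangled in exactly this way. Text that was never mangled, or that contains characters outside Windows-1252, may raise an error or come out wrong, which is one reason fixing the codepage on every page is the more robust route.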
Now my question is, what function do I use to convert strings beforehand and then send them to the database, so that when I retrieve them they will be good-to-go?
Tried Paul's Method:
So there's Page 1 and Page 2. Page 1 contains a form which, when submitted, sends the string to the DB, which is then retrieved in Page 2. I tried Paul's solution by removing the function ConvertFromUTF8 and leaving the output as it was before (it returned weird Mongolian-looking characters). After that, I added the following line on top of Page 1 as well as Page 2.
<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
I also have the following on both of the pages:
Response.CodePage = 65001
Response.CharSet = "UTF-8"
But it didn't work :(
Edit: it works! Thank you so much everyone for your help! All I needed to do was add "CodePage = 65001" on top of Page 3 (which I didn't even talk about), where the writing to the DB part was happening.