Caveats encoding a C# string to a JavaScript string
Solution 1
(.NET 4) You can use:
System.Web.HttpUtility.JavaScriptStringEncode(@"aa\bb ""cc"" dd\tee", true);
which returns:
"aa\\bb \"cc\" dd\\tee"
Solution 2
It's my understanding that you do have to be careful, as JavaScript is not UTF-16; rather, it's UCS-2, which I believe is a subset of UTF-16. What this means for you is that any character with a code point above 0xFFFF (that is, one that does not fit in 2 bytes) could give you problems in JavaScript.
In summary, under the covers, the engine may use UTF-16, but it only exposes UCS-2 like methods.
Great article on the issue: http://mathiasbynens.be/notes/javascript-encoding
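To make the code-point concern concrete, here is a minimal C# sketch. It assumes you escape one UTF-16 code unit (one C# char) at a time; the class and method names are hypothetical. A character above 0xFFFF is already stored in a C# string as a surrogate pair, so emitting each half as its own \uXXXX escape should give JavaScript the same two code units without any separate UCS-2 conversion step. The sketch deliberately ignores quotes and backslashes, since it only illustrates the surrogate-pair point.

```csharp
using System.Text;

// Hypothetical demo class; the names are illustrative, not from any library.
public static class SurrogateDemo
{
    // Writes every UTF-16 code unit outside printable ASCII as \uXXXX.
    // Quotes and backslashes are NOT handled here; this only shows
    // how characters above 0xFFFF come out as two escapes.
    public static string EscapeNonAscii(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (c < 0x20 || c > 0x7E)
                sb.Append("\\u").Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this sketch, the C# string "a\U0001F600" has a Length of 3 (one char for a, two for the surrogate pair), and EscapeNonAscii returns the text a\uD83D\uDE00.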
Machado
Updated on June 04, 2022
Comments
-
Machado almost 2 years
I'm trying to write a custom JavaScript MVC3 Helper class for my project, and one of the methods is supposed to escape C# strings to JavaScript strings.
I know C# strings are UTF-16 encoded, and Javascript strings also seem to be UTF-16. No problem here.
I know some characters like backslash, single quotes or double quotes must be backslash-escaped in JavaScript, so:
\ becomes \\
' becomes \'
" becomes \"
Is there any other caveat I must be aware of before writing my conversion method?
EDIT: Great answers so far. I'm adding some references from the answers to the question to help others in the future.
Alex K. suggested using System.Web.HttpUtility.JavaScriptStringEncode, which I marked as the right answer for me because I'm using .NET 4. That function is not available in earlier .NET versions, though, so I'm adding some other resources here:
CR becomes \r // a JavaScript string literal cannot span more than one line
LF becomes \n // a JavaScript string literal cannot span more than one line
TAB becomes \t
Control characters must be hex-escaped
JP Richardson gave an interesting link explaining that JavaScript uses UCS-2, which is a subset of UTF-16, but how to encode this correctly is an entirely new question.
LukeH, in the comments below, reminded me of the CR, LF and TAB chars, which in turn reminded me of the control chars (BEEP, NULL, ACK, etc.).
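For anyone on a .NET version without JavaScriptStringEncode, the rules collected above could be sketched roughly as follows. This is only an illustration, not a drop-in replacement for the framework method: the class and method names are hypothetical, and escaping U+2028 and U+2029 is an extra assumption beyond the list above (older JavaScript engines treat both as line terminators inside string literals).

```csharp
using System.Text;

// Hypothetical helper; name and signature are illustrative only.
public static class JsStringEscaper
{
    // Returns the text to place between the quotes of a JavaScript
    // string literal. It does not add the surrounding quotes itself.
    public static string Escape(string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;

        var sb = new StringBuilder(value.Length + 8);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break; // \  becomes \\
                case '\'': sb.Append("\\'"); break;  // '  becomes \'
                case '"': sb.Append("\\\""); break;  // "  becomes \"
                case '\r': sb.Append("\\r"); break;  // CR becomes \r
                case '\n': sb.Append("\\n"); break;  // LF becomes \n
                case '\t': sb.Append("\\t"); break;  // TAB becomes \t
                default:
                    // Remaining control characters are hex-escaped.
                    // U+2028/U+2029 are an assumption added here.
                    if (c < 0x20 || c == 0x7F || c == 0x2028 || c == 0x2029)
                        sb.Append("\\u").Append(((int)c).ToString("X4"));
                    else
                        sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }
}
```

As a quick check, passing a three-character string made of a, a literal TAB and b should return the four-character text a\tb, and a lone BEL character (0x07) should come back as \u0007.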
-
LukeH about 12 years
Don't forget to encode newlines, tabs and any other special chars. JavaScriptStringEncode, as suggested by Alex, will handle that for you.
-
Machado about 12 years
Nice! I'm using MVC3 with .NET 4, so this is very useful!
-
Machado about 12 years
So, how could we safely transform the C# UTF-16 into UCS-2 in order to encode the string the right way?
-
Matt R about 10 years
What's the solution for users of .NET versions < 4?
-
Casey almost 9 years
Why would you choose to do it this way?
-
Gqqnbig about 7 years
My string is not a URL, so why would I use UrlEncode? It seems silly, but I believe it will work.