How do I get the unicode/hex representation of a symbol out of the HTML using JavaScript/jQuery?
Solution 1
Using mostly plain JavaScript, you should be able to do:
function entityForSymbolInContainer(selector) {
    var code = $(selector).text().charCodeAt(0);
    var codeHex = code.toString(16).toUpperCase();
    while (codeHex.length < 4) {
        codeHex = "0" + codeHex;
    }
    return "&#x" + codeHex + ";";
}
Here's an example: http://jsfiddle.net/btWur/
Solution 2
charCodeAt will get you the decimal value of the character at the given index (its UTF-16 code unit):
"α".charCodeAt(0); //returns 945
0x03b1 === 945; //returns true
toString(16) will then give you the hex string:
(945).toString(16); // returns "3b1"
(Confirmed to work in IE9 and Chrome)
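Putting the two calls together, and padding to four hex digits as Solution 1 does, might look like the sketch below. The helper name hexEntity is made up for illustration, padStart assumes an ES2017 or later environment, and the function only handles characters inside the BMP.

```javascript
// Hypothetical helper: build an HTML hex entity from the first
// UTF-16 code unit of a string (BMP characters only).
function hexEntity(str) {
  const code = str.charCodeAt(0);   // e.g. 945 for "α"
  const hex = code.toString(16)     // "3b1"
    .toUpperCase()                  // "3B1"
    .padStart(4, "0");              // "03B1"
  return "&#x" + hex + ";";
}

console.log(hexEntity("α")); // "&#x03B1;"
```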
Solution 3
If you try to convert a Unicode character outside the BMP (Basic Multilingual Plane) using the approaches above, you are in for a nasty surprise. Characters outside the BMP are encoded as two UTF-16 code units, for example:
"🔒".length
= 2 (one part for the shackle, one part for the lock base :) )
so "🔒".charCodeAt(0)
will give you 55357,
which is only 'half' of the number, while "🔒".charCodeAt(1)
will give you 56594,
which is the other half.
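The two halves are known as a surrogate pair. As a rough sketch, the fixed ranges that UTF-16 reserves for them can be used to tell whether a code unit is one half of such a pair; the function names here are illustrative only.

```javascript
// High surrogates occupy 0xD800-0xDBFF, low surrogates 0xDC00-0xDFFF.
function isHighSurrogate(unit) {
  return unit >= 0xD800 && unit <= 0xDBFF;
}

function isLowSurrogate(unit) {
  return unit >= 0xDC00 && unit <= 0xDFFF;
}

const lock = "\u{1F512}"; // the lock emoji from the example above
console.log(lock.length);                         // 2
console.log(isHighSurrogate(lock.charCodeAt(0))); // true (55357 = 0xD83D)
console.log(isLowSurrogate(lock.charCodeAt(1)));  // true (56594 = 0xDD12)
console.log(isHighSurrogate("α".charCodeAt(0)));  // false
```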
To get the full code point for such characters, you might want to use the following string extension function:
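String.prototype.charCodeUTF32 = function(){
    // Combine the high and low surrogate into one code point.
    // Assumes the string starts with a surrogate pair.
    return ((((this.charCodeAt(0)-0xD800)*0x400) + (this.charCodeAt(1)-0xDC00) + 0x10000));
};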
You can also use it like this
"&#x"+("🔒".charCodeUTF32()).toString(16)+";"
to get HTML hex codes.
Hope this saves you some time.
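In engines that support ES2015, the built-in String.prototype.codePointAt performs the same surrogate arithmetic and also works for BMP characters, so it avoids both the prototype extension and the NaN result that charCodeAt(1) gives for single-unit strings. A minimal sketch, with codePointEntity and entitiesFor as illustrative names and padStart assuming ES2017:

```javascript
// Hypothetical helper: hex entity for the first full code point of a string.
function codePointEntity(str) {
  const cp = str.codePointAt(0); // joins a surrogate pair when present
  return "&#x" + cp.toString(16).toUpperCase().padStart(4, "0") + ";";
}

// for...of iterates a string by code point, so pairs are never split.
function entitiesFor(str) {
  const out = [];
  for (const ch of str) {
    out.push(codePointEntity(ch));
  }
  return out.join("");
}

console.log(codePointEntity("\u{1F512}")); // "&#x1F512;"
console.log(codePointEntity("α"));         // "&#x03B1;"
console.log(entitiesFor("α\u{1F512}"));    // "&#x03B1;&#x1F512;"
```

Keeping this as a plain function rather than a String.prototype method is a design choice: it cannot clash with other scripts on the page that extend the same prototype.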
Comments
-
Hristo almost 2 years
Say I have an element like this...
<math xmlns="http://www.w3.org/1998/Math/MathML"> <mo class="symbol">α</mo> </math>
Is there a way to get the unicode/hex value of alpha α, &#x03B1;, using JavaScript/jQuery? Something like...
$('.symbol').text().unicode(); // I know unicode() doesn't exist
$('.symbol').text().hex(); // I know hex() doesn't exist
I need &#x03B1; instead of α, and it seems like anytime I insert &#x03B1; into the DOM and try to retrieve it right away, it gets rendered and I can't get &#x03B1; back; I just get α.
-
Hristo almost 13 years
@aroth... this looks awesome! I'm testing now
-
L0j1k almost 8 years
+1 Thanks for saving us from this landmine! Checking the length of the character was the key for me.
-
kontur about 3 years
Good insight, and note that not just emojis are beyond the BMP :) Your prototype enhancement should probably check the length first; for "UTF-8" strings the this.charCodeAt(1) will return NaN, and so will the entire function as a consequence; for "length === 2" chars it should just return charCodeAt(0) as such.