Setting nodeValue of a text node in JavaScript when the string contains HTML entities

11,215

Solution 1

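HTML entities are only decoded by the HTML parser, so they stay literal in a string assigned to nodeValue. Use JavaScript escapes for the Unicode characters instead: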

node.nodeValue="string with \uxxxx sort of characters"
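
As a minimal sketch of this approach, assuming the entities in question are &#187; and &#171; (decimal 187 and 171, which are hexadecimal BB and AB), the string can be written with \u escapes directly. The variable name label is only illustrative; in a browser the result would be assigned to node.nodeValue.

```javascript
// \u escapes take exactly four hexadecimal digits.
// &#187; is decimal 187 = 0xBB, and &#171; is decimal 171 = 0xAB.
var label = "Some \u00BB entities \u00AB";

// The escapes are resolved by the JavaScript parser, so the string
// already holds the real characters before it reaches the DOM.
// In a browser: node.nodeValue = label;
console.log(label.charCodeAt(5)); // 187
```

Because the conversion happens when the script is parsed, this only helps for strings written in the source code, not for entity-encoded text that arrives at run time.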

Solution 2

This happens because nodeValue treats the string as plain text: the browser does not parse it as HTML, so the & stays a literal ampersand and the entity is never decoded. To get around this, you'll need to convert the entities yourself.

<html>
<body>
    <div id="test"> </div>
</body>

<script type="text/javascript">

onload = function()
{
    var node = document.getElementById( 'test' );
    node.firstChild.nodeValue = convertEntities( 'Some &#187; entities &#171; and some &#187; more entities &#171;' );
}

function convertEntities( text )
{
    // match() returns null when there are no entities, so fall back to an empty list
    var matches = text.match( /&#(\d+);/g ) || [];

    for ( var i = 0; i < matches.length; i++ )
    {
        var replacement = convertEntity( matches[i] );
        console.log( "Replacing: " + matches[i] );
        console.log( "With: " + replacement );
        text = text.replace( matches[i], replacement );
    }

    return text;

    function convertEntity( ent )
    {
        // Strip the "&#" and ";" and read the remaining digits as a decimal character code
        var num = parseInt( ent.replace( /\D/g, '' ), 10 );
        return String.fromCharCode( num );
    }
}

</script>

</html>

Solution 3

As noted in other answers, the HTML-encoded entities need to be replaced with the actual characters. Starting from BaileyP's answer, I arrived at this:

function convertEntities( text )
{
    var ret = text.replace( /\&\#(\d+);/g, function ( ent, captureGroup )
    {
        var num = parseInt( captureGroup, 10 );
        return String.fromCharCode( num );
    });
    return ret;
}
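
The function above only covers decimal references such as &#187;. As a hedged sketch of how it might be extended, the variant below also accepts hexadecimal references such as &#xBB; and uses String.fromCodePoint so that characters outside the Basic Multilingual Plane survive. The name decodeNumericEntities is hypothetical and not part of any answer here, and named entities like &nbsp; are deliberately left untouched.

```javascript
// Hypothetical helper: decodes decimal (&#187;) and hexadecimal (&#xBB;)
// numeric character references. Named entities such as &nbsp; are not handled.
function decodeNumericEntities(text) {
    return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g, function (match, hex, dec) {
        // Exactly one of the two capture groups is defined for each match
        var codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);

        // Leave out-of-range references unchanged instead of throwing a RangeError
        if (codePoint > 0x10FFFF) {
            return match;
        }

        // fromCodePoint emits a surrogate pair for code points above U+FFFF
        return String.fromCodePoint(codePoint);
    });
}

// In a browser: node.nodeValue = decodeNumericEntities( someEncodedString );
console.log(decodeNumericEntities("Some &#187; entities &#xAB; and &nbsp;"));
```

String.fromCodePoint is an ES2015 feature, so very old browsers would need String.fromCharCode with manual surrogate handling instead.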
Author by

Slartibartfast

Updated on June 12, 2022

Comments

  • Slartibartfast
    Slartibartfast almost 2 years

    When I set the value of a text node with

    node.nodeValue="string with &#xxxx; sort of characters"

    the ampersand gets escaped. Is there an easy way to make the entities show up as the characters they represent?

  • Bjorn
    Bjorn over 9 years
    Just FYI, this requires a meta tag in your HTML document that sets the charset to utf-8; otherwise, &nbsp; will be decoded as junk.