Trim &nbsp values in javascript

12,455

Solution 1

&nbsp; becomes a non-break-space character, \u00a0. JavaScript's String#trim is supposed to remove those, but historically browser implementations have been a bit buggy in that regard. I thought those issues had been resolved in modern ones, but...

If you're running into browsers that don't implement it correctly, you can work around that with a regular expression:

text = editorData.replace(/(?:^[\s\u00a0]+)|(?:[\s\u00a0]+$)/g, '');

That says to replace all whitespace or non-break-space chars at the beginning and end with nothing.
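As a rough illustration of the workaround above, the sketch below wraps the regular expression in a helper and only falls back to it when the native trim leaves a non-break space behind. The names trimHandlesNbsp and trimWithNbsp are hypothetical and not part of any library.

```javascript
// Hypothetical feature check: does the native trim() strip U+00A0?
var trimHandlesNbsp = "\u00a0".trim() === "";

// Hypothetical helper: trims whitespace and non-break spaces from both ends.
function trimWithNbsp(str) {
    if (trimHandlesNbsp) {
        return str.trim();
    }
    // Fallback for environments whose trim() misses U+00A0
    return str.replace(/(?:^[\s\u00a0]+)|(?:[\s\u00a0]+$)/g, '');
}

console.log(":" + trimWithNbsp("\u00a0 T \u00a0") + ":");
```

Non-break spaces in the middle of the string are left alone, since both alternatives in the pattern are anchored to the start or the end.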

But having seen your comment:

When I run this piece of code separately, it is working fine for me. But in application its failing.

...that may not be it.

Alternately, you could remove the &nbsp; markup before converting to text:

html = html.replace(/(?:^(?:&nbsp;)+)|(?:(?:&nbsp;)+$)/g, '');
var editorData = $('<div/>').html(html).text();
text = editorData.trim();

That removes any &nbsp;s at the beginning or end prior to converting the markup to text.
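The markup-level step can be tried on its own, without jQuery. This is a minimal sketch, assuming the input uses the literal entity &nbsp; and that the helper name stripEdgeNbspEntities is made up for illustration:

```javascript
// Hypothetical helper: strips runs of the &nbsp; entity at the very start or end.
function stripEdgeNbspEntities(html) {
    return html.replace(/(?:^(?:&nbsp;)+)|(?:(?:&nbsp;)+$)/g, '');
}

var stripped = stripEdgeNbspEntities("&nbsp; T &nbsp;");
console.log(":" + stripped + ":");        // ordinary spaces still surround the T
console.log(":" + stripped.trim() + ":"); // trim() then removes those
```

Note that this pattern only removes entities that sit directly at the edges. An entity separated from the edge by an ordinary space is left in place and still has to be handled after decoding.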

Solution 2

The easiest way to trim non-breaking spaces from a string is:

html.replace(/&nbsp;/g,' ').trim()
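One thing worth checking with this approach: the global replace turns every &nbsp; into a regular space, not only the ones at the ends. A small sketch, using a made-up sample string:

```javascript
var html = "&nbsp;hello&nbsp;world&nbsp;";

// Every entity becomes an ordinary space, then trim() removes the outer ones.
var result = html.replace(/&nbsp;/g, ' ').trim();

console.log(":" + result + ":");
```

The entity between the two words ends up as an ordinary space, which matters if the non-breaking behavior inside the text should be preserved.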

Solution 3

If you are using jQuery, you can use jQuery.trim(). According to the jQuery documentation:

The $.trim() function removes all newlines, spaces (including non-breaking spaces), and tabs from the beginning and end of the supplied string.

Author by

PSR

I am a web developer who works mostly with Microsoft technologies. I like to blog about code, web and technology. My blog is at http://sridharpasham.com. You can find me on Twitter at @pashsridhar. You can learn more about my professional background at linkedin.com/in/sridharpasham

Updated on June 21, 2022

Comments

  • PSR
    PSR almost 2 years

    I am trying to trim the text which I get from kendo editor like this.

    var html = "&nbsp; T &nbsp;"; // This sample text I get from Kendo editor
    console.log("Actual :" + html + ":");
    var text = "";
    try {
        // html decode
        var editorData = $('<div/>').html(html).text();
        text = editorData.trim();
        console.log("After trim :" + text + ":");
    }
    catch (e) {
        console.log("exception");
        text = html;
    }
    

    This code is in a separate JS file (generated from TypeScript). When the page loads, the trimming does not work. But when I run the same code in the developer tools console, it works. Why is it not working on the page?

    Adding the TypeScript code:

    const html: string = $(selector).data("kendoEditor").value();
    console.log("Actual :" + html + ":");
    let text: string = "";
    try {
        // html decode
        var editorData = $('<div/>').html(html).text();
        text = editorData.trim();
        console.log("After trim :" + text + ":");
    }
    catch (e) {
        console.log("exception");
        text = html;
    }
    
  • PSR
    PSR almost 8 years
    Your editorData.replace solution is working for me. Thanks.
  • T.J. Crowder
    T.J. Crowder almost 8 years
    @Sree: Great! I'm sorry to hear browsers are still getting this wrong, but glad that helped. :-)
  • Bekim Bacaj
    Bekim Bacaj almost 8 years
    You are using a last resolution RegExp method which is dirty, hard to read and maintain, and requires explicit encoding of each targeted type of whitespace. The solution you just down-voted is native, universal, generic and clean; easy to implement, maintain and customize for browsers that don't support the trim method natively.
  • T.J. Crowder
    T.J. Crowder almost 8 years
    @BekimBacaj: You clearly haven't read (or at least understood) my answer above, and clearly haven't understood the comments on your answer. Well, unfortunately, nothing more I can do about that.
  • Bekim Bacaj
    Bekim Bacaj almost 8 years
    The terminology is being used and misused in all levels of publications. This " " is a whitespace, or to be even more precise, an interval, this should be trimmed. This "&nbsp;" however, is not a whitespace or interval, that's a whitespace character, or to be more precise that's a white character equal to T or any other nonwhite character. Therefore it should not be treated as whitespace interval, and therefore not to be trimmed.
  • T.J. Crowder
    T.J. Crowder almost 8 years
    @BekimBacaj: For crying out loud. Just read the specification. I linked it with that very purpose in mind. From the trim link: "Let T be a String value that is a copy of S with both leading and trailing white space removed. The definition of white space is the union of WhiteSpace and LineTerminator." From the Table 32 link: "The ECMAScript white space code points are listed in Table 32:...U+00A0 NO-BREAK SPACE <NBSP>..." Thus: trim is supposed to remove no-break-space. And it does, on browsers where it isn't broken. As I said above.
  • Bekim Bacaj
    Bekim Bacaj almost 8 years
    that's wrong interpretation "&nbsp;" is not the same as \u00a0.
  • T.J. Crowder
    T.J. Crowder almost 8 years
    @Bekim: To quote you: "Wrong again!" Here's another specification for you: w3.org/TR/html5/syntax.html#named-character-references, which says quite clearly that the &nbsp; is U+00A0. And proof that it's actually true in the real world: jsfiddle.net/nyd3yb0z And with that, I'm done. If you feel the need to post further FUD, please cite reliable references.
  • T.J. Crowder
    T.J. Crowder almost 8 years
    @Bekim: What utter nonsense.
  • mikewasmike
    mikewasmike over 6 years
    This will remove spaces in the middle of string
  • Varun
    Varun over 4 years
    @T.J.Crowder, thanks for this. it works for me. how would you change it to make it work so that this: "&nbsp;&nbsp; &nbsp; hello &nbsp; world &nbsp;&nbsp; &nbsp;" becomes: "hello &nbsp; world". Thanks!
  • T.J. Crowder
    T.J. Crowder over 4 years
    @Varun - .replace(/(?:^(?:\&nbsp;|\s)+)|(?:(?:\&nbsp;|\s)+$)/g, "") should do it. :-)
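
To try the pattern from the last comment against Varun's sample string, a sketch like the following can be used. The name trimEdgeNbsp is hypothetical, and the backslash before & is dropped here because & needs no escaping in a regular expression.

```javascript
// Hypothetical helper: removes &nbsp; entities and whitespace from both ends,
// leaving anything in the middle of the string untouched.
function trimEdgeNbsp(html) {
    return html.replace(/(?:^(?:&nbsp;|\s)+)|(?:(?:&nbsp;|\s)+$)/g, "");
}

var input = "&nbsp;&nbsp; &nbsp; hello &nbsp; world &nbsp;&nbsp; &nbsp;";
console.log(":" + trimEdgeNbsp(input) + ":");
```

The entity between "hello" and "world" survives because both alternatives are anchored to the start or the end of the string.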