Convert innerHTML of a contenteditable text to normal string

Solution 1

Try using:

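// innerText: originally IE-only, now supported by all major browsers;
// it approximates the rendered text, including line breaks
document.getElementById('myinput').innerText

// textContent: standard DOM property; returns the raw text,
// so <br> and block boundaries are not turned into line breaks
document.getElementById('myinput').textContent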

To detect line breaks and similar markup, walk the child nodes yourself:

var el = document.getElementById('myinput');
var nodes = el.childNodes;
var text = '';

// Only direct children are inspected: text nodes are copied,
// <br> becomes a newline, everything else is skipped.
for (var i = 0; i < nodes.length; i++) {
    switch (nodes[i].nodeName) {
        case '#text': text = text + nodes[i].nodeValue; break;
        case 'BR':    text = text + '\n';               break;
    }
}
console.log(text);
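
The text collected this way can still contain non-breaking spaces, since `&nbsp;` decodes to the character U+00A0 (the "blah " case from the question). A minimal sketch of a clean-up step follows; `normalizeSpaces` is a hypothetical helper name, not part of the original answer, and a literal string stands in for the text read from the element.

```javascript
// Hypothetical helper (not from the original answers): replaces every
// non-breaking space (U+00A0, what &nbsp; decodes to) with an ordinary space.
function normalizeSpaces(text) {
    return text.replace(/\u00A0/g, ' ');
}

// A literal string stands in for the value read from the element.
var raw = 'blah\u00A0';
console.log(JSON.stringify(normalizeSpaces(raw))); // "blah "
```

The same call can be applied to the `text` variable built by the loop above before using it.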

Solution 2

Because this behaviour is not consistent across browsers, you have to implement the conversion yourself:

var convert = (function() {
    var convertElement = function(element) {
        switch(element.tagName) {
            case "BR": 
                return "\n";
            case "P": // fall through to DIV
            case "DIV": 
                return (element.previousSibling ? "\n" : "") 
                    + [].map.call(element.childNodes, convertElement).join("");
            default: 
                return element.textContent;
        }
    };

    return function(element) {
        return [].map.call(element.childNodes, convertElement).join("");
    };
})();

In action: http://jsfiddle.net/koyd8h59/1/

Of course you'll need to add your own code if you want to use <h1> and other block-level tags.
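
One possible way to extend the converter is sketched below. It assumes a small lookup table of block-level tags (`BLOCK_TAGS`, an illustrative and incomplete list) and also treats a block whose only child is a `<br>` as a single empty line, which is the `<div><br></div>` case reported for Chrome in the comments. Unlike the code above, it recurses into inline elements instead of reading `textContent`, so a `<br>` nested in a `<span>` is kept. The names `toPlainText`, `convertNode` and `BLOCK_TAGS` are hypothetical.

```javascript
// Assumed list of block-level tags; extend it to match the markup you expect.
var BLOCK_TAGS = {
    DIV: true, P: true, H1: true, H2: true, H3: true,
    H4: true, H5: true, H6: true, LI: true, BLOCKQUOTE: true, PRE: true
};

function convertNode(node) {
    if (node.nodeType === 3) {            // 3 === text node
        return node.nodeValue;
    }
    if (node.tagName === 'BR') {
        return '\n';
    }
    var inner = convertChildren(node);
    if (BLOCK_TAGS[node.tagName]) {
        var kids = node.childNodes;
        // A block that holds nothing but a <br> is one empty line:
        // the newline comes from the block boundary, not from the <br>.
        if (kids.length === 1 && kids[0].tagName === 'BR') {
            inner = '';
        }
        return (node.previousSibling ? '\n' : '') + inner;
    }
    return inner;                          // inline element: just its content
}

function convertChildren(node) {
    return Array.prototype.map.call(node.childNodes, convertNode).join('');
}

// Entry point: pass the contenteditable element itself.
function toPlainText(element) {
    return convertChildren(element);
}

// Usage in a browser (not executed here):
// var text = toPlainText(document.getElementById('myinput'));
```

This is only a sketch: nested lists, tables and CSS-driven layout (for example `display: block` on a `<span>`) are not handled.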

Author: Basj

Updated on May 28, 2020

Comments

  • Basj
    Basj almost 4 years

    I use a content-editable element :

    <span id="myinput" contenteditable="true">This is editable.</span>
    

    and

    document.getElementById('myinput').innerHTML
    

    to read its content from Javascript.

    But the result is :

    • "blah " => innerHTML = "blah&nbsp;"

    • "bonjour\n bonsoir" => innerHTML = "bonjour<br>bonsoir" (Firefox) and innerHTML = "bonjour<div>bonsoir</div>" (Chrome)

    • maybe there are lots of other things that are translated into HTML...

    How can I convert innerHTML into normal text?

    (i.e. in my 2 examples : "blah " and "bonjour\n bonsoir")

  • Alex K.
    Alex K. over 9 years
    Or just var text = el.textContent || el.innerText;
  • Bart
    Bart over 9 years
    .textContent does not convert <br> to \n
  • Basj
    Basj over 9 years
    innerText works in IE and Chrome, but is not available in Firefox. textContent doesn't work, because the line breaks are lost: "bonjour\n bonsoir" => textContent = "bonjourbonsoir". How can I deal with this problem?
  • EvilEpidemic
    EvilEpidemic over 9 years
    Updated the answer. The only way I can see offhand is to determine each node's type and handle it appropriately. It could probably be done better with some regex wizardry.
  • Basj
    Basj over 9 years
    @EvilEpidemic This doesn't work in Chrome ;) because the new line is not a BR but <div>the new line</div>.
  • EvilEpidemic
    EvilEpidemic over 9 years
    If you look at Bart's answer below, he demonstrates a way to catch that case.
  • Martin Krauskopf
    Martin Krauskopf almost 4 years
    Thanks for the answer. I had the same issue, and I had to tweak it a bit further. When the user presses Enter in Chrome, the engine inserts <div><br/></div>, so that case needs to be handled separately. See the TypeScript Gist here.
  • Martin Krauskopf
    Martin Krauskopf almost 4 years
    Yup, Bart's answer provides the correct solution (with the tweak). Unfortunately, Firefox and Chrome (Edge now uses Chromium) handle newlines (Enter pressed by the user) differently, and innerText only approximates the text content as the user sees it. In Chrome, that approximation doubles the number of newlines in most cases (a bug in Chromium?). So we have to parse the text from the HTML manually.