Regular expression in JavaScript string split: browser compatibility issue

Solution 1

Your code fails because IE parses the HTML and converts the tag names to uppercase when you read it back through innerHTML. For example, if you have HTML like this:

<div id='box'>
Hello<br>
World
</div>

And then you run this JavaScript (in IE):

alert(document.getElementById('box').innerHTML);

You will get an alert box with this:

Hello<BR>World

Notice that the <BR> is now uppercase. To fix this, add the i flag alongside the g flag so the regex is case-insensitive, and the split will work as you expect.
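As a minimal sketch of that fix, the snippet below splits a string that imitates the uppercase innerHTML output described above. The strings are hard-coded stand-ins (an assumption for illustration, not values captured from a real browser), and the variable names are hypothetical.

```javascript
// Stand-in for what innerHTML is reported to return in IE (uppercase tag).
var ieStyle = "Hello<BR>World";
// Stand-in for a lowercase serialization.
var lowerStyle = "Hello<br>World";

// Without the i flag the uppercase tag is never matched, so nothing is split.
var withoutFlag = ieStyle.split(/<br.*?>/g);
// With the i flag the tag matches regardless of case.
var withFlag = ieStyle.split(/<br.*?>/gi);
var lowerWithFlag = lowerStyle.split(/<br.*?>/gi);

console.log(withoutFlag);   // ["Hello<BR>World"]
console.log(withFlag);      // ["Hello", "World"]
console.log(lowerWithFlag); // ["Hello", "World"]
```

The g flag has no effect on split(), which always splits on every match, so the i flag is the part that changes the result here.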

Solution 2

Try this one:

/<br[^>]*>/gi
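To see how this pattern compares with the ones from the question, here is a rough sketch that runs all three against a few hand-written strings. The sample strings are assumptions about how a browser might serialize the tag (as written, lowercase without the slash, and uppercase without the slash); they are not confirmed browser output.

```javascript
// Hypothetical innerHTML serializations of the same markup.
var samples = {
  asWritten: "one<br />two<br />three",
  lowercaseNoSlash: "one<br>two<br>three",
  uppercaseNoSlash: "one<BR>two<BR>three"
};

// The two patterns from the question plus the one suggested here, all with gi.
var patterns = {
  lazyStar: /<br.*?>/gi,
  lazyPlus: /<br.+?>/gi,
  negatedClass: /<br[^>]*>/gi
};

// results[patternName][sampleName] holds the array produced by split().
var results = {};
for (var p in patterns) {
  results[p] = {};
  for (var s in samples) {
    results[p][s] = samples[s].split(patterns[p]);
  }
}

console.log(results.negatedClass.uppercaseNoSlash); // ["one", "two", "three"]
console.log(results.lazyPlus.asWritten);            // ["one", "two", "three"]
console.log(results.lazyPlus.lowercaseNoSlash);     // ["one", "three"]
```

The last line shows why .+? is fragile: it needs at least one character between "<br" and ">", so on a bare <br> it consumes the tag's own ">" and keeps going to the next one, swallowing the text in between.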

Solution 3

Instead of

/<br.*?>/

you could try

/<br[^>]*>/

i.e. matching "<br", followed by any characters other than '>', followed by '>'.
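One way to package this, sketched under a few assumptions: splitOnBr is a hypothetical helper name, the i flag from Solution 1 is added so uppercase tags also match, and empty or whitespace-only pieces are dropped so the result does not depend on how a given browser reports empty strings from split(). It sticks to older syntax (var, a plain loop, no String.prototype.trim) with legacy IE in mind, but it has not been tested there.

```javascript
// Hypothetical helper: split HTML text on <br> tags in any letter case,
// with or without attributes or a trailing slash.
function splitOnBr(html) {
  var parts = html.split(/<br[^>]*>/i);
  var result = [];
  for (var i = 0; i < parts.length; i++) {
    // Strip leading and trailing whitespace (including newlines).
    var text = parts[i].replace(/^\s+|\s+$/g, "");
    // Skip pieces that are empty after trimming.
    if (text !== "") {
      result.push(text);
    }
  }
  return result;
}

// Example input modeled on the text in the question.
var sample = "is invariably subjective. <br /> \n" +
  "The less frequently used warnings (Probably/Possibly) <br /> \n";
var captionLines = splitOnBr(sample);
console.log(captionLines);
// ["is invariably subjective.", "The less frequently used warnings (Probably/Possibly)"]
```

In the original code this would be called as splitOnBr(captions.innerHTML).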

Author: Walt Jones

Updated on July 03, 2022

Comments

  • Walt Jones, almost 2 years

    I've been investigating an issue that only seems to get worse the deeper I dig.

    I started innocently enough trying to use this expression to split a string on HTML 'br' tags:

    T = captions.innerHTML.split(/<br.*?>/g);
    

    This works in every browser I tried (FF, Safari, Chrome) except IE7 and IE8, given example input text like this:

    is invariably subjective. <br /> 
    The less frequently used warnings (Probably/Possibly) <br /> 
    

    Please note that each 'br' tag in the example text contains a space before the '/' and is followed by a newline.

    Both of the following will match all HTML tags in every browser:

    T = captions.innerHTML.split(/<.*?>/g);
    T = captions.innerHTML.split(/<.+?>/g);
    

    However, surprisingly (to me at least), this does not work in FF and Chrome:

    T = captions.innerHTML.split(/<br.+?>/g);
    

    Edit:

    This (suggested several times in the responses below) does not work on IE 7 or 8:

    T = captions.innerHTML.split(/<br[^>]*>/g);
    

    (It did work on Chrome and FF.)

    My question is: does anyone know an expression that works in all current browsers to match the 'br' tags above (but not other HTML tags)? And can anyone confirm that the /<br.+?>/ example above should be a valid match, since two characters are present in the example text between 'br' and the '>'?

    PS - my doctype is HTML transitional.

    Edit:

    I think I have evidence that this is specific to the string.split() behavior on IE, and not to regex in general. You have to use split() to see the issue. I also found a test matrix that showed a failure rate of about 30% for split() test cases when I ran it on IE. The same tests passed 100% on FF and Chrome:

    http://stevenlevithan.com/demo/split.cfm

    So far, I have still not found a solution for IE, and the library provided by the author of that test matrix did not fix this case.