What is a regular expression that removes spaces between uppercase letters but keeps spaces between words?


When the regex matches the first time (on "I B"), that part of the string is consumed by the engine, so the "B" cannot be matched again as the start of "B M", even though your regex has the global ('g') flag.
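To see the consumption concretely, the sketch below lists every match that the pattern from the question finds. It uses `String.prototype.matchAll`; the variable names are illustrative and not from the original post.

```javascript
// The pattern from the question: both letters are part of the match.
const consumingPattern = /([A-Z])\s([A-Z])/g;
const sample = "Hello I B M";

// Collect each match together with the index where it starts.
const consumingMatches = [...sample.matchAll(consumingPattern)].map(
  (m) => ({ text: m[0], index: m.index })
);

console.log(consumingMatches);
// Only "I B" (starting at index 6) is found. The engine resumes after the "B",
// so "B M" is never tried as a match.

console.log(sample.replace(consumingPattern, "$1$2")); // "Hello IB M"
```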

You can get the expected result with a positive lookahead ((?=PATTERN)) instead, which checks what follows without consuming it:

value = "Hello I B M"
value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')
console.log(value) // Prints "Hello IBM"
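As a rough illustration of why the lookahead version works, the snippet below lists what each match actually contains. The variable names are made up for this sketch; the regex is the one from the answer.

```javascript
// The second letter sits inside a lookahead, so it is checked but not consumed.
const lookaheadPattern = /([A-Z])\s(?=[A-Z])/g;
const text = "Hello I B M";

const lookaheadMatches = [...text.matchAll(lookaheadPattern)].map(
  (m) => ({ text: m[0], index: m.index })
);

console.log(lookaheadMatches);
// Two matches: "I " at index 6 and "B " at index 8.
// Each match is only the letter plus the space, so the "B" is still
// available to start the second match.
```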

To keep the space when the next uppercase letter is the first letter of a word, you can extend the lookahead pattern with a word boundary \b, so that it only matches an uppercase letter that stands alone:

value = "Hello I B M Dude"
value = value.replace(/([A-Z])\s(?=[A-Z]\b)/g, '$1')
console.log(value) // Prints "Hello IBM Dude"
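For reuse, the regex can be wrapped in a small function. This is only a sketch: both function names are hypothetical, and the second function is an assumed stricter variant that is not part of the original answer. It adds a leading \b so that the last letter of an all-caps word (such as the final "A" in "NASA") is not treated as a standalone capital.

```javascript
// Hypothetical helper (name not from the original answer) wrapping its regex.
function collapseSpacedCapitals(input) {
  // Drop a whitespace character after an uppercase letter, but only when the
  // next uppercase letter stands alone (it is followed by a word boundary).
  return input.replace(/([A-Z])\s(?=[A-Z]\b)/g, "$1");
}

// Assumed stricter variant: the leading \b also requires the first letter to
// start a word, so the last letter of an all-caps word is left alone.
function collapseStandaloneCapitals(input) {
  return input.replace(/\b([A-Z])\s(?=[A-Z]\b)/g, "$1");
}

for (const s of ["Hello I B M Dude", "I Am Here", "NASA I B M"]) {
  console.log(`${s} | ${collapseSpacedCapitals(s)} | ${collapseStandaloneCapitals(s)}`);
}
// Hello I B M Dude | Hello IBM Dude | Hello IBM Dude
// I Am Here | I Am Here | I Am Here
// NASA I B M | NASAIBM | NASA IBM
```

Whether the stricter variant is wanted depends on the input; for the strings in the question both behave the same.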

Note: As @CasimirHyppolite noted, the character after the following uppercase letter has to be optional, or the second regex won't work when the last character of the string is uppercase. Hence the earlier form of the lookahead ended in ([^A-Za-z]|$), which can be read as "not a letter, or the end of the string".

Edit: Simplified the lookahead from (?=[A-Z]([^A-Za-z]|$)) to (?=[A-Z]\b), as suggested by @hwnd.
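The difference between these lookaheads can be checked with a short comparison. In this sketch, `withoutEndAlternative` is a hypothetical pattern that requires a non-letter after the uppercase letter and has no `|$` alternative; it is included only to illustrate the end-of-string problem described in the note. The other two are the forms named in the edit. The variable names are assumptions.

```javascript
// Hypothetical: requires a non-letter character after the uppercase letter.
const withoutEndAlternative = /([A-Z])\s(?=[A-Z][^A-Za-z])/g;
// Earlier form from the answer: non-letter OR end of string.
const explicitForm = /([A-Z])\s(?=[A-Z]([^A-Za-z]|$))/g;
// Simplified form from the answer: word boundary.
const wordBoundaryForm = /([A-Z])\s(?=[A-Z]\b)/g;

const endsUppercase = "Hello I B M";
console.log(endsUppercase.replace(withoutEndAlternative, "$1")); // "Hello IB M"
console.log(endsUppercase.replace(explicitForm, "$1"));          // "Hello IBM"
console.log(endsUppercase.replace(wordBoundaryForm, "$1"));      // "Hello IBM"

// The two forms from the answer are not strictly identical: \b treats digits
// and underscores as word characters, while [^A-Za-z] treats them as non-letters.
const followedByDigit = "I B M2";
console.log(followedByDigit.replace(explicitForm, "$1"));     // "IBM2"
console.log(followedByDigit.replace(wordBoundaryForm, "$1")); // "IB M2"
```

For input made only of letters and spaces, as in the question, the two forms from the answer give the same result.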

Author: Steven Yuan

Updated on June 26, 2022

Comments

  • Steven Yuan
    Steven Yuan 3 months

    For example, if I have a string like "Hello I B M", how do I detect the space between the uppercase letters but not between the "o" and the "I"?

    Basically "Hello I B M" should resolve to "Hello IBM"

    So far, I have this:

    value = "Hello I B M"
    value = value.replace(/([A-Z])\s([A-Z])/g, '$1$2')
    

    But it only removes the first space between two uppercase letters, giving "Hello IB M".

    --EDIT--

    Solution Part 1:

     value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')
    

    Thanks to Renato for the first part of the solution! I just found out that if there is a capitalized word AFTER an uppercase letter, it swallows that space as well. How do we preserve the space there?

    So "Hello I B M Dude" becomes "Hello IBMDude" instead of "Hello IBM Dude"