A pythonic way to insert a space before capital letters

28,994

Solution 1

You could try:

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'

Solution 2

If there are consecutive capitals, then Gregs result could not be what you look for, since the \w consumes the caracter in front of the captial letter to be replaced.

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWWWWWWWord")
'Word Word WW WW WW Word'

A look-behind would solve this:

>>> re.sub(r"(?<=\w)([A-Z])", r" \1", "WordWordWWWWWWWord")
'Word Word W W W W W W Word'

Solution 3

Perhaps shorter:

>>> re.sub(r"\B([A-Z])", r" \1", "DoIThinkThisIsABetterAnswer?")

Solution 4

Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?

Edit: Maybe better to include it here.

re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)

For example:

"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]

Solution 5

Maybe you would be interested in one-liner implementation without using regexp:

''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()
Share:
28,994

Related videos on Youtube

Dave
Author by

Dave

Lifelong computer enthusiast turned Software developer. Recently emerged from the command-line C world into web systems under .NET. I'm greatly enjoying the fact I haven't had a single pointer-related error in over a year.

Updated on July 09, 2022

Comments

  • Dave
    Dave almost 2 years

    I've got a file whose format I'm altering via a python script. I have several camel cased strings in this file where I just want to insert a single space before the capital letter - so "WordWordWord" becomes "Word Word Word".

    My limited regex experience just stalled out on me - can someone think of a decent regex to do this, or (better yet) is there a more pythonic way to do this that I'm missing?

  • hayalci
    hayalci over 15 years
    Dan's answer is better and simpler :)
  • Ishbir
    Ishbir over 15 years
    @hayalci: re.sub('([A-Z])', r' \1', 'Really?')
  • Ishbir
    Ishbir over 15 years
    re.sub('([A-Z])', r' \1', "Do we want a space before the D's of this phrase?")
  • Ishbir
    Ishbir over 15 years
    re.sub(r"(\w)([A-Z])", r"\1 \2", "SorryIThinkYouMissedASpot")
  • Ishbir
    Ishbir over 15 years
    Your answer is probably what Electrons_Ahoy really wants; however, based on the phrasing of their question, it's not.
  • Dan Lenski
    Dan Lenski over 15 years
    Ah, yes, good point. Looks like your's and Leonhard's solutions handle this correctly.
  • Brian
    Brian over 15 years
    This has the same problem as Dan's - you'll get extra spaces before capitals even if they aren't needed.
  • Tomalak
    Tomalak over 15 years
    As a small improvement, [[:upper:]] should be used instead of [A-Z].
  • user2471801
    user2471801 over 15 years
    True, i've edited it to add a flag... I admit it's a little cumbersome, but may be easier to remember than regex.
  • sedavidw
    sedavidw over 12 years
    @Tomalak, [[:upper:]] is not supported by Python. It is a POSIX bracket expression.
  • Fight Fire With Fire
    Fight Fire With Fire about 9 years
    but thank you for sharing this one, this is an awesome answer!
  • ArtOfWarfare
    ArtOfWarfare over 6 years
    For anyone wondering, \B is "Not word boundary". So it's not inserting spaces where there's already a space.
  • blobbymatt
    blobbymatt about 5 years
    For those like me, make sure you - import re
  • karthikeyan
    karthikeyan over 4 years
    Elegant answer... Thanks a lot