A pythonic way to insert a space before capital letters
Solution 1
You could try:
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'
Solution 2
If there are consecutive capitals, then Gregs result could not be what you look for, since the \w consumes the caracter in front of the captial letter to be replaced.
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWWWWWWWord")
'Word Word WW WW WW Word'
A look-behind would solve this:
>>> re.sub(r"(?<=\w)([A-Z])", r" \1", "WordWordWWWWWWWord")
'Word Word W W W W W W Word'
Solution 3
Perhaps shorter:
>>> re.sub(r"\B([A-Z])", r" \1", "DoIThinkThisIsABetterAnswer?")
Solution 4
Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?
Edit: Maybe better to include it here.
re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)
For example:
"SimpleHTTPServer" => ["Simple", "HTTP", "Server"]
Solution 5
Maybe you would be interested in one-liner implementation without using regexp:
''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()
Related videos on Youtube
Dave
Lifelong computer enthusiast turned Software developer. Recently emerged from the command-line C world into web systems under .NET. I'm greatly enjoying the fact I haven't had a single pointer-related error in over a year.
Updated on July 09, 2022Comments
-
Dave almost 2 years
I've got a file whose format I'm altering via a python script. I have several camel cased strings in this file where I just want to insert a single space before the capital letter - so "WordWordWord" becomes "Word Word Word".
My limited regex experience just stalled out on me - can someone think of a decent regex to do this, or (better yet) is there a more pythonic way to do this that I'm missing?
-
hayalci over 15 yearsDan's answer is better and simpler :)
-
Ishbir over 15 years@hayalci: re.sub('([A-Z])', r' \1', 'Really?')
-
Ishbir over 15 yearsre.sub('([A-Z])', r' \1', "Do we want a space before the D's of this phrase?")
-
Ishbir over 15 yearsre.sub(r"(\w)([A-Z])", r"\1 \2", "SorryIThinkYouMissedASpot")
-
Ishbir over 15 yearsYour answer is probably what Electrons_Ahoy really wants; however, based on the phrasing of their question, it's not.
-
Dan Lenski over 15 yearsAh, yes, good point. Looks like your's and Leonhard's solutions handle this correctly.
-
Brian over 15 yearsThis has the same problem as Dan's - you'll get extra spaces before capitals even if they aren't needed.
-
Tomalak over 15 yearsAs a small improvement, [[:upper:]] should be used instead of [A-Z].
-
user2471801 over 15 yearsTrue, i've edited it to add a flag... I admit it's a little cumbersome, but may be easier to remember than regex.
-
sedavidw over 12 years@Tomalak,
[[:upper:]]
is not supported by Python. It is a POSIX bracket expression. -
Fight Fire With Fire about 9 yearsbut thank you for sharing this one, this is an awesome answer!
-
ArtOfWarfare over 6 yearsFor anyone wondering,
\B
is "Not word boundary". So it's not inserting spaces where there's already a space. -
blobbymatt about 5 yearsFor those like me, make sure you - import re
-
karthikeyan over 4 yearsElegant answer... Thanks a lot