Split camelCase word into words with php preg_match (Regular Expression)
Solution 1
You can also use preg_match_all as:
preg_match_all('/((?:^|[A-Z])[a-z]+)/',$str,$matches);
Explanation:
( - Start of capturing parenthesis.
(?: - Start of non-capturing parenthesis.
^ - Start anchor.
| - Alternation.
[A-Z] - Any one capital letter.
) - End of non-capturing parenthesis.
[a-z]+ - One or more lowercase letters.
) - End of capturing parenthesis.
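A minimal sketch of how this pattern behaves, using sample strings that appear elsewhere on this page. The helper name matchCamelWords is made up for illustration and is not part of the original answer. The second call shows the limitation raised in the comments: a run of capitals with no lowercase tail is silently dropped.

```php
<?php
// Hypothetical helper (not from the original answer): returns the words
// that the Solution 1 pattern finds in $str.
function matchCamelWords(string $str): array
{
    preg_match_all('/((?:^|[A-Z])[a-z]+)/', $str, $matches);
    // $matches[0] holds the full matches; $matches[1] repeats them
    // because of the outer capturing group.
    return $matches[0];
}

print_r(matchCamelWords('oneTwoThreeFour')); // one, Two, Three, Four
print_r(matchCamelWords('TestID'));          // only "Test"; "ID" has no lowercase tail
```

Because $matches[0] already contains every full match, the outer capturing parentheses are optional if only that list is needed.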
Solution 2
You can use preg_split as:
$arr = preg_split('/(?=[A-Z])/',$str);
I'm basically splitting the input string just before each uppercase letter. The regex used, (?=[A-Z]), matches the position just before an uppercase letter.
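As a rough illustration, here is the same lookahead wrapped in a function. The name splitBeforeUpper and the PREG_SPLIT_NO_EMPTY flag are additions for this sketch, not part of the original answer. The flag is there because a string that starts with a capital letter otherwise produces an empty first element.

```php
<?php
// Hypothetical wrapper around the Solution 2 pattern. PREG_SPLIT_NO_EMPTY
// is an addition here: it drops the empty first element that appears when
// the string itself starts with an uppercase letter.
function splitBeforeUpper(string $str): array
{
    return preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitBeforeUpper('oneTwoThreeFour'));    // one, Two, Three, Four
print_r(splitBeforeUpper('StartsWithCap'));      // Starts, With, Cap
print_r(splitBeforeUpper('hasConsecutiveCAPS')); // has, Consecutive, C, A, P, S
```

The last call shows why this pattern only suits well-formed camelCase: every capital in a run becomes its own piece.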
Solution 3
I know that this is an old question with an accepted answer, but IMHO there is a better solution:
<?php // test.php Rev:20140412_0800
$ccWord = 'NewNASAModule';
$re = '/(?#! splitCamelCase Rev:20140412)
    # Split camelCase "words". Two global alternatives. Either g1of2:
      (?<=[a-z])      # Position is after a lowercase,
      (?=[A-Z])       # and before an uppercase letter.
    | (?<=[A-Z])      # Or g2of2; Position is after uppercase,
      (?=[A-Z][a-z])  # and before upper-then-lower case.
    /x';
$a = preg_split($re, $ccWord);
$count = count($a);
for ($i = 0; $i < $count; ++$i) {
    printf("Word %d of %d = \"%s\"\n",
        $i + 1, $count, $a[$i]);
}
?>
Note that this regex (like codaddict's '/(?=[A-Z])/' solution, which works like a charm for well-formed camelCase words) matches only a position within the string and consumes no text at all. This solution has the additional benefit that it also works correctly for not-so-well-formed pseudo-camelCase words such as StartsWithCap and hasConsecutiveCAPS.
Input:
oneTwoThreeFour
StartsWithCap
hasConsecutiveCAPS
NewNASAModule
Output:
Word 1 of 4 = "one"
Word 2 of 4 = "Two"
Word 3 of 4 = "Three"
Word 4 of 4 = "Four"
Word 1 of 3 = "Starts"
Word 2 of 3 = "With"
Word 3 of 3 = "Cap"
Word 1 of 3 = "has"
Word 2 of 3 = "Consecutive"
Word 3 of 3 = "CAPS"
Word 1 of 3 = "New"
Word 2 of 3 = "NASA"
Word 3 of 3 = "Module"
Edited 2014-04-12: Modified regex, script and test data to correctly split the "NewNASAModule" case (in response to rr's comment).
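The script as posted splits only the single value in $ccWord, while the Input list names four words. The sketch below is one way to run all four through the same two alternatives. The function name splitCamelCaseWords is hypothetical, and the regex is written on one line without the /x flag.

```php
<?php
// Hypothetical helper: the same two alternatives as the commented regex,
// condensed onto one line (no /x flag needed without the comments).
function splitCamelCaseWords(string $word): array
{
    $re = '/(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/';
    return preg_split($re, $word);
}

$inputs = ['oneTwoThreeFour', 'StartsWithCap', 'hasConsecutiveCAPS', 'NewNASAModule'];
foreach ($inputs as $word) {
    $parts = splitCamelCaseWords($word);
    foreach ($parts as $i => $part) {
        printf("Word %d of %d = \"%s\"\n", $i + 1, count($parts), $part);
    }
}
```

Since both alternatives need a lookbehind character, the pattern can never match at position 0, so no empty leading element is produced.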
Solution 4
While ridgerunner's answer works great, it does not seem to work with all-caps substrings that appear in the middle of a string. I use the following, and it seems to handle these just fine:
function splitCamelCase($input)
{
    return preg_split(
        '/(^[^A-Z]+|[A-Z][^A-Z]+)/',
        $input,
        -1, /* no limit on the number of pieces */
        PREG_SPLIT_NO_EMPTY /* don't return empty elements */
            | PREG_SPLIT_DELIM_CAPTURE /* keep the captured delimiters (the matched words) in the output array */
    );
}
Some test cases:
assert(splitCamelCase('lowHigh') == ['low', 'High']);
assert(splitCamelCase('WarriorPrincess') == ['Warrior', 'Princess']);
assert(splitCamelCase('SupportSEELE') == ['Support', 'SEELE']);
assert(splitCamelCase('LaunchFLEIAModule') == ['Launch', 'FLEIA', 'Module']);
assert(splitCamelCase('anotherNASATrip') == ['another', 'NASA', 'Trip']);
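To see why both flags are needed, here is a small sketch that runs the same pattern with and without PREG_SPLIT_DELIM_CAPTURE. The variable names are illustrative only. The pattern matches the ordinary words, so those words are the "delimiters", and the all-caps run is whatever text is left between them.

```php
<?php
$pattern = '/(^[^A-Z]+|[A-Z][^A-Z]+)/';
$input   = 'anotherNASATrip';

// The pattern matches "another" and "Trip"; "NASA" is the text left between them.
// Without PREG_SPLIT_DELIM_CAPTURE the matched words are discarded.
$noEmptyOnly = preg_split($pattern, $input, -1, PREG_SPLIT_NO_EMPTY);
print_r($noEmptyOnly); // NASA

// With both flags the captured words are kept alongside the leftover text.
$bothFlags = preg_split($pattern, $input, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($bothFlags); // another, NASA, Trip
```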
Solution 5
A functionized version of @ridgerunner's answer.
/**
 * Converts a camelCase string to words separated by spaces.
 * @param string $camelCaseString
 * @return string
 */
function fromCamelCase($camelCaseString) {
    $re = '/(?<=[a-z])(?=[A-Z])/x';
    $a = preg_split($re, $camelCaseString);
    // Separator first: the legacy (array, separator) order is no longer accepted in PHP 8.
    return implode(' ', $a);
}
CodeChap
Updated on July 05, 2022
Comments
-
CodeChap almost 2 years
How would I go about splitting the word:
oneTwoThreeFour
into an array so that I can get:
one Two Three Four
with preg_match? I tried this, but it just gives the whole word:
$words = preg_match("/[a-zA-Z]*(?:[a-z][a-zA-Z]*[A-Z]|[A-Z][a-zA-Z]*[a-z])[a-zA-Z]*\b/", $string, $matches);
-
Anil almost 11 years
This is a much better solution, works first time (others added blank values to the array; this one is perfect)! Thanks! +1
-
Daniel Rhodes almost 11 years
Oops, this will probably fail on the CONSECUTIVE CAPS issue.
-
Aaron J Lang over 10 years
Wouldn't the non-capturing group cause the result to be [one, wo, hree, our]?
-
Eli Gassert about 10 years
@AaronJLang no, because the outer parentheses capture the WHOLE group, including the sub-group. It's a sub-group that he doesn't want cluttering the $matches collection.
-
rr- about 10 years
There seems to be a problem with strings like NewNASAModule (outputs: [New, NASAModule]; I'd expect [New, NASA, Module]).
-
ridgerunner about 10 years
@rr - Yes, you are correct. See my other updated answer, which splits NewNASAModule correctly: RegEx to split camelCase or TitleCase (advanced)
-
Zack Morris over 8 years
This failed for me with "TestID" using: "preg_match_all('/((?:^|[A-Z])[a-z]+)/', $key, $matches); die(implode(' ', $matches[0]));" because it doesn't handle the CONSECUTIVE CAPS issue. I needed to split case changes with spaces and @blak3r's solution worked for me: stackoverflow.com/a/17122207/539149
-
Maciej Sz over 7 years
A better solution that will work for strings like HTMLParser: stackoverflow.com/a/6572999/1697320.
-
benjaminhull about 7 years
Nice and lean - always prefer it this way.
-
cartbeforehorse about 6 years
As stipulated by @TarranJones (although not articulated too clearly), you don't need the outer parentheses. A pattern of '/(?:^|[A-Z])[a-z]+/' would suffice to produce one array (instead of two). This is because preg_match_all() automatically captures all instances of the match, without you having to specifically stipulate it.
-
Kobi about 5 years
@jbobbins - Thanks, updated. ideone expired old examples at some point, so many old examples are still broken.
-
jbobbins about 5 years
@Kobi thanks. Just so you're aware, I pasted the assertion text from the post by rr- and the ones with multiple caps together don't work. regex101.com/r/kNZfEI/2
-
Onkeltem over 4 years
It doesn't cover cases with digits. For some reason other repliers also ignore this basic fact. E.g. "Css3Transform" or the like.
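One possible way to handle the digit case raised in the last comment is to let digits count as "lowercase" in the first lookbehind. This is only a sketch under an assumption that none of the answers state: a digit stays attached to the word before it. The function name splitCamelCaseWithDigits is hypothetical.

```php
<?php
// Assumption (not from any answer above): a digit stays attached to the
// word before it, so "Css3Transform" becomes "Css3" + "Transform".
function splitCamelCaseWithDigits(string $word): array
{
    // Same two alternatives as Solution 3, with 0-9 added to the first lookbehind.
    $re = '/(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/';
    return preg_split($re, $word);
}

print_r(splitCamelCaseWithDigits('Css3Transform')); // Css3, Transform
print_r(splitCamelCaseWithDigits('NewNASAModule')); // New, NASA, Module
```

If digits should instead form their own word, the lookarounds would need further alternatives; that choice depends on the naming convention in use.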