PHP: Best way to extract text within parentheses?

97,087

Solution 1

I'd just use a regex and get it over with. Unless you are doing enough iterations that it becomes a huge performance issue, it's just easier to code (and to understand when you look back on it).

$text = 'ignore everything except this (text)';
preg_match('#\((.*?)\)#', $text, $match);
print $match[1];

Solution 2

So, actually, the code you posted doesn't work: substr()'s parameters are $string, $start and $length, and strpos()'s parameters are $haystack, $needle. Slightly modified:

$str = "ignore everything except this (text)";
$start  = strpos($str, '(');
$end    = strpos($str, ')', $start + 1);
$length = $end - $start;
$result = substr($str, $start + 1, $length - 1);

Some subtleties: I used $start + 1 as the offset parameter to help PHP out while doing the strpos() search for the closing parenthesis; we also increment $start by one and reduce $length by one in the substr() call to exclude the parentheses from the match.

Also, there's no error checking in this code: you'll want to make sure that neither $start nor $end === false before performing the substr().
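A minimal sketch of that check, assuming PHP 7.1 or later for the nullable return type. The helper name extractBetweenParens is made up for illustration and is not part of the original answer:

```php
<?php
// Hypothetical helper: returns the text inside the first "(...)" pair,
// or null when either parenthesis is missing.
function extractBetweenParens(string $str): ?string
{
    $start = strpos($str, '(');
    if ($start === false) {
        return null; // no opening parenthesis
    }

    $end = strpos($str, ')', $start + 1);
    if ($end === false) {
        return null; // no closing parenthesis after the opening one
    }

    // Skip the "(" itself and stop just before the ")".
    return substr($str, $start + 1, $end - $start - 1);
}

var_dump(extractBetweenParens('ignore everything except this (text)')); // string(4) "text"
var_dump(extractBetweenParens('no parentheses here'));                  // NULL
```

The strict === comparison matters here because strpos() can legitimately return 0 when the parenthesis is the first character, and 0 == false in a loose comparison.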

As for strpos/substr versus regex: performance-wise, this code will beat a regular expression hands down. It's a little wordier, though. I eat and breathe strpos/substr, so I don't mind this too much, but someone else may prefer the compactness of a regex.

Solution 3

Use a regular expression:

if( preg_match( '!\(([^\)]+)\)!', $text, $match ) )
    $text = $match[1];

Solution 4

I think this is the fastest way to get the text inside the first pair of parentheses in a string.

$string = 'ignore everything except this (text)';
$string = explode(')', (explode('(', $string)[1]))[0];
echo $string;
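One thing to watch with the one-liner: if the string contains no "(", index [1] does not exist and PHP complains about an undefined index, and if the "(" is never closed the rest of the string is returned. A hedged sketch of a guarded variant, using the limit argument of explode(); the function name is hypothetical and PHP 7.1 or later is assumed:

```php
<?php
// Hypothetical guarded version of the explode() approach.
function firstParenthesized(string $string): ?string
{
    // Split once at the first "(": [before, after].
    $parts = explode('(', $string, 2);
    if (count($parts) < 2) {
        return null; // no "(" in the string
    }

    // Split the remainder once at the first ")": [inside, rest].
    $inner = explode(')', $parts[1], 2);
    if (count($inner) < 2) {
        return null; // the "(" was never closed
    }

    return $inner[0];
}

echo firstParenthesized('ignore everything except this (text)'), "\n"; // text
```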

Solution 5

The already posted regex solutions - \((.*?)\) and \(([^\)]+)\) - do not return the innermost strings between opening and closing brackets. If the string is Text (abc(xyz 123) they both return (abc(xyz 123) as the whole match, and not (xyz 123).
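To see this, here is a small sketch that runs both earlier patterns on that string (the variable names are only for illustration):

```php
<?php
$text = 'Text (abc(xyz 123)';

// Lazy dot: starts at the first "(" and stops at the first ")".
preg_match('#\((.*?)\)#', $text, $lazy);

// Negated class: excludes ")" but still allows "(" inside the match.
preg_match('!\(([^\)]+)\)!', $text, $negated);

echo $lazy[0], "\n";    // (abc(xyz 123)
echo $negated[0], "\n"; // (abc(xyz 123)
```

Both matches begin at the first opening bracket because nothing in either pattern forbids a second ( inside the match.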

The pattern that matches substrings in parentheses with no other opening or closing parentheses in between (use it with preg_match to fetch the first occurrence and with preg_match_all to fetch all occurrences) is, if the match should include the parentheses:

\([^()]*\)

Or, if you want to get the values without the parentheses:

\(([^()]*)\)        // get Group 1 values after a successful call to preg_match_all, see code below
\(\K[^()]*(?=\))    // this and the one below get the values without parentheses as whole matches 
(?<=\()[^()]*(?=\)) // less efficient, not recommended

Replace * with + if there must be at least 1 char between ( and ).

Details:

  • \( - an opening round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class)
  • [^()]* - zero or more characters other than ( and ) (note these ( and ) do not have to be escaped inside a character class as inside it, ( and ) cannot be used to specify a grouping and are treated as literal parentheses)
  • \) - a closing round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class).

The \(\K part in the alternative regex matches ( and omits it from the match value (with the \K match reset operator). (?<=\() is a positive lookbehind that requires a ( to appear immediately to the left of the current location, but the ( is not added to the match value since lookbehind (lookaround) patterns are non-consuming. (?=\)) is a positive lookahead that requires a ) char to appear immediately to the right of the current location.
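As a quick illustration of the two whole-match variants described above, the following sketch runs both on a sample string (the variable names are assumptions for the example):

```php
<?php
$subject = 'ignore everything except this (text) and (that (text here))';

// \K drops the already-matched "(" from the reported match.
preg_match_all('~\(\K[^()]*(?=\))~', $subject, $viaK);

// Lookbehind/lookahead: the parentheses are asserted, never consumed.
preg_match_all('~(?<=\()[^()]*(?=\))~', $subject, $viaLookaround);

echo implode(', ', $viaK[0]), "\n";          // text, text here
echo implode(', ', $viaLookaround[0]), "\n"; // text, text here
```

In both cases the values are in index 0 of the result, so no capturing group is needed.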

PHP code:

$fullString = 'ignore everything except this (text) and (that (text here))';
if (preg_match_all('~\(([^()]*)\)~', $fullString, $matches)) {
    print_r($matches[0]); // Get whole match values
    print_r($matches[1]); // Get Group 1 values
}

Output:

Array ( [0] => (text)  [1] => (text here) )
Array ( [0] => text    [1] => text here   )
Author by

Wilco

Updated on July 08, 2022

Comments

  • Wilco
    Wilco almost 2 years

    What's the best/most efficient way to extract text set between parentheses? Say I wanted to get the string "text" from the string "ignore everything except this (text)" in the most efficient manner possible.

    So far, the best I've come up with is this:

    $fullString = "ignore everything except this (text)";
    $start = strpos('(', $fullString);
    $end = strlen($fullString) - strpos(')', $fullString);
    
    $shortString = substr($fullString, $start, $end);
    

    Is there a better way to do this? I know in general using regex tends to be less efficient, but unless I can reduce the number of function calls, perhaps this would be the best approach? Thoughts?

  • Edward Z. Yang
    Edward Z. Yang over 15 years
    No, it isn't: . only matches a single character.
  • Owen
    Owen over 15 years
    not necessarily, ? makes the match lazy. without it, for a string like 'ignore (everything) except this (text)', the match would end up being 'everything) except this (text'
  • Dimitry
    Dimitry over 15 years
    Good to know. Should avoid all those squared nots. E.g. /src="([^"]*)"/ now replaced with /src="(.*?)"/ :D
  • Mnebuerquo
    Mnebuerquo over 15 years
    It's good that you can "understand when you look back on it". Failing that, you've got some Stack Overflow comments to clarify it.
  • Tanj
    Tanj over 15 years
    the /src="([^"]*)"/ is more efficient than /src="(.*?)"/
  • Owen
    Owen over 15 years
    ya square nots are, the reason is ? makes the engine backtrack a lot, which is very expensive. the square nots will match "forward" in that sense. i prefer the ? notation though, so if performance isn't an issue i get lazy :)
  • Mike Castro Demaria
    Mike Castro Demaria over 9 years
    +1, but how would I do the same for [* and *]? Because [] on its own may already be used in HTML, for example.
  • Ravi
    Ravi almost 4 years
    What if I want everything except the (text)?
  • ftrotter
    ftrotter over 3 years
    Note that if you modify this code to use strrpos (starts from the back of the string) on the $end then it will correctly handle cases where there are parens within.. like (well this is (very) nice).
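A rough sketch of the strrpos() variant from the last comment, applied to the strpos/substr approach. The sample string and variable names are assumptions for illustration:

```php
<?php
$str = 'prefix (well this is (very) nice) suffix';

$start = strpos($str, '(');  // first "(" searching from the left
$end   = strrpos($str, ')'); // last ")" searching from the right

$result = null;
if ($start !== false && $end !== false && $end > $start) {
    // Everything between the outermost pair, inner parentheses included.
    $result = substr($str, $start + 1, $end - $start - 1);
}

echo $result, "\n"; // well this is (very) nice
```

The trade-off is that a string with two separate groups, such as '(a) and (b)', yields 'a) and (b', so this only suits input with a single outer pair.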