Are preg_match() and preg_replace() slow?

Solution 1

As Mike Brant said in his answer: there's nothing wrong with using any of the preg_* functions, if you need them.
You want to know if it's a good idea to have something like 20 preg_match calls in a single file. Honestly, I'd say that's too many. I've often stated that "if your solution to a problem relies on more than 3 regexes at any given time, you're part of the problem". I have occasionally sinned against my own mantra, though.

If you are using 20 preg_match calls, chances are you can halve that number simply by taking a closer look at the actual regular expressions. Regexes, especially the Perl-compatible kind that PHP uses, are incredibly powerful and well worth the time it takes to get to know them. The reason they tend to be slower is simply that the pattern has to be parsed and "translated" into a considerable number of branches and loops at some low level. If, say, you want to replace every lower-case a with an upper-case A, you could use a regular expression, sure, and in PHP that would look like this:

preg_replace('/a/','A',$string);

Look at the first argument: the pattern is passed as a string. That string has to be parsed first: the delimiters are checked and a match pattern is built from it. Only then is the subject string iterated, each char compared to the pattern (in this case a), and each matching substring replaced.
Seems like a bit of a hassle, especially considering that the last step (comparing substrings and replacing matches) is all we really want.

$string = str_replace('a','A',$string);

Does just that, without the additional checks performed when a regular expression is parsed and validated.
Don't forget that preg_match also fills an array of matches when you pass it a third argument, and constructing an array isn't free either.

In short: regexes are slower because the expression is parsed, validated and finally translated into a set of simple, low-level instructions.

Note that people sometimes use explode and implode for string manipulation. This, too, creates an array, which is (again) not free, especially considering that you implode that very same array shortly thereafter. Another option may well be more desirable (and in some cases preg_replace can be faster here).
Basically: regexes need additional processing that simple string functions don't require. But when in doubt, there's only one way to be absolutely sure: set up a test script...
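
Such a test script could look something like the sketch below. It is only a rough illustration, not a rigorous benchmark: the bench() helper, the sample string and the iteration count are all made up for this example, and the arrow functions assume PHP 7.4 or later. The numbers will vary with PHP version, input size and machine, so run it against data that resembles your own.

```php
<?php
// Hypothetical helper (not a built-in): calls $fn $iterations times
// and returns the total elapsed wall-clock time in seconds.
function bench(callable $fn, int $iterations = 20000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Sample input, purely illustrative; swap in something closer to your real data.
$string = str_repeat('a quick brown fox and a lazy dog, ', 20);

// Three ways of turning every "a" into "A".
$candidates = [
    'preg_replace'    => fn() => preg_replace('/a/', 'A', $string),
    'str_replace'     => fn() => str_replace('a', 'A', $string),
    'explode/implode' => fn() => implode('A', explode('a', $string)),
];

foreach ($candidates as $name => $fn) {
    printf("%-16s %.4f s\n", $name, bench($fn));
}
```

All three candidates produce the same string, so whatever difference shows up is purely the cost of how the work gets done. Looping many times matters here, because a single call is usually too fast to time reliably.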

Solution 2

Don't worry about optimization unless you have a problem.

Don't look for areas of optimization without measuring with something like XDebug (http://xdebug.org).

If your code takes 100ms to run with preg_match() and 110ms via some other method, do you really care about the difference?

Write for correctness and clarity first, then consider speed.

Solution 3

It really depends on your use case. There is nothing inherently "bad" about using regex. Sometimes it is your only available solution to a particular problem. However, there are times when simple string manipulation functions will work just fine. These tend to be faster than the preg_* functions, so if you have a script that is run very frequently and/or performs a large number of string manipulations, the impact of using regex can begin to be felt.

As is the case for anything, you should test in your application and environment and decide what works best for you.

Solution 4

Check how much time it needs (record the time before and after, then display the difference):

$start = microtime(true); // timestamp in seconds, as a float

// ............... your function executions here .............

var_dump( microtime(true) - $start ); // elapsed time in seconds

Solution 5

Depends on what you're doing. For complex patterns, just go with the preg_* functions. If you only need simple substitutions or searches, go with other, more specific functions like str_replace(), strpos(), strstr()...
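
To make that a bit more concrete, here is a small sketch of a few simple checks written both ways, plus one case where a regex is the natural fit. The sample strings and variable names are invented for illustration only.

```php
<?php
$haystack = 'The quick brown fox';

// Does the string contain "quick"?
$viaRegex  = preg_match('/quick/', $haystack) === 1;
$viaStrpos = strpos($haystack, 'quick') !== false; // haystack first, needle second

// Everything from the first occurrence of "brown" onwards.
preg_match('/brown.*/s', $haystack, $m);
$tailViaRegex  = $m[0];
$tailViaStrstr = strstr($haystack, 'brown');

// Replace a fixed substring.
$replacedViaRegex = preg_replace('/fox/', 'dog', $haystack);
$replacedViaStr   = str_replace('fox', 'dog', $haystack);

// A pattern rather than a fixed string: collapse runs of whitespace.
// This is the kind of job where preg_replace is the simpler choice.
$collapsed = preg_replace('/\s+/', ' ', "too    many\t\tspaces");
```

Note the !== false on strpos(): a match at position 0 evaluates as falsy, so a loose comparison would wrongly report "not found".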

The web is full of discussions, like http://www.simplemachines.org/community/index.php?topic=175031.0

Author by

Jasko Koyn

Updated on September 16, 2022

Comments

  • Jasko Koyn over 1 year

    I've been coding in PHP for a while and I keep reading that you should only use preg_match and preg_replace when you have to, because they slow down performance. Why is this? Would it really be bad to use 20 preg_match calls in one file instead of using another PHP function?

    • Marc B over 11 years
      regexes have to be compiled, strings parsed, etc... nothing WRONG with using a regex, but a lot of people abuse them by doing silly things like preg_match('/foo/', $bar) instead of strpos($bar, 'foo') !== false
    • SDC over 11 years
      The answer is: it depends on what "other PHP function" you had in mind. Some cases may be faster, others not. Also, speed is not always the most important factor. Regex may be the best tool for the job regardless of speed, or it may be the wrong tool for the job even if it runs quicker.
    • Jeyz Strife about 4 years
      In my case, this is so helpful as I use around 30+ preg_replace() before I render a page. I managed to cache my pages so I don't always have to iterate.