Performance of FOR vs FOREACH in PHP

121,828

Solution 1

My personal opinion is to use what makes sense in the context. Personally I almost never use for for array traversal. I use it for other types of iteration, but foreach is just too easy... The time difference is going to be minimal in most cases.

The big thing to watch for is:

for ($i = 0; $i < count($array); $i++) {

That's an expensive loop, since it calls count on every single iteration. So long as you're not doing that, I don't think it really matters...
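A minimal sketch of the usual fix, assuming a plain list-style array; the names `$array`, `$n`, `$sum` and `$sum2` are illustrative only. The count is taken once instead of on every pass through the loop condition:

```php
<?php
$array = range(1, 5);

// Evaluate count() once, before the loop, instead of in the condition.
$n = count($array);
$sum = 0;
for ($i = 0; $i < $n; $i++) {
    $sum += $array[$i];
}

// The same idea with the count cached in the loop initializer.
$sum2 = 0;
for ($i = 0, $n = count($array); $i < $n; $i++) {
    $sum2 += $array[$i];
}

echo $sum, " ", $sum2, "\n"; // prints "15 15"
```

Both forms call count() exactly once per traversal, which is the property the paragraph above is asking for.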

As for the reference making a difference, PHP uses copy-on-write, so if you don't write to the array, there will be relatively little overhead while looping. However, if you start modifying the array within the loop, that's where you'll start seeing differences between them (since one will need to copy the entire array, and the reference can just modify inline)...
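To make the by-value versus by-reference distinction concrete, here is a small sketch (the variable names are made up for illustration). It only shows the observable behavior, not the engine's internal copying:

```php
<?php
$original = [1, 2, 3];

// By value: $v is a copy of each element, so writing to it leaves the array alone.
$byValue = $original;
foreach ($byValue as $v) {
    $v = $v * 10;
}

// By reference: $v aliases each element, so the write lands in the array itself.
$byRef = $original;
foreach ($byRef as &$v) {
    $v = $v * 10;
}
unset($v); // break the reference so later writes to $v do not touch the last element

print_r($byValue); // still 1, 2, 3
print_r($byRef);   // now 10, 20, 30
```

Note that `$original` is untouched in both cases: assigning it to `$byRef` shares the data until the first write, at which point PHP separates the two arrays.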

As for the iterators, foreach is equivalent to:

$it->rewind();
while ($it->valid()) {
    $key = $it->key();     // If using the $key => $value syntax
    $value = $it->current();

    // Contents of loop in here

    $it->next();
}
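As a quick check of that equivalence, the sketch below walks an `ArrayIterator` both ways and compares the results. It assumes an object implementing `Iterator`; plain arrays are traversed by the engine directly rather than through these method calls.

```php
<?php
$it = new ArrayIterator(['a' => 1, 'b' => 2, 'c' => 3]);

// Manual traversal using the Iterator methods directly.
$manual = [];
$it->rewind();
while ($it->valid()) {
    $manual[$it->key()] = $it->current();
    $it->next();
}

// The same traversal written with foreach.
$viaForeach = [];
foreach ($it as $key => $value) {
    $viaForeach[$key] = $value;
}

var_dump($manual === $viaForeach); // bool(true)
```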

As far as there being faster ways to iterate, it really depends on the problem. But I really need to ask, why? I understand wanting to make things more efficient, but I think you're wasting your time for a micro-optimization. Remember, Premature Optimization Is The Root Of All Evil...

Edit: Based upon the comment, I decided to do a quick benchmark run...

$a = array();
for ($i = 0; $i < 10000; $i++) {
    $a[] = $i;
}

$start = microtime(true);
foreach ($a as $k => $v) {
    $a[$k] = $v + 1;
}
echo "Completed in ", microtime(true) - $start, " Seconds\n";

$start = microtime(true);
foreach ($a as $k => &$v) {
    $v = $v + 1;
}
echo "Completed in ", microtime(true) - $start, " Seconds\n";

$start = microtime(true);
foreach ($a as $k => $v) {}
echo "Completed in ", microtime(true) - $start, " Seconds\n";

$start = microtime(true);
foreach ($a as $k => &$v) {}    
echo "Completed in ", microtime(true) - $start, " Seconds\n";

And the results:

Completed in 0.0073502063751221 Seconds
Completed in 0.0019769668579102 Seconds
Completed in 0.0011849403381348 Seconds
Completed in 0.00111985206604 Seconds

So if you're modifying the array in the loop, it's several times faster to use references...

And the overhead for just the reference is actually less than copying the array (this is on 5.3.2)... So it appears (on 5.3.2 at least) as if references are significantly faster...

EDIT: Using PHP 8.0 I got the following:

Completed in 0.0005030632019043 Seconds
Completed in 0.00066304206848145 Seconds
Completed in 0.00016379356384277 Seconds
Completed in 0.00056815147399902 Seconds

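I repeated this test numerous times and the ranking of the results was consistent: on PHP 8.0, by-value iteration came out ahead of references in both the modifying and the empty loops.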
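Single timings like the ones above are noisy, so here is a hedged sketch of a slightly more robust harness. The helper `median_time()` is hypothetical (not part of PHP), and it relies on `hrtime()`, which requires PHP 7.3 or later. It runs each loop several times and reports the median:

```php
<?php
// Hypothetical helper: runs $fn $runs times and returns the median time in seconds.
function median_time(callable $fn, int $runs = 21): float
{
    $times = [];
    for ($r = 0; $r < $runs; $r++) {
        $start = hrtime(true);                     // monotonic clock, in nanoseconds
        $fn();
        $times[] = (hrtime(true) - $start) / 1e9;  // convert to seconds
    }
    sort($times);
    return $times[intdiv(count($times), 2)];
}

$data = range(0, 9999);

// $data is captured by value, so every run starts from the same unmodified array.
$byValue = median_time(function () use ($data) {
    foreach ($data as $k => $v) {
        $data[$k] = $v + 1;
    }
});

$byRef = median_time(function () use ($data) {
    foreach ($data as $k => &$v) {
        $v = $v + 1;
    }
    unset($v);
});

printf("by value:     %.6f s\n", $byValue);
printf("by reference: %.6f s\n", $byRef);
```

The absolute numbers will differ by machine and PHP version; only the relative ranking on one setup is meaningful.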

Solution 2

I'm not sure this is so surprising. Most people who code in PHP are not well versed in what PHP is actually doing at the bare metal. I'll state a few things, which will be true most of the time:

  1. If you're not modifying the variable, by-value is faster in PHP. This is because it's reference counted anyway and by-value gives it less to do. It knows the second you modify that ZVAL (PHP's internal data structure for most types), it will have to break it off in a straightforward way (copy it and forget about the other ZVAL). But you never modify it, so it doesn't matter. References make that more complicated, with more bookkeeping it has to do to know what to do when you modify the variable. So if you're read-only, paradoxically it's better not to point that out with the &. I know, it's counterintuitive, but it's also true.

  2. Foreach isn't slow. And for simple iteration, the condition it's testing against — "am I at the end of this array" — is done using native code, not PHP opcodes. Even if it's APC cached opcodes, it's still slower than a bunch of native operations done at the bare metal.

  3. Using a for loop "for ($i=0; $i < count($x); $i++)" is slow because of the count(), and because PHP (or really any interpreted language) cannot evaluate at parse time whether anything modifies the array. This prevents it from evaluating the count once.

  4. But even once you fix it with "$c=count($x); for ($i=0; $i<$c; $i++)", the $i<$c is a bunch of Zend opcodes at best, as is the $i++. In the course of 100000 iterations, this can matter. Foreach knows at the native level what to do. No PHP opcodes are needed to test the "am I at the end of this array" condition.

  5. What about the old school "while(list(" stuff? Well, using each(), current(), etc. are all going to involve at least 1 function call, which isn't slow, but not free. Yes, those are PHP opcodes again! So while + list + each has its costs as well.

For these reasons foreach is understandably the best option for simple iteration.

And don't forget, it's also the easiest to read, so it's win-win.
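On point 5: `each()` was deprecated in PHP 7.2 and removed in PHP 8.0, so the `while (list(...) = each(...))` idiom no longer runs on current versions. A rough sketch of the same internal-pointer traversal using `reset()`, `key()`, `current()` and `next()` (the array contents are made up for illustration):

```php
<?php
$arr = ['x' => 10, 'y' => 20, 'z' => 30];

$seen = [];
reset($arr);                          // move the internal pointer to the first element
while (($key = key($arr)) !== null) { // key() returns null once the pointer is past the end
    $seen[$key] = current($arr);
    next($arr);                       // advance the internal pointer
}

var_dump($seen === $arr); // bool(true)
```

Every step here is a separate function call, which is the cost the list above describes; foreach does the same walk without them.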

Solution 3

One thing to watch out for in benchmarks (especially phpbench.com) is that even though the numbers are sound, the tests are not. A lot of the tests on phpbench.com do things that are trivial and abuse PHP's ability to cache array lookups to skew benchmarks, or, in the case of iterating over an array, don't actually test it in real-world cases (no one writes empty for loops). I've done my own benchmarks that I've found are fairly reflective of real-world results, and they always show the language's native iterating syntax, foreach, coming out on top (surprise, surprise).

//make a nicely random array
$aHash1 = range( 0, 999999 );
$aHash2 = range( 0, 999999 );
shuffle( $aHash1 );
shuffle( $aHash2 );
$aHash = array_combine( $aHash1, $aHash2 );


$start1 = microtime(true);
foreach($aHash as $key=>$val) $aHash[$key]++;
$end1 = microtime(true);

$start2 = microtime(true);
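// Note: each() was deprecated in PHP 7.2 and removed in PHP 8.0; this loop only runs on older versions.
while(list($key) = each($aHash)) $aHash[$key]++;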
$end2 = microtime(true);


$start3 = microtime(true);
$key = array_keys($aHash);
$size = sizeOf($key);
for ($i=0; $i<$size; $i++) $aHash[$key[$i]]++;
$end3 = microtime(true);

$start4 = microtime(true);
foreach($aHash as &$val) $val++;
$end4 = microtime(true);

echo "foreach ".($end1 - $start1)."\n"; //foreach 0.947947025299
echo "while ".($end2 - $start2)."\n"; //while 0.847212076187
echo "for ".($end3 - $start3)."\n"; //for 0.439476966858
echo "foreach ref ".($end4 - $start4)."\n"; //foreach ref 0.0886030197144

//For these tests we MUST do an array lookup,
//since that is normally the *point* of iteration.
//I'm also calling noop on it so that PHP doesn't
//optimize out the lookup.
function noop( $value ) {}

//Create an array of increasing indexes, w/ random values
$bHash = range( 0, 999999 );
shuffle( $bHash );

$bstart1 = microtime(true);
for($i = 0; $i < 1000000; ++$i) noop( $bHash[$i] );
$bend1 = microtime(true);

$bstart2 = microtime(true);
$i = 0; while($i < 1000000) { noop( $bHash[$i] ); ++$i; }
$bend2 = microtime(true);


$bstart3 = microtime(true);
foreach( $bHash as $value ) { noop( $value ); }
$bend3 = microtime(true);

echo "for ".($bend1 - $bstart1)."\n"; //for 0.397135972977
echo "while ".($bend2 - $bstart2)."\n"; //while 0.364789962769
echo "foreach ".($bend3 - $bstart3)."\n"; //foreach 0.346374034882

Solution 4

It's 2020 and things have evolved greatly with PHP 7.4 and OPcache.

Here is the OP's benchmark, run from the Unix CLI, without the echo and HTML parts.

The test was run locally on a regular computer.

php -v

PHP 7.4.6 (cli) (built: May 14 2020 10:02:44) ( NTS )

Modified benchmark script:

<?php 
 ## preparations; just a simple environment state

  $test_iterations = 100;
  $test_arr_size = 1000;

  // a shared function that makes use of the loop; this should
  // ensure no funny business is happening to fool the test
  function test($input)
  {
    //echo '<!-- '.trim($input).' -->';
  }

  // for each test we create a separate array; this should keep any of the
  // array's internal representation or optimizations from getting
  // in the way.

  // normal array
  $test_arr1 = array();
  $test_arr2 = array();
  $test_arr3 = array();
  // hash tables
  $test_arr4 = array();
  $test_arr5 = array();

  for ($i = 0; $i < $test_arr_size; ++$i)
  {
    mt_srand();
    $hash = md5(mt_rand());
    $key = substr($hash, 0, 5).$i;

    $test_arr1[$i] = $test_arr2[$i] = $test_arr3[$i] = $test_arr4[$key] = $test_arr5[$key]
      = $hash;
  }

  ## foreach

  $start = microtime(true);
  for ($j = 0; $j < $test_iterations; ++$j)
  {
    foreach ($test_arr1 as $k => $v)
    {
      test($v);
    }
  }
  echo 'foreach '.(microtime(true) - $start)."\n";  

  ## foreach (using reference)

  $start = microtime(true);
  for ($j = 0; $j < $test_iterations; ++$j)
  {
    foreach ($test_arr2 as &$value)
    {
      test($value);
    }
  }
  echo 'foreach (using reference) '.(microtime(true) - $start)."\n";

  ## for

  $start = microtime(true);
  for ($j = 0; $j < $test_iterations; ++$j)
  {
    $size = count($test_arr3);
    for ($i = 0; $i < $size; ++$i)
    {
      test($test_arr3[$i]);
    }
  }
  echo 'for '.(microtime(true) - $start)."\n";  

  ## foreach (hash table)

  $start = microtime(true);
  for ($j = 0; $j < $test_iterations; ++$j)
  {
    foreach ($test_arr4 as $k => $v)
    {
      test($v);
    }
  }
  echo 'foreach (hash table) '.(microtime(true) - $start)."\n";

  ## for (hash table)

  $start = microtime(true);
  for ($j = 0; $j < $test_iterations; ++$j)
  {
    $keys = array_keys($test_arr5);
    $size = sizeOf($test_arr5);
    for ($i = 0; $i < $size; ++$i)
    {
      test($test_arr5[$keys[$i]]);
    }
  }
  echo 'for (hash table) '.(microtime(true) - $start)."\n";

Output:

foreach 0.0032877922058105
foreach (using reference) 0.0029420852661133
for 0.0025191307067871
foreach (hash table) 0.0035080909729004
for (hash table) 0.0061779022216797

As you can see, the evolution is insane: about 560 times faster than reported in 2012.

On my machines and servers, following my numerous experiments, basic for loops are the fastest. This is even clearer with nested loops ($i $j $k..)

It is also the most flexible in usage, and is more readable, in my view.

Author by

srcspider

Updated on March 16, 2022

Comments

  • srcspider
    srcspider about 2 years

    First of all, I understand in 90% of applications the performance difference is completely irrelevant, but I just need to know which is the faster construct. That and...

    The information currently available on them on the net is confusing. A lot of people say foreach is bad, but technically it should be faster since it's supposed to simplify writing an array traversal using iterators. Iterators, which are again supposed to be faster, but in PHP are also apparently dead slow (or is this not a PHP thing?). I'm talking about the array functions: next() prev() reset() etc. well, if they are even functions and not one of those PHP language features that look like functions.

    To narrow this down a little: I'm not interested in traversing arrays in steps of anything more than 1 (no negative steps either, i.e. reverse iteration). I'm also not interested in a traversal to and from arbitrary points, just 0 to length. I also don't see manipulating arrays with more than 1000 keys happening on a regular basis, but I do see an array being traversed multiple times in the logic of an application! Also as for operations, largely only string manipulation and echo'ing.

    Here are few reference sites:
    http://www.phpbench.com/
    http://www.php.lt/benchmark/phpbench.php

    What I hear everywhere:

    • foreach is slow, and thus for/while is faster
    • PHP's foreach copies the array it iterates over; to make it faster you need to use references
    • code like this: $key = array_keys($aHash); $size = sizeOf($key);
      for ($i=0; $i < $size; $i++)
      is faster than a foreach

    Here's my problem. I wrote this test script: http://pastebin.com/1ZgK07US and no matter how many times I run the script, I get something like this:

    foreach 1.1438131332397
    foreach (using reference) 1.2919359207153
    for 1.4262869358063
    foreach (hash table) 1.5696921348572
    for (hash table) 2.4778981208801
    

    In short:

    • foreach is faster than foreach with reference
    • foreach is faster than for
    • foreach is faster than for for a hash table

    Can someone explain?

    1. Am I doing something wrong?
    2. Is PHP foreach reference thing really making a difference? I mean why would it not copy it if you pass by reference?
    3. What's the equivalent iterator code for the foreach statement; I've seen a few on the net but each time I test them the timing is way off; I've also tested a few simple iterator constructs but never seem to get even decent results -- are the array iterators in PHP just awful?
    4. Are there faster ways/methods/constructs to iterate through an array other than FOR/FOREACH (and WHILE)?

    PHP Version 5.3.0


    Edit: Answer. With help from people here I was able to piece together the answers to all the questions. I'll summarize them here:
    1. "Am I doing something wrong?" The consensus seems to be: yes, I can't use echo in benchmarks. Personally, I still don't see how echo is some function with random time of execution or how any other function is somehow any different -- that and the ability of that script to just generate the exact same results of foreach better than everything is hard to explain through just "you're using echo" (well, what should I have been using?). However, I concede the test should be done with something better; though an ideal compromise does not come to mind.
    2. "Is PHP foreach reference thing really making a difference? I mean why would it not copy it if you pass by reference?" ircmaxell shows that yes it is, further testing seems to prove in most cases reference should be faster -- though given my above snippet of code, most definitely doesn't mean all. I accept the issue is probably too non-intuitive to bother with at such a level and would require something extreme such as decompiling to actually determine which is better for each situation.
    3. "What's the equivalent iterator code for the foreach statement; I've seen a few on the net but each time I test them the timing is way off; I've also tested a few simple iterator constructs but never seem to get even decent results -- are the array iterators in PHP just awful?" ircmaxell provided the answer below; though the code might only be valid for PHP version >= 5
    4. "Are there faster ways/methods/constructs to iterate through an array other than FOR/FOREACH (and WHILE)?" Thanks go to Gordon for the answer. Using new data types in PHP5 should give either a performance boost or memory boost (either of which might be desirable depending on your situation). While speed-wise a lot of the new types of array don't seem to be better than array(), SplPriorityQueue and SplObjectStorage do seem to be substantially faster. Link provided by Gordon: http://matthewturland.com/2010/05/20/new-spl-features-in-php-5-3/

    I'll likely stick to foreach (the non-reference version) for any simple traversal.

    • Mchl
      Mchl over 13 years
      Rule 2.71 of benchmarking: don't echo to benchmark.
    • bcosca
      bcosca over 13 years
      foreach with reference must be benchmarked against for with reference. you have a flawed conclusion there. any use of a reference is obviously going to be slower than that without a reference, even in a do-while loop.
    • Gordon
      Gordon over 13 years
      Since this is for php 5.3, you might also want to consider testing the new Spl Data Types vs Arrays. Or just look here: matthewturland.com/2010/05/20/new-spl-features-in-php-5-3
    • srcspider
      srcspider over 13 years
      @ Mchl: I ran it a few times, and got the same results -- if echo corrupts the benchmark then shouldn't I get completely random results? also I would want to iterate through something and output it so echo is actually really important for me; if foreach is faster when echo'ing then that's a large chunk of code where I should use foreach. @ stillstanding: what I'm hearing is basically along the lines of "reference in foreach makes faster (always), always write with reference", that's why I tested like that -- I'm not really interested in comparison with other reference loops
    • Your Common Sense
      Your Common Sense over 13 years
      these empty questions should naturally be banned. as well as that deceiving phpbench site
  • srcspider
    srcspider over 13 years
    Don't you mean "[Not-planned] optimization is the root of all evil"? ;) Well thing is all of them do the same thing, so it's not so much an optimization as it is: which is "the better standard way to adopt." Also some more questions unanswered: You say because it doesn't have to copy, but isn't the use of reference an overhead too? stillstanding's comment in my question seems to disagree with your assumptions as well. Also, why is the code producing slower for reference there as well. Did the foreach change in 5.3.0 to convert any array() to an object (eg. SplFixedArray)?
  • ircmaxell
    ircmaxell over 13 years
    @srcspider: Edited answer with benchmark code and results showing references are indeed much faster than non-references...
  • Your Common Sense
    Your Common Sense over 13 years
    @srcspider "the better standard way to adopt." performance is not the only criteria to choose what to adopt. especially in such a farfetched case. Frankly, you're just wasting your time
  • ircmaxell
    ircmaxell over 13 years
    @Col. Shrapnel I agree 100%. Readability and maintainability trump performance by a large margin in this particular case... I agree about picking a standard and sticking with it, but base that standard upon other --more important-- factors...
  • srcspider
    srcspider over 13 years
    @ircmaxell: quickly running your script seems to prove your point but I want to look into it a little further; I might edit my original question with more tests, including some of the new 5.3 features. @Col. Shrapnel: FOR is almost universal programming-kindergarten level, FOREACH is simpler syntax. As far as readability they seem to be on equal ground. This is all so low level I don't think maintenance is an issue as would be for some high level pattern. And I don't think I'm wasting my time since this "basic construct" would account for a lot of code I would write. :)
  • srcspider
    srcspider over 13 years
    @ircmaxell: I've done some more testing, but mostly it was inconclusive. Thank you for answering a good chunk of the questions. I've edited the first post to contain the answers to the other questions as well.
  • Loko
    Loko about 9 years
    Nice answer but I was wondering something. I used this before: $count_array=count($array); for ($i = 0; $i < $count_array; $i++) and I was wondering if it's really that bad? Is it like extremely bad to do this at my job or something?
  • hardsetting
    hardsetting over 8 years
    This is exactly the explanation I was looking for, thanks.
  • tobynew
    tobynew over 8 years
    @loko (i know, old post but for others' benefit) that's fine, as count isn't called on every iteration. Personally i use: for($i=0, $max = count($array); $i<$max; $i++) as count is only called once, and assigned to $max. This almost always runs faster than a foreach in my experience. it also seems to use less memory (no benchmarks though)
  • doz87
    doz87 over 7 years
    This answer should really be a supplement or summary to the marked answer. I'm glad I read it, good work.
  • suther
    suther over 7 years
    This "performance hint" isn't true for php7 (even if it seems to be true for php5.x). Most of the time, the "reference-version" is slower than the default one!!! See this: 3v4l.org/110a2#output
  • Sebastien F.
    Sebastien F. over 6 years
    Everybody has an opinion, people come to Stack Overflow to find answers. If you're not sure of what you state, check the source code, the documentation, do a google search, etc.
  • Marwan Salim
    Marwan Salim over 5 years
    Since performance is based on research and testing you should come out with some evidence. Please provide your references accordingly. Hopefully you can improve your answer.
  • Alexander Behling
    Alexander Behling about 4 years
    I think it also depends on the actual load of the server and what you want to do in the loop. I wanted to know whether I should use a foreach loop or a for loop when iterating over a numbered array, so I ran a benchmark on sandbox.onlinephpfunctions.com with PHP 7.4. I repeatedly ran the same script multiple times and every run gave me a different result. One time the for loop was faster, another time the foreach loop, and another time they were equal.
  • Sandun Perera
    Sandun Perera over 3 years
    I was searching for this. thank you for your answer. I think I agree with your explanation.
  • luenib
    luenib about 3 years
    I tested your code several times with 10000 and 50000 items on v7.3.5; by value always wins. With 100000, by value wins most of the time. Perhaps it is a version thing.
  • Evan Byrne
    Evan Byrne over 2 years
    The math here isn't quite right. Differences in hardware and potentially other aspects of the software environment aren't being accounted for which could contribute to the dramatic 560 times faster speedup between benchmarks. For a closer comparison, one thing you could try is running both PHP versions in Docker on the same hardware.
  • Evan Byrne
    Evan Byrne over 2 years
    There's nothing generally invalid about benchmarking relative performance between software versions within Docker. Running a benchmark with completely different hardware 11 years later to compare the relative speed of different versions of PHP is going to lead to incorrect measurements. These are objective facts that have been presented clearly and without insult, so there's no reason to resort to personal attacks.