PHP String Concatenation, Performance

54,034

Solution 1

No, there is no StringBuilder-type class in PHP, since strings are mutable.

That being said, there are different ways of building a string, depending on what you're doing.

echo, for example, will accept comma-separated tokens for output.

// This...
echo 'one', 'two';

// Is the same as this
echo 'one';
echo 'two';

What this means is that you can output a complex string without actually using concatenation, which would be slower:

// This...
echo 'one', 'two';

// Is faster than this...
echo 'one' . 'two';

If you need to capture this output in a variable, you can do that with the output buffering functions.
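
A minimal sketch of that capture, assuming the standard output-buffering functions ob_start() and ob_get_clean(); the variable name $captured is only illustrative:

```php
<?php
// Start buffering: anything echoed is collected instead of being sent out.
ob_start();
echo 'one', 'two';
// ob_get_clean() returns the buffer contents and ends buffering.
$captured = ob_get_clean();

echo $captured; // prints "onetwo"
```

Buffering is probably most useful when the output comes from code you cannot easily change to return a string instead.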

Also, PHP's array performance is really good. If you want to build something like a comma-separated list of values, just use implode():

$values = array( 'one', 'two', 'three' );
$valueList = implode( ', ', $values );

Lastly, make sure you familiarize yourself with PHP's string type and its different delimiters, and the implications of each.
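
As a rough illustration of those delimiters (the variable names are made up for this sketch), the four string syntaxes differ mainly in whether variables and escape sequences are interpreted:

```php
<?php
$name = 'world';

// Single quotes: no variable interpolation, and \n stays two literal characters.
$single = 'Hello $name\n';

// Double quotes: variables and escapes such as \n are interpreted.
$double = "Hello $name\n";

// Heredoc behaves like a double-quoted string that can span several lines.
$heredoc = <<<EOT
Hello $name
EOT;

// Nowdoc (note the quoted identifier) behaves like a single-quoted string.
$nowdoc = <<<'EOT'
Hello $name
EOT;

echo $single, PHP_EOL;  // prints: Hello $name\n
echo $double;           // prints: Hello world (followed by a newline)
echo $heredoc, PHP_EOL; // prints: Hello world
echo $nowdoc, PHP_EOL;  // prints: Hello $name
```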

Solution 2

I was curious about this, so I ran a test. I used the following code:

<?php
ini_set('memory_limit', '1024M');
define ('CORE_PATH', '/Users/foo');
define ('DS', DIRECTORY_SEPARATOR);

$numtests = 1000000;

function test1($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = sprintf('%s%sDesktop%sjunk.php', $CORE_PATH, $DS, $DS);
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 1: sprintf()\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test2($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = $CORE_PATH . $DS . 'Desktop' . $DS . 'junk.php';
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 2: Concatenation\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test3($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        ob_start();
        echo $CORE_PATH,$DS,'Desktop',$DS,'junk.php';
        $aa = ob_get_contents();
        ob_end_clean();
        $a[] = $aa;
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 3: Buffering Method\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test4($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 4: Braced in-line variables\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test5($numtests)
{
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $CORE_PATH = CORE_PATH;
        $DS = DIRECTORY_SEPARATOR;
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 5: Braced inline variables with loop-level assignments\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

test1($numtests);
test2($numtests);
test3($numtests);
test4($numtests);
test5($numtests);

... and got the following results, which were attached as an image. Clearly, sprintf() is the least efficient way to do it, both in terms of time and memory consumption.
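
The five test functions above repeat the same timing scaffold. One way to factor it out is sketched below; the benchmark() helper and its return format are hypothetical (not part of the original test), and the sketch assumes PHP 7 or later for the type declarations. Calling a closure on every iteration adds overhead to all variants, so absolute numbers will differ from the original functions.

```php
<?php
// Hypothetical helper: runs $build $numtests times, storing each result,
// then reports elapsed time and memory growth, as the tests above do.
function benchmark(string $label, callable $build, int $numtests): array
{
    $a = [];
    $startmem = memory_get_usage();
    $start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = $build();
    }
    $timeused = microtime(true) - $start;
    $memused = memory_get_usage() - $startmem;

    echo "{$label}\nTIME: {$timeused}\nMEMORY: {$memused}\n\n";
    return ['time' => $timeused, 'memory' => $memused, 'count' => count($a)];
}

$CORE_PATH = '/Users/foo';
$DS = DIRECTORY_SEPARATOR;

// Same expression as test2, wrapped in a closure.
$result = benchmark('Concatenation', function () use ($CORE_PATH, $DS) {
    return $CORE_PATH . $DS . 'Desktop' . $DS . 'junk.php';
}, 1000);
```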

Solution 3

A StringBuilder analog is not needed in PHP.

I made a couple of simple tests:

In PHP:

$iterations = 10000;
$stringToAppend = 'TESTSTR';
$timer = new Timer(); // based on microtime()
$s = '';
for($i = 0; $i < $iterations; $i++)
{
    $s .= ($i . $stringToAppend);
}
$timer->VarDumpCurrentTimerValue();

$timer->Restart();

// Used purlogic's implementation.
// I tried other implementations, but they are not faster
$sb = new StringBuilder(); 

for($i = 0; $i < $iterations; $i++)
{
    $sb->append($i);
    $sb->append($stringToAppend);
}
$ss = $sb->toString();
$timer->VarDumpCurrentTimerValue();
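
The Timer class used in the PHP test is not shown in the answer; only its basis on microtime() is mentioned. A guess at what such a helper might look like follows. The method names Restart() and VarDumpCurrentTimerValue() come from the test code, while Elapsed() and the millisecond output format are assumptions made for this sketch.

```php
<?php
// Hypothetical Timer: an approximation of the helper used above,
// not the answer author's actual implementation.
class Timer
{
    private $start;

    public function __construct()
    {
        $this->Restart();
    }

    // Reset the reference point to "now".
    public function Restart()
    {
        $this->start = microtime(true);
    }

    // Seconds elapsed since construction or the last Restart() (assumed helper).
    public function Elapsed()
    {
        return microtime(true) - $this->start;
    }

    // Dump the elapsed time, converted to milliseconds.
    public function VarDumpCurrentTimerValue()
    {
        var_dump(round($this->Elapsed() * 1000, 3) . ' ms');
    }
}
```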

In C# (.NET 4.0):

const int iterations = 10000;
const string stringToAppend = "TESTSTR";
string s = "";
var timer = new Timer(); // based on Stopwatch

for(int i = 0; i < iterations; i++)
{
    s += (i + stringToAppend);
}

timer.ShowCurrentTimerValue();

timer.Restart();

var sb = new StringBuilder();

for(int i = 0; i < iterations; i++)
{
    sb.Append(i);
    sb.Append(stringToAppend);
}

string ss = sb.ToString();

timer.ShowCurrentTimerValue();

Results:

10000 iterations:
1) PHP, ordinary concatenation: ~6ms
2) PHP, using StringBuilder: ~5ms
3) C#, ordinary concatenation: ~520ms
4) C#, using StringBuilder: ~1ms

100000 iterations:
1) PHP, ordinary concatenation: ~63ms
2) PHP, using StringBuilder: ~555ms
3) C#, ordinary concatenation: ~91000ms // !!!
4) C#, using StringBuilder: ~17ms

Solution 4

When you do a timed comparison, the differences are so small that they aren't very relevant. It would make more sense to go for the choice that makes your code easier to read and understand.

Solution 5

I know what you're talking about. I just created this simple class to emulate the Java StringBuilder class.

class StringBuilder {

  private $str = array();

  public function __construct() { }

  public function append($str) {
    $this->str[] = $str;
  }

  public function toString() {
    return implode($this->str);
  }

}
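
One commenter suggests returning $this from append() so that calls can be chained. A sketch of that variation is below; the class name ChainableStringBuilder is invented here to keep it distinct from the class above.

```php
<?php
// Variation on the class above: append() returns the object itself,
// which allows $sb->append(...)->append(...).
class ChainableStringBuilder
{
    private $parts = array();

    public function append($str)
    {
        $this->parts[] = $str;
        return $this;
    }

    public function toString()
    {
        // Join all collected pieces with an empty separator.
        return implode('', $this->parts);
    }
}

$sb = new ChainableStringBuilder();
echo $sb->append('one')->append('two')->toString(); // prints "onetwo"
```

Going by the measurements in Solution 3, a wrapper like this is a convenience rather than an optimization: plain .= concatenation was faster there at 100000 iterations.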
Author by

Chris

Updated on July 05, 2022

Comments

  • Chris
    Chris almost 2 years

    In languages like Java and C#, strings are immutable and it can be computationally expensive to build a string one character at a time. In said languages, there are library classes to reduce this cost such as C# System.Text.StringBuilder and Java java.lang.StringBuilder.

    Does PHP (4 or 5; I'm interested in both) share this limitation? If so, are there similar solutions to the problem available?

  • paan
    paan over 15 years
    People here are quick on the trigger. I was typing in the dark and accidentally hit tab, then enter.
  • DGM
    DGM over 15 years
    Indeed, worrying about this is just outright silly, when there are usually far more important issues to worry about, like database design, big O() analysis, and proper profiling.
  • user3319401
    user3319401 over 15 years
    does this function work this way? $newstring = str1.srt2.str3; echo $newstring;
  • Pete Alvin
    Pete Alvin over 13 years
    That is very true, but I HAVE seen situations in Java and C# where using a mutable string class (vs. s += "blah") has indeed increased performance dramatically.
  • Jabba
    Jabba over 13 years
    Nice solution. At the end of the append function you can add return $this; to allow method chaining: $sb->append("one")->append("two");.
  • Stephen
    Stephen over 13 years
    And use single-quotes whenever possible.
  • ryeguy
    ryeguy about 13 years
    This is completely unnecessary in PHP. In fact, I'm willing to bet that this is significantly slower than doing regular concatenation.
  • ossys
    ossys almost 13 years
    ryeguy: true, being that strings are mutable in PHP this method is "unnecessary", the person asked for a similar implementation to Java's StringBuilder, so here you go... I wouldn't say it's "significantly" slower, I think you're being a little dramatic. The overhead of instantiating a class that manages the string building may include costs, but the usefulness of the StringBuilder class can be expanded to include additional methods on the string. I'll look into what additional overhead is realized by implementing something like this in a class and try to post back.
  • Wolfgang Adamec
    Wolfgang Adamec about 11 years
    I'm no expert in php. Is "$string .= 'a'" not a short form of "$string = $string . 'a'" and is php not creating a new string (and not changing the old one)?
  • Tebe
    Tebe over 10 years
    why not double quotes?
  • samitny
    samitny over 10 years
    @gekannt Because PHP expands/interprets variables as well as extra escape sequences in strings that are enclosed in double quotes. For example, $x = 5; echo "x = $x"; would print x = 5 while $x = 5; echo 'x = $x'; would print x = $x.
  • Tebe
    Tebe over 10 years
    one can need it to be expanded as well as not to be expanded/interpret, it depends upon the situation
  • Nigralbus
    Nigralbus over 10 years
    ... and he was never heard from again.
  • Raptor
    Raptor over 10 years
    should have 1 more test: similar to test2 but replace . with , (without output buffer, of course)
  • A.Grandt
    A.Grandt over 10 years
    Java is more or less the same as C# in this. Though the later versions have done some optimization at compile time to help alleviate this. It used to be the case (in 1.4 and earlier, maybe even in 1.6) that if you have 3 or more elements to concatenate, you were better off using a StringBuffer/Builder. Though in a loop, you still need to use the StringBuilder.
  • User
    User about 10 years
    Bit of a myth, the single quote thing: nikic.github.io/2012/01/09/…
  • Peter Bailey
    Peter Bailey about 10 years
    Good info, @alimack, but for the record, this answer isn't about single vs double quotes nor is it about concatenation vs interpolation. It's about using echo with parameterized tokens vs concatenated tokens.
  • Chris Middleton
    Chris Middleton over 9 years
    Very useful, thank you. String concatenation appears to be the way to go. It makes sense that they'd try and optimize the hell out of that.
  • thomasrutter
    thomasrutter about 7 years
    In other words, PHP was designed for people who don't want to have to worry about low level considerations and it does string buffering internally on the string type. This is not to do with strings being "mutable" on PHP; growing a string's length still requires a memory copy to a larger piece of memory unless you maintain a buffer for it to grow into.
  • thomasrutter
    thomasrutter about 7 years
    Yes it is a short form. But to your second question, PHP's internal behaviour is such that it's effectively like replacing the string with one that's a byte longer. Internally though, it does buffering like StringBuilder.
  • thomasrutter
    thomasrutter about 7 years
    BTW this should be the accepted answer. The current top answers don't even actually answer the question.
  • Denys Klymenko
    Denys Klymenko over 6 years
    Please always use echo with the dot operator, and double quotes where they increase code readability. Such optimizations are evil.
  • Kyouma
    Kyouma over 3 years
    Thank you for this. array_push was 100x faster than concatenating strings in my code.
  • Lucas Bustamante
    Lucas Bustamante about 3 years
    This kind of performance optimization is important when you have to manipulate a string with hundreds of thousands of characters in a while loop that breaks only when PHP runs out of execution time or memory, which was my case.