How to fill an array efficiently in PowerShell

Solution 1

You can repeat arrays, just as you can do with strings:

$myArray = ,2 * $length

This means »Take the array with the single element 2 and repeat it $length times, yielding a new array.«.
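As a small illustration of the parallel with strings, the sketch below repeats a string, a two-element array, and the one-element array from the answer. The variable names ($text, $pattern, $repeated, $filled) are only chosen for this example.

```powershell
# Repeating a string concatenates copies of it.
$text = 'ab' * 3              # 'ababab'

# Repeating an array concatenates copies of its elements.
$pattern = 1, 2
$repeated = $pattern * 3      # 1 2 1 2 1 2

# The unary comma wraps the scalar 2 in a one-element array first,
# so the multiplication repeats an array instead of computing 2 * 4.
$length = 4
$filled = ,2 * $length        # 2 2 2 2

$text
$repeated -join ' '
$filled -join ' '
```

Without the leading comma, 2 * $length would simply be integer multiplication, which is why the comma matters here.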

Note that you cannot use this directly to create multidimensional (nested) arrays, because the following:

$some2darray = ,(,2 * 1000) * 1000

will just create 1000 references to the same inner array, so a change made through one row shows up in every row. In that case you can use a hybrid strategy. I have used

$some2darray = 1..1000 | ForEach-Object { ,(,2 * 1000) }

in the past, but the performance measurements below suggest that

$some2darray = foreach ($i in 1..1000) { ,(,2 * 1000) }

would be a much faster way.
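The shared-reference problem is easy to see on a small scale. The following sketch uses a 3x3 size and the illustrative names $shared and $independent, neither of which comes from the original answer.

```powershell
$size = 3

# Repeating the outer array copies the reference to one inner array.
$shared = ,(,2 * $size) * $size
$shared[0][0] = 9
$shared[1][0]                 # 9: row 1 is the same object as row 0

# Building each row inside a loop creates a fresh inner array per row.
$independent = foreach ($i in 1..$size) { ,(,2 * $size) }
$independent[0][0] = 9
$independent[1][0]            # 2: row 1 is unaffected
```

The unary comma inside the loop body keeps each inner array from being unrolled into the output, so every iteration contributes one row.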


Some performance measurements:

Command                                                  Average Time (ms)
-------                                                  -----------------
$a = ,2 * $length                                                 0.135902 # my own
[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)           7.15362 # JPBlanc
$a = foreach ($i in 1..$length) { 2 }                             14.54417
[int[]]$a = -split "2 " * $length                                24.867394
$a = for ($i = 0; $i -lt $length; $i++) { 2 }                    45.771122 # Ansgar
$a = 1..$length | %{ 2 }                                         431.70304 # JPBlanc
$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }       10425.79214 # original code

Measured by running each variant 50 times through Measure-Command, each with the same value for $length, and averaging the results.

Positions 3 and 5 are a bit of a surprise, actually. Apparently it is much faster to foreach over a range than to use a normal for loop.


Code to generate the table above:

$length = 16384

$tests = '$a = ,2 * $length',
         '[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)',
         '$a = for ($i = 0; $i -lt $length; $i++) { 2 }',
         '$a = foreach ($i in 1..$length) { 2 }',
         '$a = 1..$length | %{ 2 }',
         '$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }',
         '[int[]]$a = -split "2 " * $length'

$tests | ForEach-Object {
    $cmd = $_
    $timings = 1..50 | ForEach-Object {
        Remove-Variable i,a -ErrorAction Ignore
        [GC]::Collect()
        Measure-Command { Invoke-Expression $cmd }
    }
    [pscustomobject]@{
        Command = $cmd
        'Average Time (ms)' = ($timings | Measure-Object -Average TotalMilliseconds).Average
    }
} | Sort-Object Ave* | Format-Table -AutoSize -Wrap

Solution 2

Avoid appending to an array in a loop. Each += copies the existing array into a new, larger array, so the total work grows quadratically with the length. Do this instead:

$MyArray = for ($i=1; $i -le $length; $i++) { 2 }
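To see the copying behaviour directly, the following sketch keeps a second reference to an array before appending to it. The names $original and $alias are only illustrative.

```powershell
$original = 1, 2
$alias = $original            # second reference to the same array object

$original += 3                # allocates a new 3-element array and rebinds $original

$original.Count               # 3
$alias.Count                  # 2: the old array was not modified
[object]::ReferenceEquals($original, $alias)   # False
$alias.IsFixedSize            # True: .NET arrays cannot grow in place
```

Collecting the loop output in a single assignment, as in the line above, avoids this repeated reallocation because the array is built once when the loop finishes.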

Solution 3

With PowerShell 3.0 or later (requires .NET Framework 3.5 or later) you can use:

[int[]]$MyArray = ([System.Linq.Enumerable]::Repeat(2, 65000))
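One practical difference between this approach and plain array repetition is the element type of the result. The sketch below compares the two on a short length; $untyped and $typed are names picked for the example.

```powershell
$length = 5

# Array repetition produces a general-purpose object array.
$untyped = ,2 * $length

# The [int[]] constraint converts the repeated sequence into a typed array.
[int[]]$typed = [System.Linq.Enumerable]::Repeat(2, $length)

$untyped.GetType().FullName   # System.Object[]
$typed.GetType().FullName     # System.Int32[]
$typed.Length                 # 5
```

If a strongly typed array is needed, the same [int[]] constraint can also be applied to the result of ,2 * $length.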

With PowerShell 2.0:

$AnArray = 1..65000 | % {2}

Solution 4

It is not clear what you are trying to do. I looked at your code, but $myArray += 2 just appends 2 as a new element on each iteration. For example, here is the output from my test code:

$myArray = @()
$length = 4
for ($i=1;$i -le $length; $i++) {
    Write-Host $myArray
    $myArray += 2
}

2
2 2
2 2 2

Why do you need to add 2 as the array element so many times?

If all you want is to fill the array with the same value, try this:

$myArray = 1..$length | % { 2 }
Author by

nixda

Updated on June 06, 2022

Comments

  • nixda
    nixda almost 2 years

    I want to fill up a dynamic array with the same integer value as fast as possible using Powershell.
    The Measure-Command shows that it takes 7 seconds on my system to fill it up.
    My current code (snipped) looks like:

    $myArray = @()
    $length = 16385
    for ($i=1;$i -le $length; $i++) {$myArray += 2}  
    

    (Full code can be seen on gist.github.com or on superuser)

    Consider that $length can change. But for better understanding I chose a fixed length.

    Q: How do I speed up this Powershell code?

  • JPBlanc
    JPBlanc almost 11 years
    He is just filling the array with some value? The value is 2.
  • Ansgar Wiechers
    Ansgar Wiechers almost 11 years
    Question says he wants to fill the array with the same integer value. His problem is that appending to the array with += is terribly slow.
  • ravikanth
    ravikanth almost 11 years
    Hmm! I understood that. But why? Why even find a better way to do something that is not needed. Anyway, he can use range operator as well.
  • nixda
    nixda almost 11 years
    I already appended the full code as github link just to avoid discussions about Why. If you look at the link, you will see that my powershell executes an Excel command for querying a CSV. And the parameter TextFileColumnDataTypes for that query needs an Array to know what data types the columns should be. A 2 stands for a string column, 1 for general, 9 to skip the entire column and so on. So: Long story short: I need a big array with the integer value 2.
  • nixda
    nixda almost 11 years
    +1 $myArray = 1..16385 | % { 2 } runs in 0,02 seconds. much faster than my 7s :)
  • nixda
    nixda almost 11 years
    +1 $MyArray = for ($i=1; $i -le 16385; $i++) { 2 } runs in 0,05 seconds. much faster than my 7s :)
  • nixda
    nixda almost 11 years
    +1 [int[]]$myArray = ([System.Linq.Enumerable]::Repeat(2, 16385)) runs in 0,03s
  • Ansgar Wiechers
    Ansgar Wiechers almost 11 years
    When I tested 1..$length it was reproducibly slower than for ($i=1; $i -lt $length; $i++). Most likely because it's building a list before passing it into the ForEach-Object loop.
  • Michael Sorens
    Michael Sorens almost 11 years
    +1 Concise, clear, constructive, comprehensive, and reproducible! (well, 4 out of 5 c's...)
  • Lance U. Matthews
    Lance U. Matthews about 4 years
    Though this is faster than the code in the question, what is the purpose of using a 1-value Tuple? That means you have to access the Item1 property to get the value back, plus you're creating an Object to wrap every Int32, which will be a lot of garbage on larger lists. This doesn't hurt so bad because using the obsolete, non-generic ArrayList class means it'd be boxing each Int32 in an Object, anyways. Rewriting as $myArray = New-Object 'Collections.Generic.List[Int32]'; foreach($i in 1..$length) { $myArray.add(2) } I get a 40% speedup and with less characters/complexity, too.
  • Lance U. Matthews
    Lance U. Matthews about 4 years
    Also, each property of a Tuple is read-only, so if you want to change a list value (which is bound to happen because...what good is a list of repeated values that always stay the same?) your only option is to create a new Tuple to replace it.
  • Carsten
    Carsten about 4 years
    Even if the tuple part is not really needed for the above challenge, it is worth remembering for filling large read-only arrays with multiple columns/items per object. It is very handy for sorting very large arrays or lookup tables by different columns.