How to fill an array efficiently in PowerShell
Solution 1
You can repeat arrays, just as you can do with strings:
$myArray = ,2 * $length
This means »Take the array with the single element 2 and repeat it $length times, yielding a new array.«.
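A minimal sketch of the repetition operator, using only built-in PowerShell; the variable names and the small length are illustrative. The unary comma binds more tightly than *, so ,2 * $length is evaluated as (,2) * $length:

```powershell
$length = 5

# ,2 is a one-element array; multiplying it repeats its elements.
$myArray = ,2 * $length
$myArray.Count            # 5

# The same operator works on arrays with more than one element.
$pattern = (1, 2) * 3
$pattern -join ' '        # 1 2 1 2 1 2
```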
Note that you cannot really use this to create multidimensional arrays because the following:
$some2darray = ,(,2 * 1000) * 1000
will just create 1000 references to the inner array, making them useless for manipulation. In that case you can use a hybrid strategy. I have used
$some2darray = 1..1000 | ForEach-Object { ,(,2 * 1000) }
in the past, but the performance measurements below suggest that
$some2darray = foreach ($i in 1..1000) { ,(,2 * 1000) }
would be a much faster way.
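The difference between the two approaches can be checked with a small sketch (3x3 instead of 1000x1000, and the variable names are only illustrative):

```powershell
# Repeating the outer array copies references, so every row is the same object.
$shared = ,(,2 * 3) * 3
$shared[0][0] = 99
$shared[1][0]                                            # 99, row 1 changed too
[object]::ReferenceEquals($shared[0], $shared[1])        # True

# Building each row inside a loop creates a fresh inner array per iteration.
$independent = foreach ($i in 1..3) { ,(,2 * 3) }
$independent[0][0] = 99
$independent[1][0]                                       # 2, row 1 is untouched
[object]::ReferenceEquals($independent[0], $independent[1])   # False
```

The leading comma inside the loop body wraps each row in a one-element array, so the pipeline unrolls only the wrapper and the row itself survives as a single element.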
Some performance measurements:
Command                                                   Average Time (ms)
-------                                                   -----------------
$a = ,2 * $length                                         0,135902           # my own
[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)  7,15362            # JPBlanc
$a = foreach ($i in 1..$length) { 2 }                     14,54417
[int[]]$a = -split "2 " * $length                         24,867394
$a = for ($i = 0; $i -lt $length; $i++) { 2 }             45,771122          # Ansgar
$a = 1..$length | %{ 2 }                                  431,70304          # JPBlanc
$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }  10425,79214        # original code
Taken by running each variant 50 times through Measure-Command, each with the same value for $length, and averaging the results.
The results for the foreach and for loops (rows 3 and 5) are a bit of a surprise, actually. Apparently it's much better to foreach over a range than to use a normal for loop.
Code to generate the table above:
$length = 16384
$tests = '$a = ,2 * $length',
         '[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)',
         '$a = for ($i = 0; $i -lt $length; $i++) { 2 }',
         '$a = foreach ($i in 1..$length) { 2 }',
         '$a = 1..$length | %{ 2 }',
         '$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }',
         '[int[]]$a = -split "2 " * $length'
$tests | ForEach-Object {
    $cmd = $_
    $timings = 1..50 | ForEach-Object {
        Remove-Variable i,a -ErrorAction Ignore
        [GC]::Collect()
        Measure-Command { Invoke-Expression $cmd }
    }
    [pscustomobject]@{
        Command = $cmd
        'Average Time (ms)' = ($timings | Measure-Object -Average TotalMilliseconds).Average
    }
} | Sort-Object Ave* | Format-Table -AutoSize -Wrap
Solution 2
Avoid appending to an array in a loop. Each += copies the existing array into a new, larger array, so the cost grows with every iteration. Do this instead:
$MyArray = for ($i=1; $i -le $length; $i++) { 2 }
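A short sketch of why += is expensive, followed by one possible alternative for cases where the elements really must be added one at a time. The generic List variant is borrowed from a comment further down this thread, not from this answer, and the variable names are illustrative:

```powershell
# += does not grow the array in place: it allocates a new array and copies.
$a = @(1)
$before = $a
$a += 2
[object]::ReferenceEquals($before, $a)   # False, $a now points to a new array
$before.Count                            # 1, the old array is unchanged

# A generic list grows in place; Add() returns nothing, so no output leaks.
$length = 16384
$list = New-Object 'System.Collections.Generic.List[int]'
foreach ($i in 1..$length) { $list.Add(2) }
$myArray = $list.ToArray()               # convert back to int[] if an array is required
$myArray.Length                          # 16384
```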
Solution 3
With PowerShell 3.0 you can use the following (requires .NET Framework 3.5 or later):
[int[]]$MyArray = ([System.Linq.Enumerable]::Repeat(2, 65000))
With PowerShell 2.0:
$AnArray = 1..65000 | % {2}
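A small sketch showing what the [int[]] cast in the first variant buys you, compared with the untyped result of the repetition operator. The length and variable names are illustrative:

```powershell
$length = 16384

# The cast materializes the repeated sequence as a strongly typed int array.
[int[]]$myArray = [System.Linq.Enumerable]::Repeat(2, $length)
$myArray.GetType().FullName   # System.Int32[]
$myArray.Length               # 16384

# For comparison, array repetition yields an untyped Object[].
$untyped = ,2 * $length
$untyped.GetType().FullName   # System.Object[]
```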
Solution 4
It is not clear what you are trying to do. I looked at your code, but $myArray += 2 just appends 2 as a new element. For example, here is the output from my test code:
$myArray = @()
$length = 4
for ($i=1;$i -le $length; $i++) {
Write-Host $myArray
$myArray += 2
}
2
2 2
2 2 2
Why do you need to add 2 as the array element so many times?
If all you want is to fill the array with the same value, try this:
$myArray = 1..$length | % { 2 }
nixda
Updated on June 06, 2022
Comments
-
nixda almost 2 years
I want to fill a dynamic array with the same integer value as fast as possible using PowerShell.
Measure-Command shows that it takes 7 seconds on my system to fill it.
My current code (snippet) looks like this:
$myArray = @()
$length = 16385
for ($i=1;$i -le $length; $i++) {$myArray += 2}
(Full code can be seen on gist.github.com or on superuser)
Note that $length can change; I chose a fixed length here to make the example easier to follow.
Q: How do I speed up this PowerShell code?
-
JPBlanc almost 11 years
He is just filling the array with some value? The value is '2'.
-
Ansgar Wiechers almost 11 years
The question says he wants to fill the array with the same integer value. His problem is that appending to the array with += is terribly slow.
-
ravikanth almost 11 years
Hmm! I understood that. But why? Why even find a better way to do something that is not needed? Anyway, he can use the range operator as well.
-
nixda almost 11 years
I already appended the full code as a GitHub link just to avoid discussions about why. If you look at the link, you will see that my PowerShell script executes an Excel command for querying a CSV. The parameter TextFileColumnDataTypes for that query needs an array to know what data types the columns should be: a 2 stands for a string column, 1 for general, 9 to skip the entire column, and so on. Long story short: I need a big array filled with the integer value 2.
-
nixda almost 11 years
+1 $myArray = 1..16385 | % { 2 } runs in 0,02 seconds. Much faster than my 7s :)
-
nixda almost 11 years
+1 $MyArray = for ($i=1; $i -le 16385; $i++) { 2 } runs in 0,05 seconds. Much faster than my 7s :)
-
nixda almost 11 years
+1 [int[]]$myArray = ([System.Linq.Enumerable]::Repeat(2, 16385)) runs in 0,03s
-
Ansgar Wiechers almost 11 years
When I tested 1..$length it was reproducibly slower than for ($i=1; $i -lt $length; $i++). Most likely because it's building a list before passing it into the ForEach-Object loop.
-
Michael Sorens almost 11 years
+1 Concise, clear, constructive, comprehensive, and reproducible! (well, 4 out of 5 c's...)
-
Lance U. Matthews about 4 years
Though this is faster than the code in the question, what is the purpose of using a 1-value Tuple? That means you have to access the Item1 property to get the value back, plus you're creating an Object to wrap every Int32, which will be a lot of garbage on larger lists. This doesn't hurt so bad because using the obsolete, non-generic ArrayList class means it'd be boxing each Int32 in an Object, anyways. Rewriting as $myArray = New-Object 'Collections.Generic.List[Int32]'; foreach($i in 1..$length) { $myArray.add(2) } I get a 40% speedup and with fewer characters and less complexity, too.
-
Lance U. Matthews about 4 years
Also, each property of a Tuple is read-only, so if you want to change a list value (which is bound to happen because...what good is a list of repeated values that always stay the same?) your only option is to create a new Tuple to replace it.
-
Carsten about 4 years
Even if the tuple part is not really needed for the above challenge, it is worth remembering as a way to fill large read-only arrays with multiple columns/items per object. Very handy to sort very large arrays/lookup-tables by different columns. without tuples and the need to sort anything.