Question
The increase assignment operator (+=) is often used in PowerShell questions and answers on Stack Overflow to build a collection of objects, e.g.:
$Collection = @()
1..$Size | ForEach-Object {
    $Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Yet it appears to be a very inefficient operation.
Is it OK to state, as a general rule, that the increase assignment operator (+=) should be avoided for building an object collection in PowerShell?
Answer 1:
Yes, the increase assignment operator (+=) should be avoided for building an object collection.
Apart from the fact that using the += operator usually requires more statements (because of the array initialization = @()) and encourages storing the whole collection in memory rather than passing it immediately into the pipeline, it is also inefficient.
It is inefficient because every time you use the += operator, it effectively does:
$Collection = $Collection + $NewObject
Because arrays have a fixed size (their element count cannot change), the whole collection is recreated, with every existing element copied into a new array, on every iteration.
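The fixed-size behavior can be observed directly. The following is a minimal sketch; the variable names $Original, $Alias and $SameObject are made up for illustration. It keeps a second reference to an array and checks whether += changes that array or produces a different one:

```powershell
$Original = @(1, 2)
$Alias    = $Original      # second reference to the same array object

$Original.IsFixedSize      # True: a .NET array cannot grow in place

$Original += 3             # builds a new 3-element array and reassigns $Original

# Compare the two references after the append
$SameObject = [object]::ReferenceEquals($Original, $Alias)
$SameObject                # False: $Original now refers to a different array
$Alias.Count               # 2: the old array was left untouched
$Original.Count            # 3
```

The old array still exists with its two elements, which suggests that each += allocates a fresh array rather than extending the existing one.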
The idiomatic PowerShell approach is to assign the pipeline output directly:
$Collection = 1..$Size | ForEach-Object {
    [PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Note: as with other cmdlets, if there is just one item (iteration), the output will be a scalar and not an array. To force it to an array, you can either use the [Array] type: [Array]$Collection = 1..$Size | ForEach-Object { ... }
or use the array subexpression operator @( ): $Collection = @(1..$Size | ForEach-Object { ... })
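The scalar-versus-array difference is easy to check with a size of 1. This is a small sketch rather than part of the original answer; the variable names $Plain, $Typed and $Wrapped are hypothetical:

```powershell
$Size = 1

# Plain assignment: a single output object is stored as-is
$Plain = 1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} }
$Plain -is [array]      # False

# [Array] type constraint on the variable
[Array]$Typed = 1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} }
$Typed -is [array]      # True
$Typed.Count            # 1

# Array subexpression operator
$Wrapped = @(1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} })
$Wrapped -is [array]    # True
$Wrapped.Count          # 1
```

Either form is worth using when later code indexes into the collection or otherwise relies on it being an array.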
It is even recommended not to store the results in a variable ($a = ...) at all, but to pass them immediately into the pipeline to save memory, e.g.:
1..$Size | ForEach-Object {
    [PSCustomObject]@{Index = $_; Name = "Name$_"}
} | Export-Csv .\Outfile.csv
See also: Fastest Way to get a uniquely index item from the property of an array
Performance measurement
To show how performance degrades as the collection size grows, consider the following test and its results (timings are in ticks):
1..20 | ForEach-Object {
    $Size = 1000 * $_
    $Performance = @{Size = $Size}
    $Performance.Pipeline = (Measure-Command {
        $Collection = 1..$Size | ForEach-Object {
            [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    $Performance.Increase = (Measure-Command {
        $Collection = @()
        1..$Size | ForEach-Object {
            $Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    [pscustomobject]$Performance
} | Format-Table *,@{n='Factor'; e={$_.Increase / $_.Pipeline}; f='0.00'} -AutoSize
 Size  Increase Pipeline Factor
 ----  -------- -------- ------
 1000   1554066   780590   1.99
 2000   4673757  1084784   4.31
 3000  10419550  1381980   7.54
 4000  14475594  1904888   7.60
 5000  23334748  2752994   8.48
 6000  39117141  4202091   9.31
 7000  52893014  3683966  14.36
 8000  64109493  6253385  10.25
 9000  88694413  4604167  19.26
10000 104747469  5158362  20.31
11000 126997771  6232390  20.38
12000 148529243  6317454  23.51
13000 190501251  6929375  27.49
14000 209396947  9121921  22.96
15000 244751222  8598125  28.47
16000 286846454  8936873  32.10
17000 323833173  9278078  34.90
18000 376521440 12602889  29.88
19000 422228695 16610650  25.42
20000 475496288 11516165  41.29
This means that, for a collection of 20,000 objects, using the += operator is about 40x slower than using the PowerShell pipeline.
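The growing factor is consistent with a simple cost model. The sketch below is not a measurement. It assumes that every += copies all elements already in the array, so appending item number i copies i - 1 elements, and it counts the total copies for a few sizes. The names $CopyModel and ElementCopies are made up for illustration:

```powershell
$CopyModel = foreach ($Size in 1000, 10000, 20000) {
    # Assumption: appending item number $i copies the ($i - 1) items already present
    [long]$Copies = 0
    for ($i = 1; $i -le $Size; $i++) {
        $Copies += $i - 1
    }
    [PSCustomObject]@{Size = $Size; ElementCopies = $Copies}
}
$CopyModel
# Under this model: 1000 -> 499500, 10000 -> 49995000, 20000 -> 199990000
```

Under that assumption the number of copies grows roughly with the square of the size, while the pipeline version handles each object once, which would explain why the gap widens as the collection gets larger.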
Source: https://stackoverflow.com/questions/60708578/why-should-i-avoid-using-the-increase-assignment-operator-to-create-a-colle