Why should I avoid using the increase assignment operator (+=) to create a collection

Posted by 删除回忆录丶 on 2020-03-24 00:21:08

Question


The increase assignment operator (+=) is often used in [PowerShell] questions and answers on Stack Overflow to build a collection of objects, e.g.:

$Collection = @()
1..$Size | ForEach-Object {
    $Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
}

Yet it appears to be a very inefficient operation.

Is it OK to state, as a general rule, that the increase assignment operator (+=) should be avoided for building an object collection in PowerShell?


Answer 1:


Yes, the increase assignment operator (+=) should be avoided for building an object collection.
Apart from the fact that using the += operator usually requires more statements (because of the array initialization = @()) and encourages storing the whole collection in memory rather than pushing it immediately into the pipeline, it is also inefficient.

It is inefficient because every use of the += operator effectively does:

$Collection = $Collection + $NewObject

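Because arrays have a fixed element count, every iteration allocates a new array and copies all existing elements into it. The total number of element copies therefore grows roughly with the square of the collection size.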
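
The copy-on-append behaviour can be observed directly. The following is a minimal sketch, not part of the original answer; the variable names $Original and $Alias are illustrative only:

```powershell
# Illustrative names ($Original, $Alias); not taken from the answer itself.
$Original = @(1, 2)
$Alias    = $Original      # a second reference to the same array object

$Original.IsFixedSize      # True: a .NET array cannot grow in place

$Original += 3             # builds a new 3-element array and rebinds $Original

$Original.Count                                # 3
$Alias.Count                                   # 2: the old array is untouched
[object]::ReferenceEquals($Original, $Alias)   # False: two distinct arrays
```

Since $Alias still sees the old two-element array, the append must have produced a separate array rather than extending the existing one.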

The idiomatic PowerShell approach is to assign the pipeline output directly:

$Collection = 1..$Size | ForEach-Object {
    [PSCustomObject]@{Index = $_; Name = "Name$_"}
}

Note: as with other cmdlets, if there is just one item (iteration), the output will be a scalar and not an array. To force it to an array, you can either use the [Array] type: [Array]$Collection = 1..$Size | ForEach-Object { ... } or the array subexpression operator @( ): $Collection = @(1..$Size | ForEach-Object { ... })
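
To illustrate the note, here is a small sketch with $Size set to 1. The variable names $Plain, $Wrapped and $Typed are made up for this example:

```powershell
# $Size = 1 reproduces the single-item case described in the note.
$Size = 1

# Plain assignment: one output object is stored as-is (a scalar).
$Plain = 1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} }

# Array subexpression operator: always yields an array.
$Wrapped = @(1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} })

# Type-constrained variable: the single object is converted to a one-element array.
[Array]$Typed = 1..$Size | ForEach-Object { [PSCustomObject]@{Index = $_; Name = "Name$_"} }

$Plain   -is [array]   # False
$Wrapped -is [array]   # True
$Typed   -is [array]   # True
$Wrapped.Count         # 1
```

Either form keeps later code that indexes or enumerates the result behaving the same way, regardless of how many items the pipeline produced.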

Better still, do not store the results in a variable ($a = ...) at all, but pass them straight down the pipeline to save memory, e.g.:

1..$Size | ForEach-Object {
    [PSCustomObject]@{Index = $_; Name = "Name$_"}
} | Export-Csv .\Outfile.csv

See also: Fastest Way to get a uniquely index item from the property of an array

Performance measurement

The following test shows how the performance penalty grows with collection size (timings are in ticks, as reported by Measure-Command):

1..20 | ForEach-Object {
    $size = 1000 * $_
    $Performance = @{Size = $Size}
    $Performance.Pipeline = (Measure-Command {
        $Collection  = 1..$Size | ForEach-Object {
            [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    $Performance.Increase = (Measure-Command {
        $Collection  = @()
        1..$Size | ForEach-Object {
            $Collection  += [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    [pscustomobject]$Performance
} | Format-Table *,@{n='Factor'; e={$_.Increase / $_.Pipeline}; f='0.00'} -AutoSize

 Size  Increase Pipeline Factor
 ----  -------- -------- ------
 1000   1554066   780590   1.99
 2000   4673757  1084784   4.31
 3000  10419550  1381980   7.54
 4000  14475594  1904888   7.60
 5000  23334748  2752994   8.48
 6000  39117141  4202091   9.31
 7000  52893014  3683966  14.36
 8000  64109493  6253385  10.25
 9000  88694413  4604167  19.26
10000 104747469  5158362  20.31
11000 126997771  6232390  20.38
12000 148529243  6317454  23.51
13000 190501251  6929375  27.49
14000 209396947  9121921  22.96
15000 244751222  8598125  28.47
16000 286846454  8936873  32.10
17000 323833173  9278078  34.90
18000 376521440 12602889  29.88
19000 422228695 16610650  25.42
20000 475496288 11516165  41.29

In other words, with a collection size of 20,000 objects, using the += operator is about 40x slower than using the PowerShell pipeline.



Source: https://stackoverflow.com/questions/60708578/why-should-i-avoid-using-the-increase-assignment-operator-to-create-a-colle
