Does PowerShell's parallel foreach use at most 5 threads?

Posted by 不羁的心 on 2019-12-12 14:51:59

Question


The throttlelimit parameter of foreach -parallel is supposed to control how many processes are used when executing the script. However, I can't get more than 5 processes, even when I set throttlelimit to a value greater than 5.

The script is executed in multiple PowerShell processes, so I check $PID inside the script and then group the PIDs to find out how many processes were used to execute it.

function GetPID() {
    $PID
}

workflow TestWorkflow {
    param($throttlelimit)
    foreach -parallel -throttlelimit $throttlelimit ($i in 1..100) {
        GetPID
    }
}

foreach ($i in 1..8) {
    $pids = TestWorkflow -throttlelimit $i
    $measure = $pids | group | Measure-Object
    $measure.Count
}

The output is

1
2
3
4
5
5
5
5

For $i less than or equal to 5, I get $i processes. But for $i greater than 5, I get only 5 processes. Is there any way to increase the number of processes used when executing the script?

Edit: In response to @SomeShinyObject's answer, I added another test case. It's a modification of the example given by @SomeShinyObject. I added a function S, which does nothing but sleep for 10 seconds.

function S($n) {
    $s = Get-Date
    $s.ToString("HH:mm:ss.ffff") + " start sleep " + $n
    sleep 10
    $e = Get-Date
    $e.ToString("HH:mm:ss.ffff") + " end sleep " + $n + " diff " + (($e-$s).TotalMilliseconds)
}

Workflow Throttle-Me {
    [cmdletbinding()]
    param (
        [int]$ThrottleLimit = 10,
        [int]$CollectionLimit = 10
    )

    foreach -parallel -throttlelimit $ThrottleLimit ($n in 1..$CollectionLimit){
        $s = Get-Date
        $s.ToString("HH:mm:ss.ffff") + " start " + $n
        S $n
        $e = Get-Date
        $e.ToString("HH:mm:ss.ffff") + " end " + $n + " diff " + (($e-$s).TotalMilliseconds)
    }
}

Throttle-Me -ThrottleLimit 10 -CollectionLimit 20

And here's the output. I grouped the output by time (to the second) and slightly reordered the lines within each group to make it clearer. It's pretty obvious that function S is called five at a time, even though I set throttlelimit to 10: first we have start sleep 6..10; 10 seconds later, start sleep 1..5; 10 seconds later, start sleep 11..15; and 10 seconds after that, start sleep 16..20.

03:40:29.7147 start 10
03:40:29.7304 start 9
03:40:29.7304 start 8
03:40:29.7459 start 7
03:40:29.7459 start 6
03:40:29.7616 start 5
03:40:29.7772 start 4
03:40:29.7772 start 3
03:40:29.7928 start 2
03:40:29.7928 start 1

03:40:35.3067 start sleep 7
03:40:35.3067 start sleep 8
03:40:35.3067 start sleep 9
03:40:35.3692 start sleep 10
03:40:35.4629 start sleep 6

03:40:45.3292 end sleep 7 diff 10022.5353
03:40:45.3292 end sleep 8 diff 10022.5353
03:40:45.3292 end sleep 9 diff 10022.5353
03:40:45.3761 end sleep 10 diff 10006.8765
03:40:45.4855 end sleep 6 diff 10022.5243
03:40:45.3605 end 9 diff 15630.1005
03:40:45.3917 end 7 diff 15645.7465
03:40:45.3917 end 8 diff 15661.3313
03:40:45.4229 end 10 diff 15708.2274
03:40:45.5323 end 6 diff 15786.3969
03:40:45.4386 start sleep 5
03:40:45.4542 start sleep 4
03:40:45.4542 start sleep 3
03:40:45.4698 start sleep 2
03:40:45.5636 start sleep 1
03:40:45.4698 start 11
03:40:45.4855 start 12
03:40:45.5011 start 13
03:40:45.5167 start 14
03:40:45.6105 start 15

03:40:55.4596 end sleep 3 diff 10005.4374
03:40:55.4596 end sleep 4 diff 10005.4374
03:40:55.4596 end sleep 5 diff 10021.0426
03:40:55.4752 end sleep 2 diff 10005.3992
03:40:55.5690 end sleep 1 diff 10005.3784
03:40:55.4752 end 3 diff 25698.0221
03:40:55.4909 end 5 diff 25729.302
03:40:55.5065 end 4 diff 25729.2523
03:40:55.5221 end 2 diff 25729.2559
03:40:55.6159 end 1 diff 25823.032
03:40:55.5534 start sleep 11
03:40:55.5534 start sleep 12
03:40:55.5690 start sleep 13
03:40:55.5846 start sleep 14
03:40:55.6784 start sleep 15
03:40:55.6002 start 16
03:40:55.6002 start 17
03:40:55.6159 start 18
03:40:55.6326 start 19
03:40:55.7096 start 20

03:41:05.5692 end sleep 11 diff 10015.8226
03:41:05.5692 end sleep 12 diff 10015.8226
03:41:05.5848 end sleep 13 diff 10015.8108
03:41:05.6004 end sleep 14 diff 10015.8128
03:41:05.6942 end sleep 15 diff 10015.8205
03:41:05.5848 end 12 diff 20099.3218
03:41:05.6004 end 11 diff 20130.5719
03:41:05.6161 end 13 diff 20114.9729
03:41:05.6473 end 14 diff 20130.5962
03:41:05.6942 end 15 diff 20083.7506
03:41:05.6317 start sleep 17
03:41:05.6317 start sleep 16
03:41:05.6473 start sleep 18
03:41:05.6629 start sleep 19
03:41:05.7411 start sleep 20

03:41:15.6320 end sleep 16 diff 10000.3608
03:41:15.6320 end sleep 17 diff 10000.3608
03:41:15.6477 end sleep 18 diff 10000.3727
03:41:15.6633 end sleep 19 diff 10000.3709
03:41:15.7414 end sleep 20 diff 10000.3546
03:41:15.6477 end 16 diff 20047.4375
03:41:15.6477 end 17 diff 20047.4375
03:41:15.6633 end 18 diff 20047.4295
03:41:15.7101 end 19 diff 20077.5169
03:41:15.7414 end 20 diff 20031.7909
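
The manual grouping described above could also be done with a short script. This is only a sketch: the variable names $lines and $batches are made up for illustration, and $lines holds a handful of lines copied from the output above rather than the full result of Throttle-Me.

```powershell
# Sample lines copied from the output above. In practice, $lines would hold
# everything returned by Throttle-Me ($lines is a hypothetical name).
$lines = @(
    '03:40:29.7147 start 10'
    '03:40:35.3067 start sleep 7'
    '03:40:35.3067 start sleep 8'
    '03:40:35.3067 start sleep 9'
    '03:40:35.3692 start sleep 10'
    '03:40:35.4629 start sleep 6'
    '03:40:45.3292 end sleep 7 diff 10022.5353'
    '03:40:45.4386 start sleep 5'
    '03:40:45.4542 start sleep 4'
    '03:40:45.4542 start sleep 3'
    '03:40:45.4698 start sleep 2'
    '03:40:45.5636 start sleep 1'
    '03:40:45.4698 start 11'
)

# Keep only the "start sleep" lines, then group them by the HH:mm:ss part
# of the timestamp (the first 8 characters of each line).
$batches = $lines |
    Where-Object { $_ -match '^\d{2}:\d{2}:\d{2}\.\d+ start sleep \d+$' } |
    Group-Object -Property { $_.Substring(0, 8) } |
    Sort-Object -Property Name

# One line per second, with the number of calls to S that started in it.
$batches | ForEach-Object { '{0}  {1} calls' -f $_.Name, $_.Count }
```

With these sample lines, each of the two seconds contains five "start sleep" entries, which matches the five-at-a-time pattern described above.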

Answer 1:


I believe your test is a little flawed. According to this TechNet article, $PID is not an available variable in a Script Workflow.

A better test, with an explanation, can be found on this site.

Basically, the ThrottleLimit is theoretically capped at [Int32]::MaxValue. The poster also designed a better test to check parallel processing (slightly modified):

Workflow Throttle-Me {
    [cmdletbinding()]
    param (
        [int]$ThrottleLimit = 10,
        [int]$CollectionLimit = 10
    )

    foreach -parallel -throttlelimit $ThrottleLimit ($n in 1..$CollectionLimit){
        "Working on $n"
        "{0:hh}:{0:mm}:{0:ss}" -f (Get-Date)
    }
}



Answer 2:


I can't speak for that specific cmdlet, but as you found, PowerShell only uses one thread per PID. The common ways of getting around this are PSJobs and runspaces. Runspaces are more difficult to use than PSJobs, but they also perform significantly better, which makes them the popular choice. The downside is that your entire code has to revolve around them, which in most cases means a complete rewrite.

Unfortunately, as much as Microsoft pushes PowerShell, it is not made for performance and speed. But it does as well as it can while tied to .NET.
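
For illustration, here is a minimal sketch of the runspace approach mentioned above, using a runspace pool. It is not taken from the question or the linked articles: the names $maxThreads, $work, $tasks and $results, the 20 items, and the 1-second sleep are all assumptions chosen for the example. Each item reports $PID so the result can be compared with the PID test in the question.

```powershell
# Sketch only: run 20 short tasks on a runspace pool that allows at most
# $maxThreads runspaces at once (an assumed cap, similar in spirit to
# -ThrottleLimit).
$maxThreads = 10
$pool = [runspacefactory]::CreateRunspacePool(1, $maxThreads)
$pool.Open()

# Work item: sleep briefly, then report the item number and the process ID.
$work = {
    param($n)
    Start-Sleep -Seconds 1
    [pscustomobject]@{ Item = $n; ProcessId = $PID }
}

# Queue every item; BeginInvoke returns immediately with an async handle.
$tasks = foreach ($n in 1..20) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript($work).AddArgument($n)
    [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# Collect results; EndInvoke blocks until that particular item has finished.
$results = foreach ($t in $tasks) {
    $t.Shell.EndInvoke($t.Handle)
    $t.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()

# Runspaces live inside the calling process, so every item reports the
# same PID even though the items ran concurrently.
$results | Group-Object -Property ProcessId | Select-Object Name, Count
```

Because all runspaces share one process, grouping by PID is not a useful way to count concurrency here; comparing timestamps, as in the question's second test, would be a better measure.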



Source: https://stackoverflow.com/questions/43647045/does-powershells-parallel-foreach-use-at-most-5-thread