Are the cmdlets in a pipeline executing in parallel?

Posted by 五迷三道 on 2019-12-10 13:57:32

Question


I spotted an interesting statement in the "PowerShell Notes for Professionals" whitepaper: "In a pipeline series each function runs parallel to the others, like parallel threads".

Is that correct? If so, is there any technical documentation that supports this statement?


Answer 1:


It's kinda true, but not really at all.

What do I mean by that? First, let's get your documentation question out of the way. The following is from §3.13 of the PowerShell 3.0 Language Specification:

If a command writes a single object, its successor receives that object and then terminates after writing its own object(s) to its successor. If, however, a command writes multiple objects, they are delivered one at a time to the successor command, which executes once per object. This behavior is called streaming. In stream processing, objects are written along the pipeline as soon as they become available, not when the entire collection has been produced.

When processing a collection, a command can be written such that it can do special processing before the initial element and after the final element.
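The streaming behaviour described in the quote can be observed with a small sketch. The `$log` list and the "produce"/"consume" labels are made up for illustration; they are not part of the specification.

```powershell
# Illustrative only: record the order in which two pipeline stages see each object.
$log = [System.Collections.Generic.List[string]]::new()

1..3 |
  ForEach-Object { $log.Add("produce $_"); $_ } |   # stage 1: log, then pass the object on
  ForEach-Object { $log.Add("consume $_") }         # stage 2: log the object it receives

$log
# produce 1
# consume 1
# produce 2
# consume 2
# produce 3
# consume 3
```

Each object reaches the second stage before the first stage has produced the next one, which is what "written along the pipeline as soon as they become available" means in practice.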

Now, let's have a brief look at what a cmdlet consists of.


Cmdlets and their building blocks

It may be enticing to think of a cmdlet as just another function, a sequential set of statements to be executed synchronously whenever invoked. This is not correct, however.

A cmdlet, in PowerShell, is an object that implements at least one of 3 methods:

  • BeginProcessing() - run once, when the pipeline starts executing
  • ProcessRecord() - run for every pipeline item received
  • EndProcessing() - run once, after the last pipeline item has been processed

Once a pipeline starts executing, BeginProcessing() is called on every single cmdlet in the pipeline, in pipeline order. In this sense, all cmdlets in the pipeline are running "in parallel", but this design lets PowerShell execute the whole pipeline on a single thread, so actual parallel processing involving multiple threads is not necessary to execute the pipeline as designed.

It's probably more accurate to point out that cmdlets execute concurrently in a pipeline.
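One way to check the single-thread claim yourself is to record the managed thread ID inside each stage. This is a rough sketch; the `$ids` list and `$distinct` variable are names chosen for illustration.

```powershell
# Illustrative only: collect the thread ID seen by each stage for every object.
$ids = [System.Collections.Generic.List[int]]::new()

1..3 |
  ForEach-Object { $ids.Add([System.Threading.Thread]::CurrentThread.ManagedThreadId); $_ } |
  ForEach-Object { $ids.Add([System.Threading.Thread]::CurrentThread.ManagedThreadId) }

# Number of distinct threads that took part in the pipeline
$distinct = @($ids | Sort-Object -Unique)
$distinct.Count
```

In an ordinary console session this should report a single distinct thread: the stages take turns on one thread rather than running side by side.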


Let's try it out!

Since the three methods above map directly onto the begin, process and end blocks that we can define in an advanced function, it's easy to see the effect of this execution flow.

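Let's feed 5 objects to a pipeline consisting of three cmdlets that report their state with Write-Host, and see what happens (the code is further down):

PS C:\> 1..5 |first |second |third |Out-Null

All three begin blocks run first, in pipeline order. Then each object passes through the process blocks of first, second and third in turn before the next object enters the pipeline. Once the input is exhausted, the end blocks run, again in pipeline order.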

Be aware that PowerShell also supports external output buffering control via the -OutBuffer common parameter, and this influences the execution flow as well: a buffered command hands its output to its successor in batches rather than one object at a time.
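A minimal sketch of the buffering effect, assuming a hypothetical helper called `Write-Stage` (not part of the demonstration code below) that logs every object it receives:

```powershell
# Hypothetical helper: an advanced function that logs each object it sees.
function Write-Stage {
  [CmdletBinding()]   # advanced function, so common parameters like -OutBuffer apply
  param(
    [Parameter(ValueFromPipeline)]
    $InputObject,
    [string]$Name,
    [System.Collections.Generic.List[string]]$Log
  )
  process {
    $Log.Add("$Name $InputObject")
    $InputObject      # pass the object on to the next stage
  }
}

$log = [System.Collections.Generic.List[string]]::new()

# Buffer the output of stage 'a' before it is handed to stage 'b'
$result = 1..4 |
  Write-Stage -Name 'a' -Log $log -OutBuffer 2 |
  Write-Stage -Name 'b' -Log $log

$log
```

All four objects still arrive at the end of the pipeline, but the log entries of stage 'a' are no longer interleaved one-for-one with those of stage 'b'; the exact grouping depends on the buffer size you pass.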

Hope this made some sense!


Here's the code I wrote for the demonstration above.

The Write-Host output from the function below changes colour based on which alias we use, so it's a little easier to distinguish in the shell.

function Test-Pipeline {
  param(
    [Parameter(ValueFromPipeline)]
    [psobject[]]$InputObject
  )

  begin {
    $WHSplat = @{
      ForegroundColor = switch($MyInvocation.InvocationName){
        'first' {
          'Green'
        }
        'second' {
          'Yellow'
        }
        'third' {
          'Red'
        }
        default {
          # Called as Test-Pipeline (or any other name): fall back to a neutral colour
          'Gray'
        }
      }
    }
    Write-Host "Begin $($MyInvocation.InvocationName)" @WHSplat
    $ObjectCount = 0
  }

  process {
    foreach($Object in $InputObject) {
      $ObjectCount += 1
      Write-Host "Processing object #$($ObjectCount) in $($MyInvocation.InvocationName)" @WHSplat
      Write-Output $Object
    }
  }

  end {
    Write-Host "End $($MyInvocation.InvocationName)" @WHSplat
  }

}

Set-Alias -Name first  -Value Test-Pipeline
Set-Alias -Name second -Value Test-Pipeline
Set-Alias -Name third  -Value Test-Pipeline


Source: https://stackoverflow.com/questions/48522415/are-the-cmdlets-in-a-pipeline-executing-in-parallel
