PowerShell multithreaded math

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-11 10:33:50

Question


I'm currently working on a self-inspired project to learn PowerShell, and have been writing a script to generate prime numbers. As it stands, the script works without issue, but my next goal is to increase its processing speed.

cls
$Primes = @()
$Primes += 3
$TargetNum = 5
$PrimesIndex = 0
$NumOfPrime = 3
while(1)
{
    if(($TargetNum / 3) -lt 3) 
    {
        $Primes += $TargetNum
        $TargetNum += 2
        $NumOfPrime += 1        
    }
    else
    {
        if($Primes[$PrimesIndex] -le ($TargetNum / ($Primes[$PrimesIndex]))) 
        {
            if($TargetNum % $Primes[$PrimesIndex] -eq 0)
            {
                $PrimesIndex = 0
                $TargetNum += 2

            }
            else
            {
                $PrimesIndex++
            }
        }
        else
        {
            $PrimesIndex = 0
            $NumOfPrime += 1
            $Primes += $TargetNum
            $TargetNum += 2
            if($TargetNum -gt 100000){write-host $TargetNum ", " $NumOfPrime;break}
        }
    }
}

If I execute the statement Measure-Command {& ".\primes.ps1"}, it calculates the primes up to 100,000 in ≈ 9.1 seconds (for me anyway), but it performs the calculations on a single CPU thread. I've looked into using the Start-Job and Start-Process commands to implement some sort of multithreading, but I am failing to understand how they work.

If I moved the prime-testing calculation into a function, how would I go about calling that function across all 4 of my logical cores? Perhaps by creating a second PowerShell script that I can pass a value to test, and calling Start-Process on that? The above script solves an average of 10,000 primes/sec in the first 10 sec; will PowerShell even be able to start and stop worker scripts that quickly?


Answer 1:


Two terms must be considered separately: asynchronous and parallel programming. The first simply runs an arbitrary task in the background, while the latter requires you (as the author of the algorithm) to split your task into several independent tasks so that they can run on separate computing units (cores, processors, machines).

You can easily start an asynchronous task with your function, but it won't give you parallel calculation:

Start-Job -Name "GetPrimes" -ScriptBlock {MyPrimesFunction} | Wait-Job | Receive-Job
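One practical detail when trying the line above: a job runs in its own process, so a function defined only in the calling session is not visible inside the job's script block. A minimal sketch of one way around that, assuming a hypothetical helper named Test-Prime (not part of the original script) that is handed to the job through -InitializationScript:

```powershell
# Hypothetical helper, defined in a script block so it can be loaded inside the job.
$init = {
    function Test-Prime {
        param([int]$Number)
        if ($Number -lt 2) { return $false }
        # Trial division by every candidate divisor up to sqrt($Number)
        for ($d = 2; $d * $d -le $Number; $d++) {
            if ($Number % $d -eq 0) { return $false }
        }
        return $true
    }
}

# The initialization script runs in the job's process before the main script block.
$job = Start-Job -Name "GetPrimes" -InitializationScript $init -ScriptBlock {
    2..50 | Where-Object { Test-Prime $_ }
}

$result = $job | Wait-Job | Receive-Job
$job | Remove-Job

$result -join ', '
```

This is still a single background task, so it only shows how to get the function into the job; it does not make the calculation parallel.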

An easy way to achieve parallelism is to split your function into chunks (for example, by number intervals in which to search for primes) and then run each chunk with Start-Job:

$jobs = @()

# gather all jobs into an array 
$jobs += Start-Job -ScriptBlock {MyPrimesFunction1}
$jobs += Start-Job -ScriptBlock {MyPrimesFunction2}
$jobs += Start-Job -ScriptBlock {MyPrimesFunction3}
$jobs += Start-Job -ScriptBlock {MyPrimesFunction4}

# wait for all jobs
Wait-Job $jobs | Out-Null

# get result arrays from jobs
$results = $jobs | Receive-Job

$primes = @()

# merge results into single array
foreach ($result in $results) {
  $primes += $result
}

Note that your function must return its result as an array of primes. As written, this approach also needs four variants of the function (MyPrimesFunction1 through MyPrimesFunction4), each covering a different number interval.
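Instead of four hand-written copies, a single script block could take the interval as parameters and receive them through Start-Job's -ArgumentList. The sketch below assumes plain trial division rather than the asker's incremental prime list, and the names $findPrimes, $limit, $workers and $chunk are illustrative only:

```powershell
# Hypothetical worker: emits every prime in [$From, $To] using trial division.
$findPrimes = {
    param([int]$From, [int]$To)

    for ($n = [Math]::Max($From, 2); $n -le $To; $n++) {
        $isPrime = $true
        for ($d = 2; $d * $d -le $n; $d++) {
            if ($n % $d -eq 0) { $isPrime = $false; break }
        }
        if ($isPrime) { $n }   # written to the job's output stream
    }
}

$limit   = 100000
$workers = 4
$chunk   = [int][Math]::Ceiling($limit / [double]$workers)

# Start one job per interval; the interval bounds are passed as arguments.
$jobs = for ($i = 0; $i -lt $workers; $i++) {
    $from = $i * $chunk + 1
    $to   = [Math]::Min(($i + 1) * $chunk, $limit)
    Start-Job -ScriptBlock $findPrimes -ArgumentList $from, $to
}

# Wait for all jobs, collect their output, and put the primes in ascending order.
$primes = $jobs | Wait-Job | Receive-Job | Sort-Object
$jobs | Remove-Job

"Found $($primes.Count) primes up to $limit"
```

Equal-width intervals do not mean equal work: larger numbers need more trial divisions, so the job covering the top interval will likely finish last. Each job also pays the cost of starting a new process, so for small ranges the parallel version may well be slower than the single-threaded script.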

The job-based approach relies on operating-system process management, because each job starts a separate powershell.exe process. Another approach is to use runspaces, which run inside the current process. Several posts describe that technique.
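For comparison, a rough sketch of the runspace variant, built on the .NET types [runspacefactory] and [powershell]. The worker script block and the names $findPrimes, $pool and $tasks are assumptions made for illustration, and trial division again stands in for the asker's algorithm:

```powershell
# Hypothetical worker: emits every prime in [$From, $To] using trial division.
$findPrimes = {
    param([int]$From, [int]$To)

    for ($n = [Math]::Max($From, 2); $n -le $To; $n++) {
        $isPrime = $true
        for ($d = 2; $d * $d -le $n; $d++) {
            if ($n % $d -eq 0) { $isPrime = $false; break }
        }
        if ($isPrime) { $n }
    }
}

$limit   = 100000
$workers = 4
$chunk   = [int][Math]::Ceiling($limit / [double]$workers)

# A pool of up to $workers runspaces, all living in the current process.
$pool = [runspacefactory]::CreateRunspacePool(1, $workers)
$pool.Open()

# Queue one asynchronous invocation per interval.
$tasks = for ($i = 0; $i -lt $workers; $i++) {
    $from = $i * $chunk + 1
    $to   = [Math]::Min(($i + 1) * $chunk, $limit)

    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    $null = $ps.AddScript($findPrimes.ToString()).AddArgument($from).AddArgument($to)

    [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until that task is done; tasks are read in interval order,
# so the combined output is already ascending.
$primes = foreach ($task in $tasks) {
    $task.Shell.EndInvoke($task.Handle)
    $task.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()

"Found $($primes.Count) primes up to $limit"
```

Because no extra powershell.exe processes are started, the startup overhead per worker should be noticeably lower than with Start-Job, at the price of more manual bookkeeping (opening, closing and disposing the pool and each PowerShell instance).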



Source: https://stackoverflow.com/questions/33281288/powershell-multithreaded-math
