I have a few hundred thousand URLs that I need to call. These are calls to an application server which will process them and write a status code to a table. I do not need to
I agree with the top post: use Runspaces. However, the code provided there doesn't show how to get data back from the requests. Here's a PowerShell module I recently published to my GitHub page:
https://github.com/phbits/AsyncHttps.
It submits async HTTP requests to a single domain over SSL/TLS (TCP port 443). Here's an example from the README.md:
Import-Module AsyncHttps
Invoke-AsyncHttps -DnsName www.contoso.com -UriPaths $('dir1','dir2','dir3')
It returns a System.Object[] containing the results of each request. The result properties are as follows:
Uri - Request Uri
Status - Http Status Code or Exception Message
BeginTime - Job Start Time
EndTime - Job End Time
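As a rough sketch of how the returned objects could be post-processed, the snippet below works on stand-in data that carries only the four properties listed above. The sample values are made up for illustration, and the type of Status (number or string) is an assumption, so the comparison is written to tolerate either.

```powershell
# Stand-in data with the four documented properties. Real results would come
# from Invoke-AsyncHttps; these values are invented for illustration only.
$start = Get-Date '2020-01-01T00:00:00'
$results = @(
    [pscustomobject]@{ Uri = 'https://www.contoso.com/dir1'; Status = '200'; BeginTime = $start; EndTime = $start.AddMilliseconds(150) }
    [pscustomobject]@{ Uri = 'https://www.contoso.com/dir2'; Status = '200'; BeginTime = $start; EndTime = $start.AddMilliseconds(90) }
    [pscustomobject]@{ Uri = 'https://www.contoso.com/dir3'; Status = 'Unable to connect'; BeginTime = $start; EndTime = $start.AddMilliseconds(30) }
)

# Count how many requests ended with each status
$summary = $results | Group-Object -Property Status | Select-Object Name, Count

# Anything that is not a 200 (other codes or exception messages) is treated as failed
$failed = @($results | Where-Object { $_.Status -ne 200 })

# Add the elapsed time per request, derived from BeginTime and EndTime
$timed = $results | Select-Object Uri, Status,
    @{ Name = 'DurationMs'; Expression = { ($_.EndTime - $_.BeginTime).TotalMilliseconds } }
```

With real output, `$failed` would be the list of URIs worth retrying.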
After looking at your example, you'll probably need to make the following modifications:
(webserver:8080). The easiest would be to update the URI in the scriptblock. Alternatively, add another parameter to the module and scriptblock just for the port.
UriBuilder in the scriptblock, as long as your list of Uri paths is known to be OK.
You can also use the async methods of the .NET web clients. If you just need to send a GET request to your URLs, Net.WebClient will work. Below is a dummy example with example.com:
$urllist = 1..97
$batchSize = 20
$results = [System.Collections.ArrayList]::new()
$i = 1
foreach ($url in $urllist) {
    # Start the download without blocking; the call returns a Task
    $w = [System.Net.WebClient]::new().DownloadStringTaskAsync("http://www.example.com?q=$i")
    $results.Add($w) | Out-Null
    if ($i % $batchSize -eq 0 -or $i -eq $urllist.Count) {
        # Wait for the current batch to complete
        while ($false -in $results.IsCompleted) { Start-Sleep -Milliseconds 300 }
        Write-Host " ........ Batch completed ......... $i" -ForegroundColor Green
        foreach ($r in $results) {
            New-Object PSObject -Property @{ url = $r.AsyncState.AbsoluteURI; jobstatus = $r.Status; success = !$r.IsFaulted }
            # if you need the response text, use $r.Result
        }
        $results.Clear()
    }
    $i += 1
}
With Jobs you incur a large amount of overhead, because each new Job spawns a new process.
Use Runspaces instead!
$maxConcurrentJobs = 10
$content = Get-Content -Path "C:\Temp\urls.txt"

# Create a runspace pool where $maxConcurrentJobs is the
# maximum number of runspaces allowed to run concurrently
$Runspace = [runspacefactory]::CreateRunspacePool(1, $maxConcurrentJobs)

# Open the runspace pool (very important)
$Runspace.Open()

foreach ($url in $content) {
    # Create a new PowerShell instance and tell it to execute in our runspace pool
    $ps = [powershell]::Create()
    $ps.RunspacePool = $Runspace

    # Attach some code to it
    [void]$ps.AddCommand("Invoke-WebRequest").AddParameter("UseBasicParsing", $true).AddParameter("Uri", $url)

    # Begin execution asynchronously (returns immediately)
    [void]$ps.BeginInvoke()

    # Give feedback on how far we are
    Write-Host ("Initiated request for {0}" -f $url)
}
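The loop above starts the requests but discards the handle returned by BeginInvoke(), so the responses are never read. If you need the output, one possible approach is to keep each handle and call EndInvoke() on it later. The sketch below is only an illustration of that idea: the function name Invoke-InRunspacePool, its parameters, and the shape of the returned objects are hypothetical and not part of the original code.

```powershell
# Hypothetical helper: runs a script block once per input item in a
# runspace pool and returns the collected output for each item.
function Invoke-InRunspacePool {
    param(
        [Parameter(Mandatory)] [object[]]    $InputObject,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
        [int] $MaxConcurrent = 10
    )

    $pool = [runspacefactory]::CreateRunspacePool(1, $MaxConcurrent)
    $pool.Open()
    try {
        # Start every item and remember the handle that BeginInvoke() returns
        $pending = foreach ($item in $InputObject) {
            $ps = [powershell]::Create()
            $ps.RunspacePool = $pool
            # The script block receives the item as its first argument
            [void]$ps.AddScript($ScriptBlock.ToString()).AddArgument($item)
            [pscustomobject]@{ Item = $item; PowerShell = $ps; Handle = $ps.BeginInvoke() }
        }

        # EndInvoke() blocks until that instance has finished and returns its output
        foreach ($p in $pending) {
            try {
                $output = $p.PowerShell.EndInvoke($p.Handle)
                [pscustomobject]@{ Item = $p.Item; Output = $output; Error = $null }
            }
            catch {
                # A terminating error inside the script block surfaces here
                [pscustomobject]@{ Item = $p.Item; Output = $null; Error = $_.Exception.Message }
            }
            finally {
                $p.PowerShell.Dispose()
            }
        }
    }
    finally {
        $pool.Close()
        $pool.Dispose()
    }
}

# Possible usage with the URL file from the example above (left commented out):
# $results = Invoke-InRunspacePool -InputObject (Get-Content "C:\Temp\urls.txt") -ScriptBlock {
#     param($url)
#     (Invoke-WebRequest -UseBasicParsing -Uri $url).StatusCode
# }
```

For a few hundred thousand URLs you would probably want to feed the list in chunks rather than all at once, since this sketch holds one PowerShell instance per pending item until it is collected.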
As noted in the linked ServerFault post, you can also use a more generic solution, like Invoke-Parallel, which basically does the above.