I would like to run a bunch of async tasks, with a limit on how many tasks may be pending completion at any given time.
Say you have 1000 URLs, and you only want to
As suggested, use TPL Dataflow.
A TransformBlock<TInput, TOutput> fits this scenario. You set MaxDegreeOfParallelism on its options to limit how many strings can be transformed (i.e., how many URLs can be downloaded) in parallel. You then post URLs to the block, and when you're done you tell the block that no more items are coming and fetch the responses.
// Input is the URL string, output is the downloaded HttpResponse.
var downloader = new TransformBlock<string, HttpResponse>(
    url => Download(url),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 50 }
);
var buffer = new BufferBlock<HttpResponse>();
downloader.LinkTo(buffer);

foreach (var url in urls)
    downloader.Post(url);
    //or await downloader.SendAsync(url);

// Signal that no more URLs will be posted, then wait for the work to drain.
downloader.Complete();
await downloader.Completion;

IList<HttpResponse> responses;
if (buffer.TryReceiveAll(out responses))
{
    //process responses
}
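To see the limit in action without touching the network, here is a minimal sketch of the same pipeline. FakeDownloadAsync, ThrottleDemo, and the example.invalid URLs are hypothetical stand-ins (not part of the original code), and string replaces HttpResponse so the sample is self-contained. It assumes the System.Threading.Tasks.Dataflow package is referenced and a compiler with C# 7.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ThrottleDemo
{
    private static int _inFlight; // calls currently running
    private static int _peak;     // highest value _inFlight has reached

    // Hypothetical stand-in for Download: waits briefly instead of doing I/O.
    private static async Task<string> FakeDownloadAsync(string url)
    {
        int now = Interlocked.Increment(ref _inFlight);
        int seen;
        // Raise _peak to 'now' unless another call already recorded a higher value.
        while (now > (seen = Volatile.Read(ref _peak)) &&
               Interlocked.CompareExchange(ref _peak, now, seen) != seen) { }

        await Task.Delay(20);
        Interlocked.Decrement(ref _inFlight);
        return "response for " + url;
    }

    public static async Task<(int Count, int Peak)> RunAsync(int urlCount, int limit)
    {
        _inFlight = 0;
        _peak = 0;

        var downloader = new TransformBlock<string, string>(
            url => FakeDownloadAsync(url),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = limit });
        var buffer = new BufferBlock<string>();
        downloader.LinkTo(buffer);

        var urls = Enumerable.Range(0, urlCount).Select(i => "http://example.invalid/" + i);
        foreach (var url in urls)
            downloader.Post(url);

        downloader.Complete();
        await downloader.Completion;

        IList<string> responses;
        int count = buffer.TryReceiveAll(out responses) ? responses.Count : 0;
        return (count, _peak);
    }

    public static async Task Main()
    {
        var (count, peak) = await RunAsync(200, 10);
        Console.WriteLine($"received {count} responses, peak concurrency {peak}");
    }
}
```

With a limit of 10, the reported peak should never exceed 10, while all 200 responses still end up in the buffer.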
Note: The TransformBlock buffers both its input and output. Why, then, do we need to link it to a BufferBlock?
Because the TransformBlock won't complete until all items (HttpResponse) have been consumed, and await downloader.Completion would hang. Instead, we let the downloader forward all its output to a dedicated buffer block; then we wait for the downloader to complete, and inspect the buffer block.
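The completion behaviour described in the note can be checked with a small experiment. This is only a sketch: CompletionDemo and the 500 ms / 5 s timeouts are arbitrary choices for illustration, and the block doubles integers instead of downloading anything.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class CompletionDemo
{
    public static async Task<(bool BeforeDrain, bool AfterDrain)> RunAsync()
    {
        var block = new TransformBlock<int, int>(n => n * 2);
        block.Post(1);
        block.Post(2);
        block.Complete();

        // Nothing is linked to the block, so both results stay in its output buffer.
        // Race Completion against a timeout instead of awaiting it directly.
        var first = await Task.WhenAny(block.Completion, Task.Delay(500));
        bool before = first == block.Completion;

        // Drain the output buffer by hand.
        while (await block.OutputAvailableAsync())
        {
            int item;
            block.TryReceive(out item);
        }

        // With the output consumed, Completion is now free to finish.
        var second = await Task.WhenAny(block.Completion, Task.Delay(5000));
        bool after = second == block.Completion;

        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await RunAsync();
        Console.WriteLine($"completed before drain: {before}, after drain: {after}");
    }
}
```

The first check is expected to time out and the second to succeed, which is why the downloader above is linked to a BufferBlock before its Completion is awaited.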