I have a question about implementing a pipeline using the TPL Dataflow library.
My case is that I have software that needs to process some tasks concurrently. Processing lo
You could use a TransformManyBlock for processing the albums, linked to an ActionBlock for processing the photos, so that each album is processed before its photos. To impose a concurrency limit that spans more than one block, you could use either a limited-concurrency TaskScheduler or a SemaphoreSlim. The second option is more flexible, since it can also throttle asynchronous operations. In your case all the operations are synchronous, so you are free to choose either approach. In both cases you should still set the MaxDegreeOfParallelism option of the blocks to the desired maximum concurrency limit; otherwise, if you make them unbounded, the order of processing will become much less predictable.
Here is an example of the TaskScheduler approach. It uses the ConcurrentScheduler property of the ConcurrentExclusiveSchedulerPair class:
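
    // Both blocks share one options instance, and therefore one limited-concurrency
    // scheduler, so at most 2 delegates (albums and photos combined) run at a time.
    var options = new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 2,
        TaskScheduler = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default,
            maxConcurrencyLevel: 2).ConcurrentScheduler
    };

    // Album and Photo are assumed names for your own types.
    var albumsBlock = new TransformManyBlock<Album, Photo>(album =>
    {
        ProcessAlbum(album);
        return album.Photos; // forwarded only after the album has been processed
    }, options);

    var photosBlock = new ActionBlock<Photo>(photo =>
    {
        ProcessPhoto(photo);
    }, options);

    albumsBlock.LinkTo(photosBlock);
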
And here is an example of the SemaphoreSlim approach. Using the WaitAsync method instead of Wait has the advantage that waiting to acquire the semaphore happens asynchronously, so no ThreadPool threads are needlessly blocked:

    // A single semaphore shared by both blocks caps the combined concurrency at 2.
    var throttler = new SemaphoreSlim(2);

    // Album and Photo are assumed names for your own types.
    var albumsBlock = new TransformManyBlock<Album, Photo>(async album =>
    {
        await throttler.WaitAsync();
        try
        {
            ProcessAlbum(album);
            return album.Photos;
        }
        finally { throttler.Release(); }
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

    var photosBlock = new ActionBlock<Photo>(async photo =>
    {
        await throttler.WaitAsync();
        try
        {
            ProcessPhoto(photo);
        }
        finally { throttler.Release(); }
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

    albumsBlock.LinkTo(photosBlock);
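
The snippets above stop at linking the blocks. Below is a minimal sketch of how the SemaphoreSlim variant might be fed, completed, and awaited end to end. The Album, Photo, Stats, and PipelineDemo types are hypothetical stand-ins, not your real classes, and a Thread.Sleep call simulates the synchronous work. The Stats class also records the peak number of operations running at once, so you can check that the shared limit of 2 holds across both blocks:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the real types in the question.
public sealed class Photo { public int Id { get; set; } }
public sealed class Album
{
    public int Id { get; set; }
    public List<Photo> Photos { get; } = new List<Photo>();
}

// Counts processed items and tracks the peak number of concurrent operations.
public sealed class Stats
{
    private readonly object _gate = new object();
    private int _running;
    public int Albums;
    public int Photos;
    public int MaxConcurrent;

    public void Work()
    {
        lock (_gate) { _running++; MaxConcurrent = Math.Max(MaxConcurrent, _running); }
        Thread.Sleep(20); // simulated synchronous processing
        lock (_gate) { _running--; }
    }
}

public static class PipelineDemo
{
    public static async Task<Stats> RunAsync()
    {
        var stats = new Stats();
        var throttler = new SemaphoreSlim(2); // limit shared by both blocks
        var blockOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 };

        var albumsBlock = new TransformManyBlock<Album, Photo>(async album =>
        {
            await throttler.WaitAsync();
            try
            {
                stats.Work();
                Interlocked.Increment(ref stats.Albums);
                return album.Photos;
            }
            finally { throttler.Release(); }
        }, blockOptions);

        var photosBlock = new ActionBlock<Photo>(async photo =>
        {
            await throttler.WaitAsync();
            try
            {
                stats.Work();
                Interlocked.Increment(ref stats.Photos);
            }
            finally { throttler.Release(); }
        }, blockOptions);

        // PropagateCompletion makes photosBlock complete once albumsBlock has
        // completed and all of its output has been consumed.
        albumsBlock.LinkTo(photosBlock, new DataflowLinkOptions { PropagateCompletion = true });

        // Feed 3 albums with 4 photos each, then signal that no more input is coming.
        for (int a = 0; a < 3; a++)
        {
            var album = new Album { Id = a };
            for (int p = 0; p < 4; p++) album.Photos.Add(new Photo { Id = a * 10 + p });
            albumsBlock.Post(album);
        }
        albumsBlock.Complete();

        await photosBlock.Completion; // waits for the whole pipeline to drain
        return stats;
    }

    public static async Task Main()
    {
        Stats stats = await RunAsync();
        Console.WriteLine($"albums={stats.Albums}, photos={stats.Photos}, " +
                          $"peak concurrency={stats.MaxConcurrent}");
    }
}
```

Awaiting photosBlock.Completion only works here because the link propagates completion. Without that option you would have to complete photosBlock yourself after albumsBlock.Completion finishes. The reported peak concurrency should never exceed 2, even though each block on its own is allowed 2 parallel operations.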