I have a question about implementing a pipeline using the TPL Dataflow library.
My case is that I have software that needs to process some tasks concurrently. Processing lo
You could use a TransformManyBlock for processing the albums, linked to an ActionBlock for processing the photos, so that each album is processed before its photos. To impose a concurrency limit that spans more than one block, you could use either a limited-concurrency TaskScheduler or a SemaphoreSlim. The second option is more flexible, since it can throttle asynchronous operations as well. In your case all the operations are synchronous, so you are free to choose either approach. In both cases you should still set the MaxDegreeOfParallelism option of the blocks to the desired maximum concurrency limit; otherwise, if you make them unbounded, the order of processing will become too random.
Here is an example of the TaskScheduler approach. It uses the ConcurrentScheduler property of the ConcurrentExclusiveSchedulerPair class:
    var options = new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 2,
        TaskScheduler = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default,
            maxConcurrencyLevel: 2).ConcurrentScheduler
    };
    var albumsBlock = new TransformManyBlock<Album, Photo>(album =>
    {
        ProcessAlbum(album);
        return album.Photos;
    }, options);
    var photosBlock = new ActionBlock<Photo>(photo =>
    {
        ProcessPhoto(photo);
    }, options);
    albumsBlock.LinkTo(photosBlock);
And here is an example of the SemaphoreSlim approach. Using the WaitAsync method instead of Wait has the advantage that waiting to acquire the semaphore happens asynchronously, so no ThreadPool threads are needlessly blocked:
    var throttler = new SemaphoreSlim(2);
    var albumsBlock = new TransformManyBlock<Album, Photo>(async album =>
    {
        await throttler.WaitAsync();
        try
        {
            ProcessAlbum(album);
            return album.Photos;
        }
        finally { throttler.Release(); }
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });
    var photosBlock = new ActionBlock<Photo>(async photo =>
    {
        await throttler.WaitAsync();
        try
        {
            ProcessPhoto(photo);
        }
        finally { throttler.Release(); }
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });
    albumsBlock.LinkTo(photosBlock);
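Neither snippet shows how albums enter the pipeline or how to wait for it to finish. A minimal sketch of that part, based on the TaskScheduler variant, might look like the following. The Album and Photo classes, the bodies of ProcessAlbum and ProcessPhoto, and the PipelineDemo and RunAsync names are hypothetical stand-ins, not taken from the question. The link is created with PropagateCompletion = true (an addition to the snippets above) so that completing the first block eventually completes the second.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the question's types.
public class Photo { public string Name = ""; }
public class Album { public string Name = ""; public List<Photo> Photos = new List<Photo>(); }

public static class PipelineDemo
{
    // Placeholder work; the real processing logic is not shown in the question.
    static void ProcessAlbum(Album album) => Console.WriteLine($"Album {album.Name}");
    static void ProcessPhoto(Photo photo) => Console.WriteLine($"Photo {photo.Name}");

    // Feeds the albums through both blocks and returns the number of photos processed.
    public static async Task<int> RunAsync(IEnumerable<Album> albums)
    {
        int processedPhotos = 0;
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 2,
            TaskScheduler = new ConcurrentExclusiveSchedulerPair(
                TaskScheduler.Default, maxConcurrencyLevel: 2).ConcurrentScheduler
        };
        var albumsBlock = new TransformManyBlock<Album, Photo>(album =>
        {
            ProcessAlbum(album);
            return album.Photos;
        }, options);
        var photosBlock = new ActionBlock<Photo>(photo =>
        {
            ProcessPhoto(photo);
            Interlocked.Increment(ref processedPhotos);
        }, options);

        // Propagate completion so that photosBlock completes after albumsBlock has drained.
        albumsBlock.LinkTo(photosBlock, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var album in albums) albumsBlock.Post(album);
        albumsBlock.Complete();        // no more albums will be posted
        await photosBlock.Completion;  // wait until every photo has been processed
        return processedPhotos;
    }
}
```

Calling `await PipelineDemo.RunAsync(albums)` from an async method runs the whole pipeline. Without PropagateCompletion you would have to call photosBlock.Complete() yourself once albumsBlock.Completion has finished.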
I think I can answer this myself. Here is what I did:
1) I created an interface IProcessor with a method Process().
2) I wrapped AlbumProcessing and PhotoProcessing with the IProcessor interface.
3) I created one ActionBlock that takes an IProcessor as input and executes its Process method.
4) At the end of processing an album, I add the processing of all its photos to the same ActionBlock.
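The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the AlbumProcessor, PhotoProcessor and ProcessorQueue classes, the use of plain strings for albums and photos, and the pending-item counter used to decide when the block can be completed are all hypothetical and not part of the original solution.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public interface IProcessor { void Process(); }

// Wraps a single ActionBlock that runs any IProcessor with limited concurrency.
public sealed class ProcessorQueue
{
    private readonly ActionBlock<IProcessor> _block;
    private int _pending = 1;   // starts at 1: a guard held by the producer until Close()
    private int _processed;

    public ProcessorQueue(int maxConcurrency)
    {
        _block = new ActionBlock<IProcessor>(processor =>
        {
            processor.Process();   // may enqueue follow-up work (photos of an album)
            Interlocked.Increment(ref _processed);
            Release();
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxConcurrency });
    }

    public int Processed => Volatile.Read(ref _processed);
    public Task Completion => _block.Completion;

    public void Enqueue(IProcessor processor)
    {
        Interlocked.Increment(ref _pending);   // count the item before it is posted
        _block.Post(processor);
    }

    // Call once after all top-level items (albums) have been enqueued.
    public void Close() => Release();

    private void Release()
    {
        // Complete the block only when nothing is queued or running anymore.
        if (Interlocked.Decrement(ref _pending) == 0) _block.Complete();
    }
}

public sealed class PhotoProcessor : IProcessor
{
    private readonly string _photo;
    public PhotoProcessor(string photo) => _photo = photo;
    public void Process() => Console.WriteLine($"Processing photo {_photo}");
}

public sealed class AlbumProcessor : IProcessor
{
    private readonly string _album;
    private readonly string[] _photos;
    private readonly ProcessorQueue _queue;

    public AlbumProcessor(string album, string[] photos, ProcessorQueue queue)
    {
        _album = album; _photos = photos; _queue = queue;
    }

    public void Process()
    {
        Console.WriteLine($"Processing album {_album}");
        // Step 4: once the album is done, add its photos to the same block.
        foreach (var photo in _photos) _queue.Enqueue(new PhotoProcessor(photo));
    }
}
```

To use it, create a `ProcessorQueue`, call `Enqueue` with one `AlbumProcessor` per album, call `Close()`, and then await `Completion`. The counter is needed because a block that feeds itself cannot simply be completed right after the albums are posted: photos enqueued later would be rejected.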
This fulfills my requirements 100%. Maybe someone has another solution?