Question
I need to build a TPL Dataflow pipeline that will process a lot of messages. Because there are so many messages, I cannot simply Post them all into the unbounded queue of a BufferBlock, or I will run into memory issues. So I use the BoundedCapacity = 1 option to disable queuing, and MaxDegreeOfParallelism to process messages in parallel, since each of my TransformBlocks can take some time per message. I also use PropagateCompletion so that both completion and failure propagate down the pipeline.
But I'm facing a problem with error handling when an error happens right after the first message: the call to await SendAsync simply puts my app into an infinite wait.
I've reduced my case to this sample console app:
var data_buffer = new BufferBlock<int>(new DataflowBlockOptions
{
    BoundedCapacity = 1
});

var process_block = new ActionBlock<int>(x =>
{
    throw new InvalidOperationException();
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 2,
    BoundedCapacity = 1
});

data_buffer.LinkTo(process_block,
    new DataflowLinkOptions { PropagateCompletion = true });

for (var k = 1; k <= 5; k++)
{
    await data_buffer.SendAsync(k);
    Console.WriteLine("Send: {0}", k);
}

data_buffer.Complete();
await process_block.Completion;
Answer 1:
This is expected behavior. If there's a fault "downstream", the error does not propagate "backwards" up the mesh. The mesh expects you to detect that fault (e.g., via process_block.Completion) and resolve it.
If you want to propagate errors backwards, you can await process_block.Completion (or attach a continuation to it) and fault the upstream block(s) when the downstream block(s) fault.
Note that this is not the only possible solution; you may want to rebuild that part of the mesh or link the sources to an alternative target. The source block(s) have not faulted, so they can just continue processing with a repaired mesh.
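One way to apply this advice to the original sample (a sketch of my own, not code from the answer): race each SendAsync against the downstream block's Completion with Task.WhenAny, so the producer notices the fault instead of waiting forever on a full buffer that nobody will ever drain.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class Program
{
    public static async Task Main()
    {
        var data_buffer = new BufferBlock<int>(new DataflowBlockOptions
        {
            BoundedCapacity = 1
        });
        var process_block = new ActionBlock<int>(x =>
        {
            throw new InvalidOperationException();
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 2,
            BoundedCapacity = 1
        });

        data_buffer.LinkTo(process_block,
            new DataflowLinkOptions { PropagateCompletion = true });

        for (var k = 1; k <= 5; k++)
        {
            var send = data_buffer.SendAsync(k);
            // Wake up as soon as either the send succeeds or the consumer faults.
            var first = await Task.WhenAny(send, process_block.Completion);
            if (first == process_block.Completion)
                break; // downstream is dead; stop producing
            Console.WriteLine("Send: {0}", k);
        }

        data_buffer.Complete();
        try { await process_block.Completion; }
        catch (InvalidOperationException) { Console.WriteLine("Downstream faulted"); }
    }
}
```

This only stops the producer; the items already buffered in data_buffer are still stranded there, so detecting the fault and then repairing or draining the mesh is still up to you, as the answer says.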
Answer 2:
Unfortunately there is no built-in way to propagate the completion backwards, by just configuring the blocks. It must be done manually. One approach is to establish a backward propagation link for each forward propagation link. It is fast and easy when you have a small pipeline consisting of 2-3 blocks, but it becomes more cumbersome and error-prone as the pipeline grows longer:
data_buffer.LinkTo(process_block,
    new DataflowLinkOptions { PropagateCompletion = true });
PropagateFailure(process_block, data_buffer); // Propagate backwards

public static async void PropagateFailure(IDataflowBlock block1, IDataflowBlock block2)
{
    try { await block1.Completion.ConfigureAwait(false); } catch { }
    if (block1.Completion.IsFaulted) block2.Fault(block1.Completion.Exception);
}
The same idea with a more integrated API:
// Extension methods must be declared inside a non-nested static class.
public static async void BidirectionalLinkTo<T>(this ISourceBlock<T> source,
    ITargetBlock<T> target)
{
    source.LinkTo(target, new DataflowLinkOptions { PropagateCompletion = true });
    try { await target.Completion.ConfigureAwait(false); } catch { }
    if (target.Completion.IsFaulted) source.Fault(target.Completion.Exception);
}
data_buffer.BidirectionalLinkTo(process_block);
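For a longer pipeline the extension keeps the wiring terse: one call per link replaces a LinkTo plus a PropagateFailure pair. Here is a hypothetical three-stage chain (parse_block and print_block are placeholder names, not from the original answer; the extension from above is repeated so the sample compiles on its own):

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowExtensions
{
    // Same extension as above, repeated for self-containment.
    public static async void BidirectionalLinkTo<T>(this ISourceBlock<T> source,
        ITargetBlock<T> target)
    {
        source.LinkTo(target, new DataflowLinkOptions { PropagateCompletion = true });
        try { await target.Completion.ConfigureAwait(false); } catch { }
        if (target.Completion.IsFaulted) source.Fault(target.Completion.Exception);
    }
}

public class Pipeline
{
    public static async Task Main()
    {
        var data_buffer = new BufferBlock<string>(new DataflowBlockOptions
            { BoundedCapacity = 1 });
        var parse_block = new TransformBlock<string, int>(s => int.Parse(s),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
        var print_block = new ActionBlock<int>(x => Console.WriteLine(x),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        // Forward completion and backward failure on every link.
        data_buffer.BidirectionalLinkTo(parse_block);
        parse_block.BidirectionalLinkTo(print_block);

        await data_buffer.SendAsync("not a number"); // parse_block will fault
        try { await data_buffer.Completion; }
        catch { Console.WriteLine("Fault reached the head of the pipeline"); }
    }
}
```

When parse_block faults on the bad input, the backward link faults data_buffer too, so awaiting the head block's Completion observes the failure instead of hanging.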
Another approach is to ensure that the whole pipeline is cancelled in case any block fails. This can be done by configuring all blocks with a CancellationToken from the same source, and attaching a handler to each block's completion that cancels the source:
var cts = new CancellationTokenSource();

var data_buffer = new BufferBlock<int>(new DataflowBlockOptions
{
    BoundedCapacity = 1,
    CancellationToken = cts.Token
});
//...more blocks configured with the same cts.Token

OnErrorCancel(data_buffer, cts);
OnErrorCancel(process_block, cts);
//...

async void OnErrorCancel(IDataflowBlock block, CancellationTokenSource cts)
{
    try { await block.Completion.ConfigureAwait(false); } catch { }
    if (block.Completion.IsFaulted) cts.Cancel();
}
What makes this solution less appealing is that creating a CancellationTokenSource also creates the obligation to Dispose it, which is not always trivial to do.
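One minimal way to meet that obligation (my own sketch, not part of the original answer): dispose the source in a finally block after awaiting the completion of every block that was configured with its token, so no block can still observe the token after disposal.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class CtsDemo
{
    public static async Task Main()
    {
        var cts = new CancellationTokenSource();
        var block = new ActionBlock<int>(x => Console.WriteLine(x),
            new ExecutionDataflowBlockOptions { CancellationToken = cts.Token });
        try
        {
            block.Post(1);
            block.Complete();
            try { await block.Completion; } catch { }
        }
        finally
        {
            // Safe only after every block using cts.Token has completed.
            cts.Dispose();
        }
    }
}
```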
Source: https://stackoverflow.com/questions/21603428/tpl-dataflow-exception-in-transform-block-with-bounded-capacity