TPL DataFlow propagate completion only when all data has been processed

Submitted by ℡╲_俬逩灬. on 2020-06-29 05:26:14

Question


I have a producer sending data to a BufferBlock and when all data has been read from the source, it calls Complete().

The default behaviour is that when Complete() is called, the completion is propagated down the pipeline even if the buffer still has messages.

Is there a way to tell a block: propagate the completion only once your buffer is empty?

When the completion occurs, I get an exception on Receive: InvalidOperationException: 'The source completed without providing data to receive.'

I am currently using:

var bufferBlock = new BufferBlock<string>();

var transformBlock = new TransformBlock<string, string>(s =>
{
    Thread.Sleep(50);

    return s;
});

bufferBlock.LinkTo(transformBlock, new DataflowLinkOptions { PropagateCompletion = true });

foreach (var i in Enumerable.Range(0, 10))
    bufferBlock.Post(i.ToString());

bufferBlock.Complete();

while (!transformBlock.Completion.IsCompleted)
    Console.WriteLine(transformBlock.Receive());

To avoid it I am currently using:

while (bufferBlock.Count > 0)
    await Task.Delay(100);

bufferBlock.Complete();

which does not sound like a really clean solution.

Is it a race condition? I.e., is the block flagged as not completed and then completing while I call Receive?

I guess I could replace the !transformBlock.Completion.IsCompleted check with transformBlock.OutputAvailableAsync(). Is that right?


Answer 1:


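To await completion of a pipeline, you should await the Completion task of the last block in the pipeline. Note that a block that produces output, such as a TransformBlock, completes only after its output has been consumed, so the last block should be a consumer such as an ActionBlock, or the output has to be received separately. In this case you should change your code to: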

foreach (var i in Enumerable.Range(0, 10))
    bufferBlock.Post(i.ToString());

bufferBlock.Complete();

await transformBlock.Completion;

This is demonstrated in the "Completing a pipeline" and "Waiting for the pipeline to finish" paragraphs of Walkthrough: Creating a Dataflow Pipeline.
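As a minimal sketch of awaiting the last block, the pipeline from the question can be extended with a consumer. The ActionBlock, the `PipelineDemo` class and the `RunAsync` helper are assumptions added for illustration and are not part of the original code; the sketch also assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineDemo
{
    // Hypothetical helper: runs the pipeline and returns what the last block consumed.
    public static async Task<List<string>> RunAsync(int count)
    {
        var results = new List<string>();

        var bufferBlock = new BufferBlock<string>();

        var transformBlock = new TransformBlock<string, string>(async s =>
        {
            await Task.Delay(50); // simulated work
            return s;
        });

        // Assumption: a terminal consumer, so that the TransformBlock's output is
        // drained and completion can reach the end of the pipeline.
        // An ActionBlock processes one message at a time by default,
        // so adding to the list here is not concurrent.
        var consumerBlock = new ActionBlock<string>(s => results.Add(s));

        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
        bufferBlock.LinkTo(transformBlock, linkOptions);
        transformBlock.LinkTo(consumerBlock, linkOptions);

        foreach (var i in Enumerable.Range(0, count))
            bufferBlock.Post(i.ToString());

        // Signal that no more input is coming.
        bufferBlock.Complete();

        // Completes only after every posted message has flowed through all blocks.
        await consumerBlock.Completion;

        return results;
    }

    public static async Task Main()
    {
        foreach (var item in await RunAsync(10))
            Console.WriteLine(item);
    }
}
```

Awaiting the consumer's Completion, rather than polling IsCompleted, avoids the window in which the last item has been taken but the block has not yet been flagged as completed.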

TransformBlock already has its own input buffer, which means anything posted to the input BufferBlock will be sent on to the TransformBlock immediately. It would be better to use a different block for testing purposes. The walkthrough shows a nice example: one TransformBlock to download page contents, another to parse them, etc.

Just be careful of various unfortunate coding practices in it, like creating a new HttpClient instance each time. The downloader could be changed to:

var httpClient = new HttpClient();
var downloadString = new TransformBlock<string, string>(async uri =>
{
    Console.WriteLine("Downloading '{0}'...", uri);

    return await httpClient.GetStringAsync(uri);
});



Answer 2:


Yes, the correct way to retrieve messages from a block manually is by using the OutputAvailableAsync method, in combination with TryReceive:

while (await transformBlock.OutputAvailableAsync())
{
    while (transformBlock.TryReceive(out var item))
    {
        Console.WriteLine(item);
    }
}
await transformBlock.Completion; // Required to propagate exceptions

Properties such as BufferBlock.Count and TransformBlock.OutputCount are only suitable for monitoring and statistics. Using them to control the flow of data is in most cases an indication of possible race conditions and lurking bugs.
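For reference, here is a sketch of the receive loop above applied to the pipeline from the question, without any Count-based polling. The `DrainDemo` class and the `DrainAsync` and `RunAsync` helpers are hypothetical names introduced for this example, and the System.Threading.Tasks.Dataflow package is assumed to be referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DrainDemo
{
    // Hypothetical helper: receives everything a block produces until it completes.
    public static async Task<List<string>> DrainAsync(IReceivableSourceBlock<string> source)
    {
        var items = new List<string>();

        // OutputAvailableAsync yields false once the block has completed
        // and has no more output, so the loop ends without throwing.
        while (await source.OutputAvailableAsync())
        {
            while (source.TryReceive(out var item))
                items.Add(item);
        }

        // Awaiting Completion surfaces any exception thrown inside the block.
        await source.Completion;

        return items;
    }

    // Hypothetical helper: the setup from the question, drained manually.
    public static async Task<List<string>> RunAsync(int count)
    {
        var bufferBlock = new BufferBlock<string>();

        var transformBlock = new TransformBlock<string, string>(async s =>
        {
            await Task.Delay(50); // simulated work
            return s;
        });

        bufferBlock.LinkTo(transformBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var i in Enumerable.Range(0, count))
            bufferBlock.Post(i.ToString());

        // Complete() can be called right away; no need to wait for Count to reach 0.
        bufferBlock.Complete();

        return await DrainAsync(transformBlock);
    }

    public static async Task Main()
    {
        foreach (var item in await RunAsync(10))
            Console.WriteLine(item);
    }
}
```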



Source: https://stackoverflow.com/questions/51541425/tpl-dataflow-propagate-completion-only-when-all-data-has-been-processed
