Question
I have a producer sending data to a BufferBlock, and when all data has been read from the source, it calls Complete().
The default behaviour is that when the completion is called, even if the buffer still has messages, it propagates the completion down the pipeline.
Is there a way to tell a block: propagate the completion only once your buffer is empty?
When the completion occurs, I get an exception on Receive: InvalidOperationException: 'The source completed without providing data to receive.'
I am currently using:
    var bufferBlock = new BufferBlock<string>();
    var transformBlock = new TransformBlock<string, string>(s =>
    {
        Thread.Sleep(50);
        return s;
    });
    bufferBlock.LinkTo(transformBlock, new DataflowLinkOptions { PropagateCompletion = true });
    foreach (var i in Enumerable.Range(0, 10))
        bufferBlock.Post(i.ToString());
    bufferBlock.Complete();
    while (!transformBlock.Completion.IsCompleted)
        Console.WriteLine(transformBlock.Receive());
To avoid it I am currently using:
    while (bufferBlock.Count > 0)
        await Task.Delay(100);
    bufferBlock.Complete();
which does not sound like a really clean solution.
Is it a race condition, i.e. the block reporting that it is not completed and then completing while I call Receive?
I guess I could replace !transformBlock.Completion.IsCompleted with block.OutputAvailableAsync. Is that right?
Answer 1:
To await completion of a pipeline, you should await the Completion task of the last block in the pipeline. In this case you should change your code to:
    foreach (var i in Enumerable.Range(0, 10))
        bufferBlock.Post(i.ToString());
    bufferBlock.Complete();
    await transformBlock.Completion;
This is demonstrated in the "Completing a pipeline" and "Waiting for the pipeline to finish" paragraphs of "Walkthrough: Creating a Dataflow Pipeline".
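One detail matters when applying this to the question's code: a TransformBlock's Completion task does not finish until its output buffer has been drained, so awaiting it only works if something consumes the output. A minimal sketch, assuming a terminal ActionBlock is appended to the pipeline (the names sinkBlock, PipelineDemo and RunAsync are hypothetical and not part of the original code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineDemo
{
    // Runs the question's pipeline with an added terminal block and
    // returns the items that block consumed.
    public static async Task<List<string>> RunAsync()
    {
        var results = new List<string>();

        var bufferBlock = new BufferBlock<string>();
        var transformBlock = new TransformBlock<string, string>(s =>
        {
            Thread.Sleep(50); // simulated work, as in the question
            return s;
        });

        // Hypothetical terminal block: consumes everything the transform emits.
        // ActionBlock processes one item at a time by default, so the list
        // is not accessed concurrently.
        var sinkBlock = new ActionBlock<string>(s => results.Add(s));

        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
        bufferBlock.LinkTo(transformBlock, linkOptions);
        transformBlock.LinkTo(sinkBlock, linkOptions);

        foreach (var i in Enumerable.Range(0, 10))
            bufferBlock.Post(i.ToString());

        // Signal that no more input will arrive, then wait for the LAST block.
        bufferBlock.Complete();
        await sinkBlock.Completion;

        return results;
    }

    public static async Task Main()
    {
        foreach (var item in await RunAsync())
            Console.WriteLine(item);
    }
}
```

Because completion is awaited on the block that actually consumes the data, no polling of Count or IsCompleted is needed.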
A TransformBlock already has its own input buffer, which means anything posted to the input BufferBlock will be sent to the TransformBlock immediately. It would be better to use a different block for testing purposes. The walkthrough shows a nice example: one TransformBlock to download page contents, another to parse them, and so on.
Just be careful of some unfortunate coding practices in it, like creating a new HttpClient instance each time. The downloader could be changed to:
    var httpClient = new HttpClient();
    var downloadString = new TransformBlock<string, string>(async uri =>
    {
        Console.WriteLine("Downloading '{0}'...", uri);
        return await httpClient.GetStringAsync(uri);
    });
Answer 2:
Yes, the correct way to retrieve messages from a block manually is to use the OutputAvailableAsync method in combination with TryReceive:
    while (await transformBlock.OutputAvailableAsync())
    {
        while (transformBlock.TryReceive(out var item))
        {
            Console.WriteLine(item);
        }
    }
    await transformBlock.Completion; // Required to propagate exceptions
The properties BufferBlock.Count, TransformBlock.OutputCount etc. are only suitable for monitoring and statistics. Using them to control the flow of data is in most cases an indication of possible race conditions and lurking bugs.
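Putting this together with the pipeline from the question, a complete program might look like the sketch below. It assumes the same two blocks as the question; the names DrainDemo and DrainAsync are hypothetical. Complete() can be called right after posting: the BufferBlock's own Completion task only finishes once its buffer has been emptied, so the propagated completion should not overtake buffered items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DrainDemo
{
    // Builds the question's pipeline and drains the TransformBlock manually.
    public static async Task<List<string>> DrainAsync()
    {
        var received = new List<string>();

        var bufferBlock = new BufferBlock<string>();
        var transformBlock = new TransformBlock<string, string>(s =>
        {
            Thread.Sleep(50); // simulated work, as in the question
            return s;
        });
        bufferBlock.LinkTo(transformBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var i in Enumerable.Range(0, 10))
            bufferBlock.Post(i.ToString());
        bufferBlock.Complete(); // no polling of bufferBlock.Count needed

        // OutputAvailableAsync yields false only when the block has completed
        // and no further output will ever be available.
        while (await transformBlock.OutputAvailableAsync())
        {
            // TryReceive never throws on an empty block; it just returns false.
            while (transformBlock.TryReceive(out var item))
                received.Add(item);
        }

        await transformBlock.Completion; // surfaces exceptions, if any
        return received;
    }

    public static async Task Main()
    {
        foreach (var item in await DrainAsync())
            Console.WriteLine(item);
    }
}
```

Unlike the IsCompleted/Receive loop, there is no window here in which the block can complete between the check and the read, because a failed TryReceive simply ends the inner loop.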
Source: https://stackoverflow.com/questions/51541425/tpl-dataflow-propagate-completion-only-when-all-data-has-been-processed