Avoid shutting down an entire data flow network when one block is faulted

Question

I am using DataflowEx and am wondering how I can avoid shutting down an entire dataflow network if an exception is thrown.

I have a system where tasks will come in at random times, and I want the network to log failures, abandon that particular task and continue with execution of the others.

In reading the documentation on both TPL Dataflow and DataflowEx, I found statements like these:

It [a faulted block] should decline any further incoming messages. Here

DataflowEx takes a fast-fail approach on exception handling just like TPL Dataflow. When an exception is thrown, the low-level block ends to the Faulted state first. Then the Dataflow instance who is the parent of the failing block gets notified. It will immediately propagate the fatal error: notify its other children to shutdown immediately. After all its children is done/completed, the parent Dataflow also comes to its completion, with the original exception wrapped in the CompletionTask whose status is also Faulted. Here

It almost seems like a block moving on from a failure is not intended...

My flows include a lot of file I/O, and I am expecting the occasional exception to occur (network volumes going offline during read/write, connection failures, permission issues...)

I don't want the entire pipeline to die.

Here is an example of the code I'm working with:

using Gridsum.DataflowEx;
using System;
using System.IO;
using System.Threading.Tasks.Dataflow;

namespace DataManagementSystem.Data.Pipeline.Actions
{
    class CopyFlow : Dataflow<FileInfo, FileInfo>
    {
        private TransformBlock<FileInfo, FileInfo> Copier;
        private string destination;

        public CopyFlow(string destination) : base(DataflowOptions.Default)
        {
            this.destination = destination;

            Copier = new TransformBlock<FileInfo, FileInfo>(f => Copy(f));

            RegisterChild(Copier);            
        }

        public override ITargetBlock<FileInfo> InputBlock { get { return Copier; } }

        public override ISourceBlock<FileInfo> OutputBlock { get { return Copier; } }

        protected virtual FileInfo Copy(FileInfo file)
        {
            try
            {
                return file.CopyTo(Path.Combine(destination, file.Name));
            }
            catch(Exception ex)
            {
                //Log the exception
                //Abandon this unit of work
                //resume processing subsequent units of work
            }

        }
    }
}

Here is how I'm sending work to the pipeline:

var result = pipeline.ProcessAsync(new[] { file1, file2 }).Result;

Answer 1:

A block becomes faulted if its delegate throws an unhandled exception. If you do not want the pipeline to fail, you can either not propagate completion or handle the exception yourself. Handling the exception can take many forms, but it sounds like all you need is a simple retry. You could use a try/catch and implement your own retry loop, or use something like Polly. A simple example is shown below.

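// Requires the Polly NuGet package (using Polly;).
public void BuildPipeline() {
    var waitTime = TimeSpan.FromSeconds(1);
    // Retry up to 3 times on IOException, waiting 1 second between attempts.
    var retryPolicy = Policy.Handle<IOException>()
                            .WaitAndRetryAsync(3, i => waitTime);
    // FileIOAsync stands in for the actual file operation.
    var fileIOBlock = new ActionBlock<string>(async fileName =>
        await retryPolicy.ExecuteAsync(async () => await FileIOAsync(fileName)));
}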

Note: this code was not tested but should get you in the right direction.
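
If you would rather not take a dependency on Polly, the hand-rolled retry loop mentioned above could look roughly like this. It is only a sketch: the `Retry` class and `WithRetriesAsync` helper are hypothetical names, and retrying only on `IOException` is an assumption carried over from the Polly example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class Retry
{
    // Runs the operation, retrying up to maxRetries times when it throws
    // an IOException and waiting 'delay' between attempts.
    // Once the retries are used up, the last IOException propagates to the caller.
    public static async Task<T> WithRetriesAsync<T>(
        Func<Task<T>> operation, int maxRetries, TimeSpan delay)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (IOException) when (attempt < maxRetries)
            {
                // Transient failure: wait, then try again.
                await Task.Delay(delay);
            }
        }
    }
}
```

Note that if the operation still fails after the last retry, the exception leaves the helper. Inside a block delegate that would fault the block, so you would still want a catch around the call if the block has to keep running.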

Edit

You almost have everything you need. Once you catch the exception and log it, you can return null or some other marker value that you then filter out of the pipeline by linking to a NullTarget. The code below makes the NullTarget filtering link the first link on the Copier, so nulls never make it to your actual destination.

class CopyFlow : Dataflow<FileInfo, FileInfo> {
    private TransformBlock<FileInfo, FileInfo> Copier;
    private string destination;

    public CopyFlow(string destination) : base(DataflowOptions.Default) {
        this.destination = destination;

        Copier = new TransformBlock<FileInfo, FileInfo>(f => Copy(f));
        Copier.LinkTo(DataflowBlock.NullTarget<FileInfo>(), info => info == null);

        RegisterChild(Copier);
    }

    public override ITargetBlock<FileInfo> InputBlock { get { return Copier; } }

    public override ISourceBlock<FileInfo> OutputBlock { get { return Copier; } }

    protected virtual FileInfo Copy(FileInfo file) {
        try {
            return file.CopyTo(Path.Combine(destination, file.Name));
        } catch(Exception ex) {
            //Log the exception
            //Abandon this unit of work
            //resume processing subsequent units of work
            return null;
        }

    }
}
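
To see the same idea end to end without DataflowEx, here is a small sketch using plain TPL Dataflow blocks. The `SkipFailuresDemo` class and `RunAsync` method are hypothetical names, and `int.Parse` is only a stand-in for the file copy: it throws on bad input the way `CopyTo` might throw on an I/O error.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SkipFailuresDemo
{
    // Stand-in for the real work: returns null instead of throwing,
    // so the block that calls it never enters the Faulted state.
    private static int? TryParse(string s)
    {
        try
        {
            return int.Parse(s);
        }
        catch (FormatException ex)
        {
            Console.Error.WriteLine($"Skipping '{s}': {ex.Message}");
            return null; // marker for an abandoned unit of work
        }
    }

    public static async Task<List<int>> RunAsync(IEnumerable<string> inputs)
    {
        var results = new List<int>();

        var worker = new TransformBlock<string, int?>(s => TryParse(s));
        var sink = new ActionBlock<int?>(r => results.Add(r.Value));

        // Links are tried in the order they were added, so the null filter
        // goes first and failed items are discarded before reaching the sink.
        worker.LinkTo(DataflowBlock.NullTarget<int?>(), r => r == null);
        worker.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var s in inputs)
        {
            worker.Post(s);
        }

        worker.Complete();
        await sink.Completion;
        return results;
    }
}
```

Calling `SkipFailuresDemo.RunAsync(new[] { "1", "oops", "3" })` should return a list containing 1 and 3: the bad item is logged and dropped, and the items after it are still processed.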

Source: https://stackoverflow.com/questions/47147341/avoid-shutting-down-an-entire-data-flow-network-when-one-block-is-faulted

Tags

tpl-dataflow