Howto: Parallel.Foreach executes many processes, after each process run a new process (but one at a time)? [closed]

Question

I'm sure someone knows this and I will be very thankful for the answer. I don't know much about delegates, async and the like, so please give me a general example of how I could implement this.

I have a workflow where I can use Parallel.ForEach to execute a method for many different files at the same time (sweet, grind that processor). However, after that method ends for a file I need to run another method (it generates a report on the previous process), and this second method cannot be run in parallel.

I don't want to wait for all the files in the Parallel.ForEach to finish before generating reports (that's not necessary). But if I start the report generation method as the first method ends then I run into problems. Is there some kind of queue or something? There's gotta be some pretty way of doing it, right?

Thanks

Answer 1:

I think what Jim G means is:

var lockObj = new object();

Parallel.ForEach(files, file =>
{
    // Processing file
    lock(lockObj)
    {
        // Generate report.
    }
});

Answer 2:

The second method should be chained as a continuation task.

Within the second method, use a lock or mutex to ensure that it does not run in parallel.
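One possible reading of this suggestion, as a minimal sketch rather than a definitive implementation: each file is processed in its own task, and the report step is attached with ContinueWith and guarded by a lock. ProcessFile, GenerateReport, the ContinuationSketch class and the string-based file and result types are all hypothetical stand-ins for the asker's real code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Guards report generation so that only one report runs at a time.
    private static readonly object ReportLock = new object();

    // Hypothetical stand-in for the step that is safe to run in parallel.
    private static string ProcessFile(string file)
    {
        return file.ToUpperInvariant();
    }

    // Hypothetical stand-in for the step that must not run concurrently.
    private static void GenerateReport(string result, List<string> reports)
    {
        reports.Add("report:" + result);
    }

    public static List<string> Run(IEnumerable<string> files)
    {
        var reports = new List<string>();

        // One processing task per file, each followed by its own continuation.
        Task[] tasks = files
            .Select(file => Task.Run(() => ProcessFile(file))
                .ContinueWith(t =>
                {
                    // Continuations may start concurrently; the lock serializes them.
                    lock (ReportLock)
                    {
                        GenerateReport(t.Result, reports);
                    }
                }))
            .ToArray();

        // Block until every file has been processed and reported on.
        Task.WaitAll(tasks);
        return reports;
    }
}
```

Each report starts as soon as its own file is done, without waiting for the other files, but the order in which reports are generated is not guaranteed.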

Answer 3:

The Arbiter.Interleave coordination primitive in the Concurrency and Coordination Runtime (CCR) provides a simple way of achieving the functionality you want. Basically, you pass it three receiver groups: one for concurrent tasks, one for exclusive tasks (not executed in parallel) and one for shutting down the entire process. You can find an example of how to use it here.

Answer 4:

Another option is to use the producer-consumer model. You put the finished data into a thread-safe blocking collection, and a single thread runs the reports, pulling data from that collection.

// Collection that holds the data produced from the processed files
var processedDataItems = new BlockingCollection<ResultData>();

// A single task that builds the reports, one at a time
var processReports = new Task(() =>
{
    // GetConsumingEnumerable removes items from the collection. It blocks while
    // the collection is empty, and ends once CompleteAdding has been called
    // and the remaining items have been consumed.
    foreach (var processedData in processedDataItems.GetConsumingEnumerable())
    {
        BuildReport(processedData);
    }
});
processReports.Start();

// Generating the data in parallel
Parallel.ForEach(files, file =>
{
    var processedData = ProcessFile(file);
    processedDataItems.Add(processedData);
});

// Tell anyone consuming the collection that no more items will be added.
processedDataItems.CompleteAdding();

Answer 5:

Things like this fit nicely into the model of TPL Dataflow: you create one parallel block that processes the files and then another non-parallel block that generates the report:

var processFileBlock = new TransformBlock<File, Result>(
    file => ProcessFile(file),
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
    });

var generateReportBlock = new ActionBlock<Result>(
    result => GenerateReport(result));

processFileBlock.LinkTo(generateReportBlock);

foreach (var file in files)
    processFileBlock.Post(file);

If you also want to wait until all processing is done, you would need to add some code using Complete() and Completion.
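A sketch of what that extra code might look like, under some assumptions: the blocks are linked with PropagateCompletion so that completing the first block eventually completes the second, and the caller blocks on the report block's Completion task. The DataflowCompletionSketch class and the string types standing in for File and Result are hypothetical. On .NET Framework the System.Threading.Tasks.Dataflow NuGet package is required.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

public static class DataflowCompletionSketch
{
    public static List<string> Run(IEnumerable<string> files)
    {
        var reports = new List<string>();

        // Parallel block; the lambda is a hypothetical stand-in for ProcessFile.
        var processFileBlock = new TransformBlock<string, string>(
            file => file.ToUpperInvariant(),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
            });

        // ActionBlock handles one item at a time by default, so the
        // hypothetical report step never runs concurrently with itself.
        var generateReportBlock = new ActionBlock<string>(
            result => reports.Add("report:" + result));

        // PropagateCompletion forwards completion from the first block to the second.
        processFileBlock.LinkTo(
            generateReportBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var file in files)
            processFileBlock.Post(file);

        // Signal that no more files will be posted, then wait for the last report.
        processFileBlock.Complete();
        generateReportBlock.Completion.Wait();

        return reports;
    }
}
```

In asynchronous code, awaiting generateReportBlock.Completion would avoid blocking the calling thread.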

Source: https://stackoverflow.com/questions/12323940/howto-parallel-foreach-executes-many-processes-after-each-process-run-a-new-pr

Tags

task-parallel-library

.net-4.5

parallel.foreach