Question
I'm sure someone knows this, and I'd be very thankful for an answer. I don't know much about delegates, async, and the like, so please give me a general example of how I could implement this.
I have a workflow where I can use Parallel.ForEach to execute a method for many different files at the same time (sweet, grind that processor). However, after that method ends I need to run another method (it generates a report on the previous process), and this second method cannot be run in parallel.
I don't want to wait for all the files in the Parallel.ForEach to finish before generating reports (that's not necessary). But if I start the report generation method as the first method ends then I run into problems. Is there some kind of queue or something? There's gotta be some pretty way of doing it, right?
Thanks
Answer 1:
I think what Jim G means is:
var lockObj = new object();
Parallel.ForEach(files, file =>
{
    // Processing file (runs in parallel)
    lock (lockObj)
    {
        // Generate report (only one thread at a time).
    }
});
Answer 2:
The second method should be chained as a continuation task.
Within the second method, use a lock or mutex to ensure that it does not run in parallel.
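A minimal sketch of that idea, assuming each file is processed in its own task and the report step is attached with ContinueWith. ProcessFile and GenerateReport here are hypothetical stand-ins (the question does not show the real methods), and the string types are placeholders:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Guards the report step so that only one report is generated at a time.
    private static readonly object ReportLock = new object();

    // Hypothetical stand-in for the per-file work (safe to run in parallel).
    private static string ProcessFile(string file) => file.ToUpperInvariant();

    // Hypothetical stand-in for the report step (must not run in parallel).
    private static void GenerateReport(string result, List<string> reports)
    {
        lock (ReportLock)
        {
            reports.Add("report:" + result);
        }
    }

    public static List<string> Run(IEnumerable<string> files)
    {
        var reports = new List<string>();

        // One task per file; each continuation starts as soon as its own
        // file is done, without waiting for the other files.
        Task[] tasks = files
            .Select(file => Task.Run(() => ProcessFile(file))
                .ContinueWith(t => GenerateReport(t.Result, reports)))
            .ToArray();

        // Block until every report has been generated.
        Task.WaitAll(tasks);
        return reports;
    }
}
```

Calling ContinuationSketch.Run(files) returns once every report has been written. The continuations may start concurrently, but the lock serializes the report body, so the order of the reports is not guaranteed.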
Answer 3:
The Arbiter.Interleave coordination primitive in the Concurrency and Coordination Runtime (CCR) provides a simple way of achieving the functionality you want. Basically, you pass it three receiver groups: one for concurrent tasks, one for exclusive tasks (not executed in parallel), and one for shutting down the entire process. You can find an example of how to use it here
Answer 4:
Another option is to use the producer-consumer model. You have a thread-safe blocking collection that you put the finished data in, and a single thread that runs the reports, pulling data from that collection.
// Collection to hold the data that the processed files generated
var processedDataItems = new BlockingCollection<ResultData>();
// A task that builds the reports, one at a time
var processReports = new Task(() =>
{
    // Removes items from the collection. If the collection is empty it blocks,
    // or if CompleteAdding() has been called it reaches the "end" of the
    // collection
    foreach (var processedData in processedDataItems.GetConsumingEnumerable())
    {
        BuildReport(processedData);
    }
});
processReports.Start();
// Generating the data
Parallel.ForEach(files, file =>
{
    var processedData = ProcessFile(file);
    processedDataItems.Add(processedData);
});
// Let anyone consuming the collection know that they can stop waiting for new items.
processedDataItems.CompleteAdding();
Answer 5:
Things like this fit nicely into the model of TPL Dataflow: you create one parallel block that processes the files and then another non-parallel block that generates the report:
var processFileBlock = new TransformBlock<File, Result>(
    file => ProcessFile(file),
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
    });
var generateReportBlock = new ActionBlock<Result>(
    result => GenerateReport(result));
processFileBlock.LinkTo(generateReportBlock);
foreach (var file in files)
    processFileBlock.Post(file);
If you also want to wait until all processing is done, you would need to add some code using Complete() and Completion.
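One way that waiting could look, as a sketch under a few assumptions: the blocks are linked with PropagateCompletion so that completing the first block eventually completes the second, the string types and the ToUpperInvariant call are hypothetical stand-ins for the real File/Result types and ProcessFile, and the System.Threading.Tasks.Dataflow library is available to the project:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowCompletionSketch
{
    public static List<string> Run(IEnumerable<string> files)
    {
        var reports = new List<string>();

        // Parallel stage: hypothetical stand-in for ProcessFile.
        var processFileBlock = new TransformBlock<string, string>(
            file => file.ToUpperInvariant(),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
            });

        // Non-parallel stage: an ActionBlock handles one item at a time
        // by default, so the list is never touched concurrently.
        var generateReportBlock = new ActionBlock<string>(
            result => reports.Add("report:" + result));

        // PropagateCompletion forwards completion from the first block
        // to the second once all of its output has been delivered.
        processFileBlock.LinkTo(
            generateReportBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var file in files)
            processFileBlock.Post(file);

        // Signal that no more files will be posted, then wait for the
        // last report to finish.
        processFileBlock.Complete();
        generateReportBlock.Completion.Wait();
        return reports;
    }
}
```

In asynchronous code, awaiting generateReportBlock.Completion instead of calling Wait() avoids blocking a thread.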
Source: https://stackoverflow.com/questions/12323940/howto-parallel-foreach-executes-many-processes-after-each-process-run-a-new-pr