Parallelization of CPU bound task continuing with IO bound


滥情空心 2021-01-20 02:20

I'm trying to figure out a good way to parallelize code that processes big datasets and then imports the resulting data into RavenDb.

The data pr

3 answers

一生所求 (original poster)

2021-01-20 03:00
You are starting a new task for each batch, so your loop completes almost immediately and leaves behind one task per batch. That is not what you wanted: you wanted one task per CPU.

Solution: Don't start a new task for each batch. The for loop is already parallel.

In response to your comment, here is an improved version:
```
// The CPU-bound step runs in parallel, on at most one worker per processor.
var processedBatches = datasupplier.GetDataItems()
    .Partition(batchSize)
    .AsParallel()
    .WithDegreeOfParallelism(Environment.ProcessorCount)
    .Select(x => ProcessCpuBound(x));

// Enumerating the query runs the loop body sequentially on the calling thread.
foreach (var batch in processedBatches)
{
    PerformIOIntensiveWorkSingleThreadedly(batch);
}
```
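The snippet above relies on a `Partition` extension method that is not part of LINQ and is not shown in the thread. As a rough, self-contained sketch of the same pattern, here is one way it could look. The `Partition` implementation, the `PipelineSketch` class, and the sum-of-squares stand-in for the CPU-bound step are all assumptions made for illustration, not the original poster's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: the thread does not show Partition, so this is one
// plausible implementation that lazily splits a sequence into fixed-size batches.
public static class BatchingExtensions
{
    public static IEnumerable<List<T>> Partition<T>(this IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0) yield return batch; // trailing partial batch
    }
}

public static class PipelineSketch
{
    // Stand-in for the CPU-bound step (assumption: sum of squares of a batch).
    private static long ProcessCpuBound(List<int> batch) => batch.Sum(x => (long)x * x);

    // Processes batches in parallel, then consumes the results one at a time
    // on the calling thread.
    public static List<long> Run(IEnumerable<int> items, int batchSize)
    {
        var processedBatches = items
            .Partition(batchSize)
            .AsParallel()
            .WithDegreeOfParallelism(Environment.ProcessorCount)
            .Select(batch => ProcessCpuBound(batch));

        var written = new List<long>();
        foreach (var result in processedBatches)
        {
            // Stand-in for the sequential I/O step (for example, a database write).
            written.Add(result);
        }
        return written;
    }
}
```

Calling `PipelineSketch.Run(Enumerable.Range(1, 10), 3)` yields four results, one per batch. PLINQ does not preserve source order by default, so if the I/O step depends on batch order, adding `.AsOrdered()` after `.AsParallel()` is worth considering. On .NET 6 and later, the built-in `Enumerable.Chunk` could likely replace the hand-written `Partition`.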