问题
There is a C# function A(arg1, arg2)
which needs to be called lots of times. To do this fastest, I am using parallel programming.
Take the example of the following code:
long totalCalls = 2000000;
int threads = Environment.ProcessorCount;
ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = threads;
Parallel.ForEach(Enumerable.Range(1, threads), options, range =>
{
for (int i = 0; i < total / threads; i++)
{
// init arg1 and arg2
var value = A(arg1, agr2);
// do something with value
}
});
Now the issue is that this is not scaling up with an increase in number of cores; e.g. on 8 cores it is using 80% of CPU and on 16 cores it is using 40-50% of CPU. I want to use the CPU to maximum extent.
You may assume A(arg1, arg2)
internally contains a complex calculation, but it doesn't have any IO or network-bound operations, and also there is no thread locking. What are other possibilities to find out which part of the code is making it not perform in a 100% parallel manner?
I also tried increasing the degree of parallelism, e.g.
int threads = Environment.ProcessorCount * 2;
// AND
int threads = Environment.ProcessorCount * 4;
// etc.
But it was of no help.
Update 1 - if I run the same code by replacing A()
with a simple function which is calculating prime number then it is utilizing 100 CPU and scaling up well. So this proves that other piece of code is correct. Now issue could be within the original function A()
. I need a way to detect that issue which is causing some sort of sequencing.
回答1:
You have determined that the code in A
is the problem.
There is one very common problem: Garbage collection. Configure your application in app.config
to use the concurrent server GC. The Workstation GC tends to serialize execution. The effect is severe.
If this is not the problem pause the debugger a few times and look at the Debug -> Parallel Stacks
window. There, you can see what your threads are doing. Look for common resources and contention. For example if you find many thread waiting for a lock that's your problem.
Another nice debugging technique is commenting out code. Once the scalability limit disappears you know what code caused it.
来源:https://stackoverflow.com/questions/38243432/how-can-i-achieve-maximum-parallelism-and-utilize-maximum-cpu-with-parallel-fore