Disappointing performance with Parallel.For

前端未结

关注

 4  818

I am trying to speed up my calculation times by using Parallel.For. I have an Intel Core i7 Q840 CPU with 8 cores, but I only manage to get a performance ratio of 4

相关标签:

4条回答

终归单人心

2021-02-05 08:34
Using a global variable can introduce significant synchronization problems, even when you are not using locks. When you assign a value to the variable each core will have to get access to the same place in system memory, or wait for the other core to finish before accessing it. You can avoid corruption without locks by using the lighter Interlocked.Add method to add a value to the sum atomically, at the OS level, but you will still get delays due to contention.

The proper way to do this is to update a thread local variable to create the partial sums and add all of them to a single global sum at the end. Parallel.For has an overload that does just this. MSDN even has an example using sumation at How To: Write a Parallel.For Loop that has Thread Local Variables
```
        int[] nums = Enumerable.Range(0, 1000000).ToArray();
        long total = 0;

        // Use type parameter to make subtotal a long, not an int
        Parallel.For<long>(0, nums.Length, () => 0, (j, loop, subtotal) =>
        {
            subtotal += nums[j];
            return subtotal;
        },
            (x) => Interlocked.Add(ref total, x)
        );
```
Each thread updates its own subtotal value and updates the global total using Interlocked.Add when it finishes.
0 讨论(0)
发布评论:

提交评论
- 加载中...
心在旅途

2021-02-05 08:45

Parallel.For and Parallel.ForEach will use a degree of parallelism that it feels is appropriate, balancing the cost to setup and tear down threads and the work it expects each thread will perform. .NET 4.5 made several improvements to performance (including more intelligent decisions on the number of threads to spin up) compared to previous .NET versions.

Note that, even if it were to spin up one thread per core, context switches, false sharing issues, resource locks, and other issues may prevent you from achieving linear scalability (in general, not necessarily with your specific code example).

0 讨论(0)
发布评论:

提交评论
- 加载中...

陌清茗

2021-02-05 08:47

foreach vs parallel for each an example

    for (int i = 0; i < 10; i++)
    {
        int[] array = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 };
        Stopwatch watch = new Stopwatch();
        watch.Start();
        //Parallel foreach
        Parallel.ForEach(array, line =>
        {
            for (int x = 0; x < 1000000; x++)
            {

            }

        });

        watch.Stop();
        Console.WriteLine("Parallel.ForEach {0}", watch.Elapsed.Milliseconds);
        watch = new Stopwatch();
        //foreach
        watch.Start();
        foreach (int item in array)
        {
            for (int z = 0; z < 10000000; z++)
            {

            }
        }
        watch.Stop();
        Console.WriteLine("ForEach {0}", watch.Elapsed.Milliseconds);

        Console.WriteLine("####");
    }
    Console.ReadKey();

My CPU

Intel® Core™ i7-620M Processor (4M Cache, 2.66 GHz)

0 讨论(0)

太阳男子

2021-02-05 08:48
I think the computation gain is so low because your code is "too easy" to work on other task each iteration - because parallel.for just create new task in each iteration, so this will take time to service them in threads. I will it like this:
```
int[] nums = Enumerable.Range(0, 1000000).ToArray();
long total = 0;

Parallel.ForEach(
    Partitioner.Create(0, nums.Length),
    () => 0,
    (part, loopState, partSum) =>
    {
        for (int i = part.Item1; i < part.Item2; i++)
        {
            partSum += nums[i];
        }
        return partSum;
    },
    (partSum) =>
    {
        Interlocked.Add(ref total, partSum);
    }
);
```
Partitioner will create optimal part of job for each task, there will be less time for service task with threads. If you can, please benchmark this solution and tell us if it get better speed up.
0 讨论(0)
发布评论:

提交评论
- 加载中...