Right way to do a Parallel.For to compute data from Array

渐次进展 2021-01-22 05:47

I want to compute two sums: the sum of x and the sum of x*x, where x = line[i]. Because more than one thread wants to read and write "sumAll" and "sumAllQ", I need to lock access to them. The problem is that

2 Answers
  •  花落未央
    2021-01-22 06:09

    vcjones wondered whether you would really see any speedup. The answer is that it probably depends on how many cores you have. On my home PC (which is quad core), the PLINQ version is slower than a plain loop.

    I've come up with an alternative approach that uses a Partitioner to split the array of numbers into several index ranges, so that each range can be summed separately and the lock is taken only once per range. There's also some more information about using a Partitioner here.

    Using the Partitioner approach seems a bit faster, at least on my home PC.

    Here's my test program. Note that you must run a release build of this outside any debugger to get the right timings.

    The important method in this code is ViaPartition():

    Result ViaPartition(double[] numbers)
    {
        var result = new Result();
    
        // Split the index range [0, numbers.Length) into chunks.
        var rangePartitioner = Partitioner.Create(0, numbers.Length);
    
        Parallel.ForEach(rangePartitioner, (range, loopState) =>
        {
            // Each chunk accumulates into its own subtotal, so no lock is needed here.
            var subtotal = new Result();
    
            for (int i = range.Item1; i < range.Item2; i++)
            {
                double n = numbers[i];
                subtotal.SumAll  += n;
                subtotal.SumAllQ += n*n;
            }
    
            // Merge the subtotal into the shared result once per chunk.
            lock (result)
            {
                result.SumAll  += subtotal.SumAll;
                result.SumAllQ += subtotal.SumAllQ;
            }
        });
    
        return result;
    }
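
    Since the question asks about Parallel.For specifically, the same "local subtotal, then merge under a lock" idea can also be sketched with the Parallel.For overload that takes localInit and localFinally delegates. This is a minimal sketch rather than part of my test program: the class name ThreadLocalSums, the method name Compute and the gate object are made up for illustration, and I have not timed it against ViaPartition().

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadLocalSums
{
    // Hypothetical helper (not from the test program):
    // returns (sum of x, sum of x*x) for x = line[i].
    public static (double SumAll, double SumAllQ) Compute(double[] line)
    {
        double sumAll = 0.0, sumAllQ = 0.0;
        object gate = new object(); // dedicated lock object (an assumption of this sketch)

        Parallel.For<(double Sum, double SumQ)>(
            0, line.Length,
            // localInit: each worker task starts with its own zeroed subtotal.
            () => (0.0, 0.0),
            // body: runs once per index and only touches the task-local subtotal.
            (i, loopState, local) =>
            {
                double x = line[i];
                return (local.Sum + x, local.SumQ + x * x);
            },
            // localFinally: runs once per worker task, so the lock is taken rarely.
            local =>
            {
                lock (gate)
                {
                    sumAll  += local.Sum;
                    sumAllQ += local.SumQ;
                }
            });

        return (sumAll, sumAllQ);
    }
}
```

    The design choice is the same as in ViaPartition(): the hot loop never touches shared state, and the shared totals are only updated in the merge step.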
    

    My results when I run the full test program (shown below these results) are as follows. Each timing is the total for 100 calls on an array of 1,000,000 numbers:

    Result via Linq:      SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via loop:      SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via partition: SumAll=49999950000, SumAllQ=3.333328333335E+15
    Via Linq took: 00:00:01.1994524
    Via Loop took: 00:00:00.2357107
    Via Partition took: 00:00:00.0756707
    

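    (Note the slight difference in SumAllQ for the partitioned version. It is due to rounding: floating-point addition is not associative, so adding the same numbers in a different order can change the last few digits.)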
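
    To illustrate the rounding effect on its own, here is a small sketch that adds the same three numbers in two different orders. The class and method names (RoundingDemo, SumForward, SumBackward) are hypothetical and not part of the test program.

```csharp
using System;

public static class RoundingDemo
{
    // Adds the elements from first to last.
    public static double SumForward(double[] xs)
    {
        double sum = 0.0;
        for (int i = 0; i < xs.Length; i++)
            sum += xs[i];
        return sum;
    }

    // Adds the same elements from last to first.
    public static double SumBackward(double[] xs)
    {
        double sum = 0.0;
        for (int i = xs.Length - 1; i >= 0; i--)
            sum += xs[i];
        return sum;
    }
}
```

    Calling both methods on new double[] { 0.1, 0.2, 0.3 } gives two results that differ in the last digit. The partitioned sum differs from the sequential sums for the same reason: each range is added up separately and the subtotals are then combined.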

    It'd be interesting to see the results from other systems.

    Here's the full test program:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;
    
    namespace Demo
    {
        public class Result
        {
            public double SumAll;
            public double SumAllQ;
    
            public override string ToString()
            {
                return string.Format("SumAll={0}, SumAllQ={1}", SumAll, SumAllQ);
            }
        }
    
        class Program
        {
            void run()
            {
                var numbers = Enumerable.Range(0, 1000000).Select(n => n/10.0).ToArray();
    
                // Prove that the calculation is correct.
                Console.WriteLine("Result via Linq:      " + ViaLinq(numbers));
                Console.WriteLine("Result via loop:      " + ViaLoop(numbers));
                Console.WriteLine("Result via partition: " + ViaPartition(numbers));
    
                int count = 100;
    
                TimeViaLinq(numbers, count);
                TimeViaLoop(numbers, count);
                TimeViaPartition(numbers, count);
            }
    
            void TimeViaLinq(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaLinq(numbers);
    
                Console.WriteLine("Via Linq took: " + sw.Elapsed);
            }
    
            void TimeViaLoop(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaLoop(numbers);
    
                Console.WriteLine("Via Loop took: " + sw.Elapsed);
            }
    
            void TimeViaPartition(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaPartition(numbers);
    
                Console.WriteLine("Via Partition took: " + sw.Elapsed);
            }
    
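            Result ViaLinq(double[] numbers)
            {
                // Note: this creates a new Result object for every element,
                // which adds allocation overhead to the PLINQ version.
                return numbers.AsParallel().Aggregate(new Result(), (input, value) => new Result
                {
                    SumAll  = input.SumAll+value,
                    SumAllQ = input.SumAllQ+value*value
                });
            }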
    
            Result ViaLoop(double[] numbers)
            {
                var result = new Result();
    
                for (int i = 0; i < numbers.Length; ++i)
                {
                    double n = numbers[i];
                    result.SumAll  += n;
                    result.SumAllQ += n*n;
                }
    
                return result;
            }
    
            Result ViaPartition(double[] numbers)
            {
                var result = new Result();
    
                var rangePartitioner = Partitioner.Create(0, numbers.Length);
    
                Parallel.ForEach(rangePartitioner, (range, loopState) =>
                {
                    var subtotal = new Result();
    
                    for (int i = range.Item1; i < range.Item2; i++)
                    {
                        double n = numbers[i];
                        subtotal.SumAll  += n;
                        subtotal.SumAllQ += n*n;
                    }
    
                    lock (result)
                    {
                        result.SumAll  += subtotal.SumAll;
                        result.SumAllQ += subtotal.SumAllQ;
                    }
                });
    
                return result;
            }
    
            static void Main()
            {
                new Program().run();
            }
        }
    }
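
    One thing that may be holding ViaLinq() back is that it allocates a new Result for every element. A possible variation, sketched below, keeps the accumulator in a value tuple and uses the PLINQ Aggregate overload that takes a separate function for combining per-partition accumulators. The names PlinqSums and ViaPlinqCombine are hypothetical, and I have not timed this, so any speed benefit is an assumption.

```csharp
using System;
using System.Linq;

public static class PlinqSums
{
    // Hypothetical variation on ViaLinq(): returns (sum of x, sum of x*x).
    public static (double SumAll, double SumAllQ) ViaPlinqCombine(double[] numbers)
    {
        return numbers.AsParallel().Aggregate(
            // Seed: a value tuple, so no heap allocation per element.
            (SumAll: 0.0, SumAllQ: 0.0),
            // Update the accumulator of one partition with one element.
            (acc, x) => (acc.SumAll + x, acc.SumAllQ + x * x),
            // Combine the accumulators of two partitions.
            (a, b) => (a.SumAll + b.SumAll, a.SumAllQ + b.SumAllQ),
            // Final projection: return the combined accumulator unchanged.
            acc => acc);
    }
}
```

    As with the partitioned version, the order of the additions is not fixed, so the last digits of the result can differ slightly from the sequential loop.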
    
