Right way to do a Parallel.For to compute data from Array

前端 未结 2 462
渐次进展
渐次进展 2021-01-22 05:47

want to: sum x and sum x*x. Where x = line[i]. Because more than one thread wants to read/write to the \"sumAll\" and \"sumAllQ\" I need to lock its access. The problem is that

相关标签:
2条回答
  • 2021-01-22 06:05

    As suggested in the comments, you can use Aggregate to accomplish this with AsParallel in LINQ. For example:

    using System.Linq;
    
    //A class to hold the results.
    //This can be improved by making it immutable and using a constructor.
    public class Result
    {
        public double SumAll { get; set; }
        public double SumAllQ { get; set; }
    }
    

    And you can use LINQ like so:

    var result = line.AsParallel().Aggregate(new Result(), (input, value) => new Result {SumAll = input.SumAll+value, SumAllQ = input.SumAllQ+value*value});
    

    Or even better:

    var pline = line.AsParallel().WithDegreeOfParallelism(Environment.ProcessorCount);
    var result = new Result { SumAll = pline.Sum(), SumAllQ = pline.Sum(x => x * x) };
    

    AsParallel doesn't give you the ability to directly specify options, but you can use .WithDegreeOfParallelism(), .WithExecutionMode(), or .WithMergeOptions() to give you more control. You may have to use WithDegreeOfParallelism to even get it to run with multiple threads.

    0 讨论(0)
  • 2021-01-22 06:09

    vcjones wondered about whether you would really see any speedup. Well the answer is: it probably depends how many cores you have. The PLinq is slower than a plain loop on my home PC (which is quad core).

    I've come up with an alternative approach which uses a Partitioner to chop the list of numbers up into several sections so you can add up each one separately. There's also some more information about using a Partitioner here.

    Using the Partitioner approach seems a bit faster, at least on my home PC.

    Here's my test program. Note that you must run a release build of this outside any debugger to get the right timings.

    The important method in this code is ViaPartition():

    Result ViaPartition(double[] numbers)
    {
        var result = new Result();
    
        var rangePartitioner = Partitioner.Create(0, numbers.Length);
    
        Parallel.ForEach(rangePartitioner, (range, loopState) =>
        {
            var subtotal = new Result();
    
            for (int i = range.Item1; i < range.Item2; i++)
            {
                double n = numbers[i];
                subtotal.SumAll  += n;
                subtotal.SumAllQ += n*n;
            }
    
            lock (result)
            {
                result.SumAll  += subtotal.SumAll;
                result.SumAllQ += subtotal.SumAllQ;
            }
        });
    
        return result;
    }
    

    My results when I run the full test program (shown below these results) are:

    Result via Linq:      SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via loop:      SumAll=49999950000, SumAllQ=3.33332833333439E+15
    Result via partition: SumAll=49999950000, SumAllQ=3.333328333335E+15
    Via Linq took: 00:00:01.1994524
    Via Loop took: 00:00:00.2357107
    Via Partition took: 00:00:00.0756707
    

    (Note the slight differences due to rounding errors.)

    It'd be interesting to see the results from other systems.

    Here's the full test program:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;
    
    namespace Demo
    {
        public class Result
        {
            public double SumAll;
            public double SumAllQ;
    
            public override string ToString()
            {
                return string.Format("SumAll={0}, SumAllQ={1}", SumAll, SumAllQ);
            }
        }
    
        class Program
        {
            void run()
            {
                var numbers = Enumerable.Range(0, 1000000).Select(n => n/10.0).ToArray();
    
                // Prove that the calculation is correct.
                Console.WriteLine("Result via Linq:      " + ViaLinq(numbers));
                Console.WriteLine("Result via loop:      " + ViaLoop(numbers));
                Console.WriteLine("Result via partition: " + ViaPartition(numbers));
    
                int count = 100;
    
                TimeViaLinq(numbers, count);
                TimeViaLoop(numbers, count);
                TimeViaPartition(numbers, count);
            }
    
            void TimeViaLinq(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaLinq(numbers);
    
                Console.WriteLine("Via Linq took: " + sw.Elapsed);
            }
    
            void TimeViaLoop(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaLoop(numbers);
    
                Console.WriteLine("Via Loop took: " + sw.Elapsed);
            }
    
            void TimeViaPartition(double[] numbers, int count)
            {
                var sw = Stopwatch.StartNew();
    
                for (int i = 0; i < count; ++i)
                    ViaPartition(numbers);
    
                Console.WriteLine("Via Partition took: " + sw.Elapsed);
            }
    
            Result ViaLinq(double[] numbers)
            {
                return numbers.AsParallel().Aggregate(new Result(), (input, value) => new Result
                {
                    SumAll  = input.SumAll+value,
                    SumAllQ = input.SumAllQ+value*value
                });
            }
    
            Result ViaLoop(double[] numbers)
            {
                var result = new Result();
    
                for (int i = 0; i < numbers.Length; ++i)
                {
                    double n = numbers[i];
                    result.SumAll  += n;
                    result.SumAllQ += n*n;
                }
    
                return result;
            }
    
            Result ViaPartition(double[] numbers)
            {
                var result = new Result();
    
                var rangePartitioner = Partitioner.Create(0, numbers.Length);
    
                Parallel.ForEach(rangePartitioner, (range, loopState) =>
                {
                    var subtotal = new Result();
    
                    for (int i = range.Item1; i < range.Item2; i++)
                    {
                        double n = numbers[i];
                        subtotal.SumAll  += n;
                        subtotal.SumAllQ += n*n;
                    }
    
                    lock (result)
                    {
                        result.SumAll  += subtotal.SumAll;
                        result.SumAllQ += subtotal.SumAllQ;
                    }
                });
    
                return result;
            }
    
            static void Main()
            {
                new Program().run();
            }
        }
    }
    
    0 讨论(0)
提交回复
热议问题