Parallel Linq - Use more threads than processors (for non-CPU bound tasks)

北荒 2021-01-04 13:36

I'm using Parallel LINQ, and I'm trying to download many URLs concurrently using essentially code like this:

int threads = 10;
Dictionary

        
4 Answers
  •  执笔经年
    2021-01-04 13:39

    Do the URLs refer to the same server? If so, it could be that you are hitting the HTTP connection limit instead of the threading limit. There's an easy way to tell - change your code to:

    int threads = 10;
    Dictionary<string, string> results = urls.AsParallel(threads)
        .ToDictionary(url => url, 
                      url => {
                          Console.WriteLine("On thread {0}",
                                            Thread.CurrentThread.ManagedThreadId);
                          return GetPage(url);
                      });
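
If the connection limit does turn out to be the bottleneck, one thing to try is raising ServicePointManager.DefaultConnectionLimit before the first request. This assumes GetPage is built on HttpWebRequest or WebClient on .NET Framework, which the question doesn't actually say. A minimal sketch; the class and method names are made up for illustration:

```csharp
using System;
using System.Net;

public static class ConnectionLimitSketch
{
    // Hypothetical helper: sets the default cap on concurrent connections
    // per host and returns the value that is now in effect.
    public static int RaiseConnectionLimit(int connections)
    {
        ServicePointManager.DefaultConnectionLimit = connections;
        return ServicePointManager.DefaultConnectionLimit;
    }
}
```

Calling ConnectionLimitSketch.RaiseConnectionLimit(10) once at startup should be enough for that case. Code based on HttpClient on newer runtimes is configured separately, so this may have no effect there.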
    

    EDIT: Hmm. I can't get ToDictionary() to parallelise at all with a bit of sample code. It works fine for Select(url => GetPage(url)) but not ToDictionary. Will search around a bit.

    EDIT: Okay, I still can't get ToDictionary to parallelise, but you can work around that. Here's a short but complete program:

    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Linq;
    using System.Linq.Parallel;
    
    public class Test
    {
    
        static void Main()
        {
            var urls = Enumerable.Range(0, 100).Select(i => i.ToString());
    
            int threads = 10;
            Dictionary<string, string> results = urls.AsParallel(threads)
                .Select(url => new { Url=url, Page=GetPage(url) })
                .ToDictionary(x => x.Url, x => x.Page);
        }
    
        static string GetPage(string x)
        {
            Console.WriteLine("On thread {0} getting {1}",
                              Thread.CurrentThread.ManagedThreadId, x);
            Thread.Sleep(2000);
            return x;
        }
    }
    

    So, how many threads does this use? 5. Why? Goodness knows. I've got 2 processors, so that's not it - and we've specified 10 threads, so that's not it. It still uses 5 even if I change GetPage to hammer the CPU.
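
For anyone on a released version of the framework: as far as I know the AsParallel(threads) overload used above is not part of the shipped System.Linq API, and the degree of parallelism is requested with WithDegreeOfParallelism instead. Here is a rough sketch of the same query in that style. GetPage is still a sleeping stand-in, and the class and method names are mine. Note that the value is an upper bound on concurrent tasks, and the thread pool still decides how quickly worker threads actually appear, so it may not show ten threads straight away.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class PlinqSketch
{
    // Stand-in for the real download: blocks briefly, then echoes its input.
    static string GetPage(string url)
    {
        Thread.Sleep(200);
        return url;
    }

    public static Dictionary<string, string> Fetch(IEnumerable<string> urls, int threads)
    {
        return urls.AsParallel()
            // Ask PLINQ to parallelise even if it would judge the query too cheap.
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            // Upper bound on concurrently executing tasks, not a guarantee.
            .WithDegreeOfParallelism(threads)
            // Do the slow work in Select, then build the dictionary from the pairs.
            .Select(url => new { Url = url, Page = GetPage(url) })
            .ToDictionary(x => x.Url, x => x.Page);
    }
}
```

Usage would be something like PlinqSketch.Fetch(urls, 10), with urls being any sequence of strings.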

    If you only need to use this for one particular task - and you don't mind slightly smelly code - you might be best off implementing it yourself, to be honest.
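
To give an idea of what "implementing it yourself" could look like, here is one possible sketch: start a fixed number of dedicated threads and let each one pull URLs from a shared queue until it is empty. It assumes the concurrent collections from .NET 4 are available, GetPage is again a sleeping stand-in, and the class and method names are invented for the example. There is no error handling, so an exception thrown by GetPage on a worker thread would take the process down.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ManualFetchSketch
{
    // Stand-in for the real download: blocks briefly, then echoes its input.
    static string GetPage(string url)
    {
        Thread.Sleep(200);
        return url;
    }

    public static Dictionary<string, string> Fetch(IEnumerable<string> urls, int threads)
    {
        var pending = new ConcurrentQueue<string>(urls);
        var results = new ConcurrentDictionary<string, string>();

        // Create exactly `threads` dedicated workers; each drains the shared queue.
        var workers = Enumerable.Range(0, threads)
            .Select(_ => new Thread(() =>
            {
                string url;
                while (pending.TryDequeue(out url))
                {
                    results[url] = GetPage(url);
                }
            }))
            .ToList();

        workers.ForEach(w => w.Start());
        workers.ForEach(w => w.Join());   // wait for every worker to finish

        // Copy into a plain Dictionary to match the original result type.
        return new Dictionary<string, string>(results);
    }
}
```

Because the threads are created explicitly rather than borrowed from the thread pool, the number of workers is whatever you pass in, regardless of how many processors the machine has.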
