Can MaxDegreeOfParallelism take the first n objects from my list each time?

执笔经年 2021-01-14 03:31

I am opening n concurrent threads in my function:

List<string> _files = new List<string>();

public void Start()
{
    CancellationTokenSource cts = new CancellationTokenSource();

    Parallel.ForEach(_files,
        new ParallelOptions { MaxDegreeOfParallelism = 5, CancellationToken = cts.Token },
        file =>
        {
            // process the file...
        });
}
3 Answers
  • 2021-01-14 03:34

    Just don't rely on Parallel.ForEach if it's important that work items be started in a particular order; as others have said, you can configure it as needed, but it's not easy.

    The much easier option is to just create 5 tasks that process the items. This approach can't dynamically add or remove workers as needed, but you don't appear to be relying on that anyway.

    Just create a BlockingCollection and 5 tasks that take items from it:

    var queue = new BlockingCollection<string>();
    int workers = 5;
    CancellationTokenSource cts = new CancellationTokenSource();
    var tasks = new List<Task>();

    // each worker pulls the next available item from the shared queue,
    // so items are started in the order they were added
    for (int i = 0; i < workers; i++)
    {
        tasks.Add(Task.Run(() =>
        {
            foreach (var item in queue.GetConsumingEnumerable())
            {
                cts.Token.ThrowIfCancellationRequested();

                DoWork(item);
            }
        }, cts.Token));
    }

    // throw this into a new task if adding the items will take too long
    foreach (var item in _files)
        queue.Add(item);
    queue.CompleteAdding();

    Task.WhenAll(tasks).ContinueWith(t =>
    {
        // do completion stuff
    });
    
  • 2021-01-14 03:35

    Of course the files are not picked in a fixed order; that's the whole point of Parallel.ForEach. When you go parallel, the 5 threads you specified consume the input however the data partitioner decides.

    If you really do want to control the order, look at the OrderablePartitioner you can pass to Parallel.ForEach: http://msdn.microsoft.com/en-us/library/dd989583.aspx. It will cost some performance, but it lets you specify how the partitions handed to the threads are created; a minimal sketch follows below.
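
    For example (a sketch of my own, not part of the original answer; DoWork stands in for whatever you do per file): Partitioner.Create over the question's _files list with loadBalance: true returns an OrderablePartitioner<string> that hands work to the threads dynamically, in small chunks taken from the front of the list, instead of pre-assigning each thread a fixed range.

    // requires System.Collections.Concurrent and System.Threading.Tasks
    var partitioner = Partitioner.Create(_files, loadBalance: true);

    Parallel.ForEach(partitioner,
        new ParallelOptions { MaxDegreeOfParallelism = 5 },
        file => DoWork(file));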

  • 2021-01-14 03:56

    To get the behavior you want, you need to write a custom partitioner. The reason it looks "random" right now is that the partitioner batches the file list out in blocks, so if your source list was

    List<string> files = new List<string> { "a", "b", "c", "d", "e", "f", "g", "h", "i" };
    

    when it is partitioned it may be split evenly like so (if MaxDegreeOfParallelism was 3):

    • Thread1's work list: "a", "b", "c"
    • Thread2's work list: "d", "e", "f"
    • Thread3's work list: "g", "h", "i"

    So if you watched the files being processed it may look like

    "a", "d", "g", "e", "b", "h", "c", "f", "i"
    

    If you make a custom partitioner you can have it take one item at a time instead of a batch at a time to make the work list look like

    • Thread1's work list: "a", GetTheNextUnprocessedString()
    • Thread2's work list: "b", GetTheNextUnprocessedString()
    • Thread3's work list: "c", GetTheNextUnprocessedString()

    If you are using .NET 4.5, you can use the built-in Partitioner.Create factory with NoBuffering like so:

    // 'token' below is the CancellationToken from your CancellationTokenSource (e.g. cts.Token)
    Parallel.ForEach(Partitioner.Create(_files, EnumerablePartitionerOptions.NoBuffering),
                    new ParallelOptions
                    {
                        MaxDegreeOfParallelism = 5 // limit number of parallel threads
                    },
                    (file, loopstate, index) =>
                    {
                        if (token.IsCancellationRequested)
                            return;
                        // do work...
                    });
    

    If you are not using .NET 4.5, this is not a trivial task, so I am not going to write a full implementation here. Read the MSDN article on custom partitioners (the one linked in the other answer) and you will be able to figure it out; a rough sketch of the idea follows below.
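
    That said, here is a rough sketch of the idea (not part of the original answer; SingleItemPartitioner and DoWork are illustrative names): a Partitioner<T> whose partitions all pull from one shared enumerator, one item at a time, so each thread always takes the next unprocessed item instead of a pre-assigned block.

    // assumes: using System.Collections; using System.Collections.Concurrent;
    //          using System.Collections.Generic;
    class SingleItemPartitioner<T> : Partitioner<T>
    {
        private readonly IEnumerable<T> _source;

        public SingleItemPartitioner(IEnumerable<T> source)
        {
            _source = source;
        }

        // Parallel.ForEach requires dynamic partitioning support
        public override bool SupportsDynamicPartitions
        {
            get { return true; }
        }

        public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
        {
            // every partition enumerates the same shared, dynamic partition set
            IEnumerable<T> dynamicPartitions = GetDynamicPartitions();
            var partitions = new List<IEnumerator<T>>();
            for (int i = 0; i < partitionCount; i++)
                partitions.Add(dynamicPartitions.GetEnumerator());
            return partitions;
        }

        public override IEnumerable<T> GetDynamicPartitions()
        {
            return new DynamicPartitions(_source.GetEnumerator());
        }

        private class DynamicPartitions : IEnumerable<T>
        {
            private readonly IEnumerator<T> _shared;

            public DynamicPartitions(IEnumerator<T> shared)
            {
                _shared = shared;
            }

            public IEnumerator<T> GetEnumerator()
            {
                while (true)
                {
                    T item;
                    bool hasItem;
                    // all worker threads pull from one shared enumerator,
                    // so each one always takes the next unprocessed item
                    lock (_shared)
                    {
                        hasItem = _shared.MoveNext();
                        item = hasItem ? _shared.Current : default(T);
                    }
                    if (!hasItem)
                        yield break;
                    yield return item;
                }
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }
    }

    You could then use it like the built-in factory, e.g. Parallel.ForEach(new SingleItemPartitioner<string>(_files), new ParallelOptions { MaxDegreeOfParallelism = 5 }, file => DoWork(file));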

    What I would do is ask yourself: "Do I really need the files to be processed in order?" If you don't, let the library do its own ordering; the only thing you are likely to achieve by enforcing an order is slowing the process down.
