BlockingCollection with Parallel.For hangs?

Posted by 房东的猫 on 2021-02-08 19:43:20

Question


I'm playing around with BlockingCollection to understand it better, but I can't work out why my code hangs once it has finished processing all my items when I use Parallel.For.

I'm just adding numbers to it (the producer?):

var blockingCollection = new BlockingCollection<long>();
long count = 0; // declaration not shown in the original snippet; assumed to start at 0

Task.Factory.StartNew(() =>
{
    while (count <= 10000)
    {
        blockingCollection.Add(count);
        count++;
    }
});

Then I'm trying to process them (the consumer?):

Parallel.For(0, 5, x => 
{
    foreach (long value in blockingCollection.GetConsumingEnumerable())
    {
        total[x] += 1;
        Console.WriteLine("Worker {0}: {1}", x, value);
    }
});

But once it has processed all the numbers, it just hangs. What am I doing wrong?

Also, when I set my Parallel.For to 5, does it mean it's processing the data on 5 separate threads?


Answer 1:


As its name implies, operations on BlockingCollection<T> block when they can't do anything, and this includes GetConsumingEnumerable().

The reason is that the collection can't tell whether your producer has finished or is just busy producing the next item, so the enumeration keeps waiting for more.

What you need to do is to notify the collection that you're done adding items to it by calling CompleteAdding(). For example:

while (count <= 10000)
{
    blockingCollection.Add(count);
    count++;
}

blockingCollection.CompleteAdding();
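To put the pieces together, here is a minimal end-to-end sketch of the question's producer and consumers with the CompleteAdding() call in place. The class name Demo, the Run helper and its parameters are made up for illustration, and the count is assumed to start at 0 (so 0 through 10000 is 10001 items). The try/finally is an extra precaution so that consumers are released even if the producer throws.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class Demo
{
    // Hypothetical helper: produces the numbers 0..itemCount-1 and consumes them
    // from `workers` Parallel.For iterations. Returns the per-worker item counts.
    public static long[] Run(int itemCount, int workers)
    {
        var total = new long[workers];

        using (var blockingCollection = new BlockingCollection<long>())
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    for (long count = 0; count < itemCount; count++)
                    {
                        blockingCollection.Add(count);
                    }
                }
                finally
                {
                    // Tells consumers that no more items will arrive.
                    // Without this call the foreach loops below never end.
                    blockingCollection.CompleteAdding();
                }
            });

            Parallel.For(0, workers, x =>
            {
                // Ends once the collection is marked complete and is empty.
                foreach (long value in blockingCollection.GetConsumingEnumerable())
                {
                    total[x] += 1; // each x is handled by exactly one iteration
                }
            });

            producer.Wait();
        }

        return total;
    }

    public static void Main()
    {
        long[] total = Run(10001, 5);
        Console.WriteLine("Processed {0} items", total.Sum()); // prints "Processed 10001 items"
    }
}
```

How the items are split between the five workers varies from run to run; only the sum is fixed.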



Answer 2:


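This is how GetConsumingEnumerable is designed to behave.

Enumerating the collection this way blocks the consumer thread whenever no items are available, until either another item is added or the collection is marked as complete for adding.

The documentation for BlockingCollection<T>.GetConsumingEnumerable describes this in more detail.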
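The blocking behaviour can be observed without actually hanging by using TryTake with a timeout. The sketch below is only an illustration: the class BlockingDemo and the method Describe are invented names, and the 100 ms timeout is arbitrary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class BlockingDemo
{
    // Hypothetical helper: reports the collection state before and after CompleteAdding().
    public static string Describe()
    {
        using (var queue = new BlockingCollection<long>())
        {
            queue.Add(1);
            queue.Add(2);

            long item;
            queue.TryTake(out item, 100); // takes 1
            queue.TryTake(out item, 100); // takes 2

            // The collection is now empty but not complete. GetConsumingEnumerable()
            // would wait here indefinitely; TryTake gives up after the timeout.
            bool gotItem = queue.TryTake(out item, 100);
            bool completedBefore = queue.IsCompleted;

            queue.CompleteAdding();

            // Marked complete and empty: the enumeration now ends immediately.
            int remaining = queue.GetConsumingEnumerable().Count();

            return string.Format("{0} {1} {2} {3}",
                gotItem, completedBefore, queue.IsCompleted, remaining);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // prints "False False True 0"
    }
}
```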

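Also, using Parallel.For(0, 5) doesn't guarantee that the data will be processed on 5 separate threads. The number of threads actually used is decided by the task scheduler and the thread pool, which take factors such as Environment.ProcessorCount into account.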




Answer 3:


Also, when I set my Parallel.For to 5, does it mean it's processing the data on 5 separate threads?

No. Quoting a previous Stack Overflow answer (How many threads Parallel.For(Foreach) will create? Default MaxDegreeOfParallelism?):

The default scheduler for Task Parallel Library and PLINQ uses the .NET Framework ThreadPool to queue and execute work. In the .NET Framework 4, the ThreadPool uses the information that is provided by the System.Threading.Tasks.Task type to efficiently support the fine-grained parallelism (short-lived units of work) that parallel tasks and queries often represent.

Put simply, TPL creates tasks, not threads. The framework decides how many threads should run them.
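One way to see this for yourself is to record which managed thread runs each iteration. The following is a rough sketch rather than a benchmark: ThreadCountDemo and CountDistinctThreads are hypothetical names, and the Thread.Sleep call merely stands in for real work. ParallelOptions.MaxDegreeOfParallelism sets an upper bound on how many iterations run concurrently; it does not force that many threads to be used.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadCountDemo
{
    // Hypothetical helper: runs `iterations` loop bodies and returns the number
    // of distinct managed threads that executed them.
    public static int CountDistinctThreads(int iterations, int maxDegreeOfParallelism)
    {
        var threadIds = new ConcurrentDictionary<int, bool>();
        var options = new ParallelOptions
        {
            // Upper bound on concurrently running iterations, not a thread count.
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        Parallel.For(0, iterations, options, x =>
        {
            threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
            Thread.Sleep(10); // stand-in for real work
        });

        return threadIds.Count;
    }

    public static void Main()
    {
        int used = CountDistinctThreads(5, 2);
        Console.WriteLine("5 iterations ran on {0} thread(s)", used);
    }
}
```

The reported number can differ between runs and machines, which is the point: the loop bounds describe the work to be done, not the threads that do it.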



Source: https://stackoverflow.com/questions/35617114/blockingcollection-with-parallel-for-hangs
