Multithreading task to process files in C#

你的背包 2021-01-18 12:07

I've been reading a lot about threading but can't figure out how to find a solution to my issue. First let me introduce the problem. I have files which need to be processe

4 answers
  •  暖寄归人
    2021-01-18 12:44

    You can use Stephen Toub's ForEachAsync extension method to process the files. It lets you cap how many items are processed concurrently (the dop argument), and it is non-blocking, so your main thread stays free for other work. Here is the method from the article:

    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
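
    As a rough check of how the dop argument limits concurrency, here is a minimal, self-contained sketch. It repeats the extension method with its generic parameters written out and runs it over ten integers with dop = 3. The class name ForEachAsyncDemo, the counters, and the 50 ms delay are illustrative assumptions, not part of the original code, and the async Main assumes C# 7.1 or later.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    public static class ForEachAsyncDemo
    {
        // Same shape as the extension method above, generic parameters spelled out.
        public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
        {
            return Task.WhenAll(
                from partition in Partitioner.Create(source).GetPartitions(dop)
                select Task.Run(async delegate
                {
                    using (partition)
                        while (partition.MoveNext())
                            await body(partition.Current);
                }));
        }

        public static async Task Main()
        {
            int running = 0, peak = 0, processed = 0;
            var gate = new object();

            // Hypothetical workload: ten items, at most three in flight at once.
            await Enumerable.Range(1, 10).ForEachAsync(3, async item =>
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }
                await Task.Delay(50);   // stands in for real file processing
                lock (gate) { running--; processed++; }
            });

            // peak should never exceed the dop value passed above.
            Console.WriteLine("processed={0}, peak concurrency={1}", processed, peak);
        }
    }
    ```

    Each of the dop partitions is drained by its own task, and each task awaits one item at a time, which is why the number of items in flight cannot exceed dop.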
    

    In order to use it I refactored your code slightly. I changed the dictionary to be of type Dictionary<string, List<string>>: it holds the host as the key and all of that host's file paths as the value. I assumed each file path contains the host name.

       my_dictionary = (from h in hostname
                        from f in file_paths
                        where f.Contains(h)
                        select new { Hostname = h, File = f }).GroupBy(x => x.Hostname)
                        .ToDictionary(x => x.Key, x => x.Select(s => s.File).Distinct().ToList());
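
    To see what that query produces, here is a small stand-alone sketch with made-up sample data. The host names, file paths, and the class name GroupByHostDemo are assumptions for illustration only. Note that Contains is a plain substring match, so a host named host1 would also match a path containing host10; a stricter match may be needed for real host names.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class GroupByHostDemo
    {
        public static void Main()
        {
            // Hypothetical sample data; the duplicate path is there to show Distinct().
            var hostname = new List<string> { "host1", "host2" };
            var file_paths = new List<string>
            {
                @"C:\host1_file1", @"C:\host1_file2", @"C:\host2_file1", @"C:\host1_file1"
            };

            // Pair every host with every path that contains it, then group by host.
            Dictionary<string, List<string>> my_dictionary =
                (from h in hostname
                 from f in file_paths
                 where f.Contains(h)
                 select new { Hostname = h, File = f })
                .GroupBy(x => x.Hostname)
                .ToDictionary(g => g.Key, g => g.Select(s => s.File).Distinct().ToList());

            foreach (var pair in my_dictionary)
                Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }
    }
    ```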
    

    I also changed your process_file method to be async. You were calling Task.Delay inside it, and Task.Delay only returns a task: unless that task is awaited, the method carries on immediately and the delay has no effect.

    public static async Task process_file(string file_path_in)
    {
        var time_delay_random = new Random();
        Console.WriteLine("Started:{0} ThreadId:{1}", file_path_in, Thread.CurrentThread.ManagedThreadId);
        await Task.Delay(time_delay_random.Next(3000) + 1000);
        Console.WriteLine("Completed:{0} ThreadId:{1}", file_path_in, Thread.CurrentThread.ManagedThreadId);
    }
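
    The difference between calling Task.Delay and awaiting it can be seen with a stopwatch. This is only a sketch: the class name DelayDemo and the 500 ms value are arbitrary choices, and the async Main assumes C# 7.1 or later.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Threading.Tasks;

    public static class DelayDemo
    {
        public static async Task Main()
        {
            var sw = Stopwatch.StartNew();

            // Task.Delay returns at once; it only hands back a task that completes later.
            Task pending = Task.Delay(500);
            Console.WriteLine("After calling Delay without await: {0} ms", sw.ElapsedMilliseconds);

            // Awaiting the task is what actually suspends this method until the delay elapses.
            await pending;
            Console.WriteLine("After awaiting the task: {0} ms", sw.ElapsedMilliseconds);
        }
    }
    ```

    The first line should report a time close to zero and the second roughly the requested delay; exact figures depend on the machine and timer resolution.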
    

    To use the code, you get the maximum number of threads you want to use and pass that to my_files.my_dictionary.ForEachAsync. You also supply an asynchronous delegate which processes each of the files for a particular host and sequentially awaits each one to be processed.

    public static async Task MainAsync()
    {
        var my_files = new file_prep_obj();
        my_files.get_files();
    
        const int userSuppliedMaxThread = 5;
        var maxThreads = Math.Min(userSuppliedMaxThread, my_files.my_dictionary.Count);
        Console.WriteLine("MaxThreads = " + maxThreads);
    
        foreach (var pair in my_files.my_dictionary)
        {
            foreach (var path in pair.Value)
            {
                Console.WriteLine("Key= {0}, Value={1}", pair.Key, path);   
            }            
        }
    
        await my_files.my_dictionary.ForEachAsync(maxThreads, async (pair) =>
        {
            foreach (var path in pair.Value)
            {
                // serially process each path for a particular host.
                await process_file(path);
            }
        });
    
    }
    
    static void Main(string[] args)
    {
        MainAsync().Wait();
        Console.ReadKey();
    
    }//Close static void Main(string[] args)
    

    Output

    MaxThreads = 5
    Key= host1, Value=C:\host1_file1
    Key= host1, Value=C:\host1_file2
    Key= host1, Value=C:\host1_file3
    Key= host2, Value=C:\host2_file1
    Key= host2, Value=C:\host2_file2
    Key= host3, Value=C:\host3_file1
    Key= host4, Value=C:\host4_file1
    Key= host4, Value=C:\host4_file2
    Key= host5, Value=C:\host5_file1
    Key= host6, Value=C:\host6_file1
    Started:C:\host1_file1 ThreadId:10
    Started:C:\host2_file1 ThreadId:12
    Started:C:\host3_file1 ThreadId:13
    Started:C:\host4_file1 ThreadId:11
    Started:C:\host5_file1 ThreadId:10
    Completed:C:\host1_file1 ThreadId:13
    Completed:C:\host2_file1 ThreadId:12
    Started:C:\host1_file2 ThreadId:13
    Started:C:\host2_file2 ThreadId:12
    Completed:C:\host2_file2 ThreadId:11
    Completed:C:\host1_file2 ThreadId:13
    Started:C:\host6_file1 ThreadId:11
    Started:C:\host1_file3 ThreadId:13
    Completed:C:\host5_file1 ThreadId:11
    Completed:C:\host4_file1 ThreadId:12
    Completed:C:\host3_file1 ThreadId:13
    Started:C:\host4_file2 ThreadId:12
    Completed:C:\host1_file3 ThreadId:11
    Completed:C:\host6_file1 ThreadId:13
    Completed:C:\host4_file2 ThreadId:12
    
