Multi threaded file processing with .NET

栀梦 2021-01-30 15:21

There is a folder that contains 1000s of small text files. I aim to parse and process all of them while more files are being populated into the folder. My intention is to multit

6 answers

醉酒成梦 (OP)

2021-01-30 15:39
Design

The Producer/Consumer pattern will probably be the most useful for this situation. You should create enough threads to maximize the throughput.

Here are some questions about the Producer/Consumer pattern to give you an idea of how it works:
- C# Producer/Consumer pattern
- C# producer/consumer

You should use a blocking queue: the producer adds files to the queue while the consumers take files from it and process them. A blocking queue handles synchronization internally, so your own code needs no explicit locking, which makes it about the simplest and most efficient way to solve your problem.

If you're using .NET 4.0 there are several concurrent collections that you can use out of the box:
- ConcurrentQueue: http://msdn.microsoft.com/en-us/library/dd267265%28v=VS.100%29.aspx
- BlockingCollection: http://msdn.microsoft.com/en-us/library/dd267312%28VS.100%29.aspx
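
To make the idea concrete, here is a minimal sketch of one producer feeding several consumers through a BlockingCollection. The class name `FolderPipeline`, the `Run` method, the `*.txt` filter and the capacity of 1000 are all assumptions made for illustration; they are not from the question. It only enumerates the folder once, so picking up files that arrive later (for example with a FileSystemWatcher adding to the same queue) is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class FolderPipeline
{
    // Hypothetical helper: queues every *.txt path found in `folder` and
    // processes the paths on `consumerCount` dedicated threads.
    // Returns the number of files that were processed.
    public static int Run(string folder, int consumerCount, Action<string> processFile)
    {
        // Bounded (assumed capacity) so the producer cannot run far ahead of the consumers.
        using (var queue = new BlockingCollection<string>(1000))
        {
            int processed = 0;
            var consumers = new List<Thread>();

            for (int i = 0; i < consumerCount; i++)
            {
                var consumer = new Thread(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding() has been called and the queue is drained.
                    foreach (string path in queue.GetConsumingEnumerable())
                    {
                        // An exception thrown here would end the process;
                        // real code should catch and log it.
                        processFile(path);
                        Interlocked.Increment(ref processed);
                    }
                });
                consumer.Start();
                consumers.Add(consumer);
            }

            // The calling thread acts as the single producer.
            foreach (string path in Directory.EnumerateFiles(folder, "*.txt"))
            {
                queue.Add(path); // blocks while the queue is full
            }
            queue.CompleteAdding();

            foreach (Thread consumer in consumers)
            {
                consumer.Join();
            }
            return processed;
        }
    }
}
```

A call might look like `FolderPipeline.Run(@"C:\inbox", 4, path => Parse(File.ReadAllText(path)))`, where the folder and `Parse` are placeholders. Note that neither the producer loop nor the consumer loop contains a `lock` statement; the collection does the coordination.
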
Threading

A single producer thread will probably be the most efficient way to load the files from disk and push them onto the queue; multiple consumers then pop items off the queue and process them. I would suggest trying 2-4 consumer threads per core and taking performance measurements to determine which count is optimal (i.e. the number of threads that gives you the maximum throughput). I would not recommend using the ThreadPool for this specific example.
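
The measuring step could be sketched roughly like this. `ConsumerTuning`, `Measure` and `FindBest` are hypothetical names, and the sweep from 1 to 4 consumers per core is an assumption (1 per core is included only as a baseline). It times the same workload with different consumer counts and reports the fastest one.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class ConsumerTuning
{
    // Times one run: the calling thread produces, `consumerCount` threads consume.
    public static TimeSpan Measure(IList<string> items, int consumerCount, Action<string> work)
    {
        using (var queue = new BlockingCollection<string>())
        {
            var consumers = new List<Thread>();
            Stopwatch timer = Stopwatch.StartNew();

            for (int i = 0; i < consumerCount; i++)
            {
                var consumer = new Thread(() =>
                {
                    foreach (string item in queue.GetConsumingEnumerable())
                    {
                        work(item);
                    }
                });
                consumer.Start();
                consumers.Add(consumer);
            }

            foreach (string item in items)
            {
                queue.Add(item);
            }
            queue.CompleteAdding();

            foreach (Thread consumer in consumers)
            {
                consumer.Join();
            }

            timer.Stop();
            return timer.Elapsed;
        }
    }

    // Tries 1x to 4x the core count and returns the consumer count with the shortest run.
    public static int FindBest(IList<string> items, Action<string> work)
    {
        int cores = Environment.ProcessorCount;
        int best = cores;
        TimeSpan bestTime = TimeSpan.MaxValue;

        for (int perCore = 1; perCore <= 4; perCore++)
        {
            int count = cores * perCore;
            TimeSpan elapsed = Measure(items, count, work);
            Console.WriteLine("{0} consumers: {1:F0} ms", count, elapsed.TotalMilliseconds);
            if (elapsed < bestTime)
            {
                bestTime = elapsed;
                best = count;
            }
        }
        return best;
    }
}
```

A single timed run per setting is noisy, so in practice you would repeat each measurement a few times on a representative batch of your real files before trusting the result.
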

P.S. I don't understand the concern about a single point of failure or the use of distributed hash tables. I know DHTs sound like a really cool thing to use, but I would try the conventional methods first unless you have a specific problem in mind that you're trying to solve.