I\'m interested in learning about parallel programming in C#.NET (not like everything there is to know, but the basics and maybe some good-practices), therefore I\'ve decide
This is the reference I use for C# thread: http://www.albahari.com/threading/
As a single PDF: http://www.albahari.com/threading/threading.pdf
For your second approach:
I've worked on some producer/consumer multithreaded apps where each task is some code that loops for ever. An external "initializer" starts a separate thread for each task and initializes an EventWaitHandle for each task. For each task is a global queue that can be used to produce/consume input.
In your case, your external program would add each directory to the queue for Task1, and Set the EventWaitHandler for Task1. Task 1 would "wake up" from its EventWaitHandler, get the count of directories in its queue, and then while the count is greater than 0, get the directory from the queue, scan for all the .jpgs, and add each .jpg location to a second queue, and set the EventWaitHandle for task 2. Task 2 reads its input, processes it, forwards it to a queue for Task 3...
It can be a bit of a pain getting all the locking to work right (I basically lock any access to the queue, even something as simple as getting its count). .NET 4.0 is supposed to have data structures that will automatically support a producer/consumer queue with no locks.
Interesting problem. I came up with two approaches. The first is based on PLinq and the second is based on te Rx Framework.
The first one iterates through the files in parallel. The second one generates asynchronously the files from the directory.
Here is how it looks like in a much simplified version (The first method does require .Net 4.0 since it uses PLinq)
string direcory = "Mydirectory";
var jpegFiles = System.IO.Directory.EnumerateFiles(direcory,"*.jpg");
// -- PLinq --------------------------------------------
jpegFiles
.AsParallel()
.Select(imageFile => new {OldLocation = imageFile, NewLocation = GenerateCopyLocation(imageFile) })
.Do(fileInfo =>
{
if (!File.Exists(fileInfo.NewLocation ) ||
(File.GetCreationTime(fileInfo.NewLocation)) != (File.GetCreationTime(fileInfo.NewLocation)))
File.Copy(fileInfo.OldLocation,fileInfo.NewLocation);
})
.Run();
// -----------------------------------------------------
//-- Rx Framework ---------------------------------------------
var resetEvent = new AutoResetEvent(false);
var doTheWork =
jpegFiles.ToObservable()
.Select(imageFile => new {OldLocation = imageFile, NewLocation = GenerateCopyLocation(imageFile) })
.Subscribe( fileInfo =>
{
if (!File.Exists(fileInfo.NewLocation ) ||
(File.GetCreationTime(fileInfo.NewLocation)) != (File.GetCreationTime(fileInfo.NewLocation)))
File.Copy(fileInfo.OldLocation,fileInfo.NewLocation);
},() => resetEvent.Set());
resetEvent.WaitOne();
doTheWork.Dispose();
// -----------------------------------------------------
There are few a options:
Parallel LINQ: Running Queries On Multi-Core Processors
Task Parallel Library (TPL): Optimize Managed Code For Multi-Core Machines
If you are interested in basic threading primitives and concepts: Threading in C#
[But as @John Knoeller pointed out, the example you gave is likely to be sequential I/O bound]