I wrote something very similar; here are a couple of changes I would recommend.
- Use Directory.EnumerateDirectories instead of GetDirectories. It returns an IEnumerable immediately, so you don't have to wait for the whole directory tree to be read before you start processing.
- Use File.ReadLines instead of ReadAllText. It loads only one line into memory at a time, which is a big deal if you hit a large file.
- If you are using a new enough version of .NET (4.0 or later), use Parallel.ForEach so you can search multiple files at once.
- You may not be able to open every file. Check for read permissions, or add to the manifest that your program requires administrative privileges (you should still check either way). A sketch that puts these points together follows this list.
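Putting those points together, here is a minimal text-search sketch under those assumptions (the method name SearchAllFiles and the Console.WriteLine on a match are placeholders I made up, not code from your project; it needs the System, System.IO and System.Threading.Tasks namespaces):

private static void SearchAllFiles(string folder, string searchTerm)
{
    // EnumerateFiles streams paths as they are discovered, and Parallel.ForEach
    // scans several files at once.
    Parallel.ForEach(Directory.EnumerateFiles(folder, "*.*", SearchOption.AllDirectories), filePath =>
    {
        try
        {
            // ReadLines streams one line at a time instead of loading the
            // whole file into memory the way ReadAllText would.
            foreach (var line in File.ReadLines(filePath))
            {
                if (line.Contains(searchTerm))
                {
                    Console.WriteLine(filePath);    // Replace with whatever you do on a match.
                    break;
                }
            }
        }
        catch (UnauthorizedAccessException)
        {
            // No read permission on this file; skip it rather than crash.
        }
        catch (IOException)
        {
            // File is locked or disappeared mid-scan; skip it.
        }
    });
}

The try/catch means one unreadable file is skipped quietly instead of killing the whole search.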
I was creating a binary search tool; here are some snippets of what I wrote to give you a hand:
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
    Parallel.ForEach(Directory.EnumerateFiles(_folder, _filter, SearchOption.AllDirectories), Search);
}

// _array contains the binary pattern I am searching for.
private void Search(string filePath)
{
    if (Contains(filePath, _array))
    {
        // filePath points at a match.
    }
}
private static bool Contains(string path, byte[] search)
{
    // I am using ReadAllBytes because this is a binary search, not a text search:
    // there are no "lines" to separate on.
    var file = File.ReadAllBytes(path);

    // The upper bound of Parallel.For is exclusive, so the +1 is needed to test
    // the last possible starting position.
    var result = Parallel.For(0, file.Length - search.Length + 1, (i, loopState) =>
    {
        if (file[i] == search[0])
        {
            byte[] localCache = new byte[search.Length];
            Array.Copy(file, i, localCache, 0, search.Length);
            if (localCache.SequenceEqual(search))
                loopState.Stop();    // Found a match; no need to keep scanning.
        }
    });

    // Stop() leaves IsCompleted false, which is how a match is signalled.
    return result.IsCompleted == false;
}
This uses two nested parallel loops. The design is terribly inefficient and could be greatly improved with the Boyer-Moore search algorithm, but I could not find a binary (byte-array) implementation and did not have the time to write one myself when I originally wrote this.
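For what it is worth, the Boyer-Moore-Horspool simplification of Boyer-Moore is only a handful of lines over a byte array. This is a rough, unbenchmarked sketch of what such an implementation could look like (not code from my original tool); you would call it with the File.ReadAllBytes result and the pattern:

private static bool ContainsHorspool(byte[] haystack, byte[] needle)
{
    if (needle.Length == 0)
        return true;
    if (haystack.Length < needle.Length)
        return false;

    // Bad-character table: how far the search window can jump when the byte
    // aligned with the end of the needle does not give a match.
    var shift = new int[256];
    for (int i = 0; i < shift.Length; i++)
        shift[i] = needle.Length;
    for (int i = 0; i < needle.Length - 1; i++)
        shift[needle[i]] = needle.Length - 1 - i;

    int pos = 0;
    while (pos <= haystack.Length - needle.Length)
    {
        // Compare the needle right-to-left against the current window.
        int j = needle.Length - 1;
        while (j >= 0 && haystack[pos + j] == needle[j])
            j--;
        if (j < 0)
            return true;    // Full match at pos.
        // Jump by the shift for the last byte in the window.
        pos += shift[haystack[pos + needle.Length - 1]];
    }
    return false;
}

The shift table lets the window skip up to needle.Length bytes on a mismatch instead of testing every offset, which is where the win over the nested parallel loops would come from.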