I have some directories containing test data, typically over 200,000 small (~4k) files per directory.
I am using C# code to get the number of files in each directory, but with directories this large it is slow.
If you just need a file count, I found that using 'EnumerateFiles()' is much quicker than using 'GetFiles()':
/*
// GetFiles() builds the whole String[] in memory before you can take its Length:
String[] Files = Directory.GetFiles(sPath, "*.*", SearchOption.AllDirectories);
Int32 nCount = Files.Length;
*/
Int32 nCount = 0;
// EnumerateFiles() streams the names lazily, so nothing is accumulated just to count them:
var MyFiles = Directory.EnumerateFiles(sPath, "*.*", SearchOption.AllDirectories);
foreach (String sFile in MyFiles) nCount++;
Console.WriteLine("File Count: {0}", nCount);
The file system is not designed for this layout. You'll have to reorganize it (to have fewer files per folder) if you want to work on that performance problem.
I had a very similar problem with a directory containing (we think) ~300,000 files.
After messing with lots of methods for speeding up access (all unsuccessful) we solved our access problems by reorganising the directory into something more hierarchical.
We did this by creating directories a-z, representing the first letter of the file name, with sub-directories a-z under each of those for the second letter, and then inserted each file in the matching directory, e.g. gbp32.dat went in g/b/gbp32.dat. We re-wrote our file access routines appropriately. This made a massive difference, and it's relatively trivial to do (I think we moved each file using a 10-line Perl script).
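For illustration, here is a minimal C# sketch of that kind of reorganisation. The source and target paths are hypothetical, and bucketing on the first two characters of the file name is an assumption modelled on the example above, not the original Perl script:

using System;
using System.IO;

class BucketFiles
{
    static void Main()
    {
        string sSource = @"C:\data\flat";      // hypothetical flat directory with ~300,000 files
        string sTarget = @"C:\data\bucketed";  // hypothetical destination root

        foreach (string sFile in Directory.EnumerateFiles(sSource))
        {
            string sName = Path.GetFileName(sFile);
            if (sName.Length < 2) continue;    // too short to bucket on two letters

            // The first two letters choose the sub-directory, e.g. gbp32.dat -> g\b\gbp32.dat
            string sBucket = Path.Combine(sTarget,
                                          char.ToLowerInvariant(sName[0]).ToString(),
                                          char.ToLowerInvariant(sName[1]).ToString());
            Directory.CreateDirectory(sBucket);            // no-op if it already exists
            File.Move(sFile, Path.Combine(sBucket, sName));
        }
    }
}

After the move, any code that previously opened sSource + "\\" + sName just derives the two-letter bucket from the name first.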
If you are not afraid of calling Win32 functions, it might be worth trying FindFirstFile and then iterating with FindNextFile. This saves the overhead of allocating all those strings just to get a count.
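A rough sketch of that approach, assuming a Windows target and a single directory (no recursion); the P/Invoke declarations mirror the usual kernel32 signatures, but treat this as an illustration rather than production code:

using System;
using System.IO;
using System.Runtime.InteropServices;

class Win32FileCounter
{
    private const int FILE_ATTRIBUTE_DIRECTORY = 0x10;
    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    public static int CountFiles(string sPath)
    {
        WIN32_FIND_DATA findData;
        IntPtr hFind = FindFirstFile(Path.Combine(sPath, "*"), out findData);
        if (hFind == INVALID_HANDLE_VALUE) return 0;   // directory missing or empty

        int nCount = 0;
        try
        {
            do
            {
                // Skip sub-directories, including the "." and ".." entries.
                if ((findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) == 0)
                    nCount++;
            }
            while (FindNextFile(hFind, out findData));
        }
        finally
        {
            FindClose(hFind);   // always release the search handle
        }
        return nCount;
    }
}

Each iteration still marshals one WIN32_FIND_DATA struct, but nothing is accumulated into a managed array the way GetFiles() does.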