Quicker (quickest?) way to get number of files in a directory with over 200,000 files

Happy的楠姐 2021-02-04 07:35

I have some directories containing test data, typically over 200,000 small (~4k) files per directory.

I am using the following C# code to get the number of files in a di

10 Answers
  •  醉梦人生
    2021-02-04 08:04

    Not with the System.IO.Directory class, there isn't. You'll have to find a way of querying the directory that doesn't involve creating a massive list of files.
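To illustrate what "querying the directory without creating a massive list" can look like, here is a minimal sketch in Python rather than C#. It assumes that streaming directory entries one at a time is acceptable for your case; the function name `count_files` is hypothetical and not part of the original question.

```python
import os


def count_files(directory: str) -> int:
    """Count regular files directly inside `directory`.

    os.scandir yields entries lazily, so no list of 200,000 names
    is ever held in memory at once.
    """
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and do not follow symbolic links.
            if entry.is_file(follow_symlinks=False):
                count += 1
    return count
```

Calling `count_files("/SomeDirectory")` still has to touch every directory entry, so this mainly saves memory and allocation time; it does not make the count instantaneous.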

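    This seems like a bit of an oversight from Microsoft; the Win32 APIs have always had functions, such as FindFirstFile and FindNextFile, that walk a directory one entry at a time, which is enough to count files without building a list.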

    You may also want to consider splitting up your directory. How you manage a 200,000-file directory is beyond me :-)

    Update:

    John Saunders raises a good point in the comments. We already know that (general purpose) file systems are not well equipped to handle this level of storage. One thing that is equipped to handle huge numbers of small "files" is a database.

    If you can identify a key for each (containing, for example, date, hour and customer number), these files should be injected into a database. The 4K record size and 108 million rows (200,000 rows/day * 30 days/month * 18 months) should be easily handled by most professional databases. I know that DB2/z would chew on that for breakfast.

    Then, when you need some test data extracted to files, you have a script/program which just extracts the relevant records onto the file system. Then run your tests to successful completion and delete the files.

    That should make your specific problem quite easy to do:

    select count(*) from test_files where directory_name = '/SomeDirectory'
    

    assuming you have an index on directory_name, of course.
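The database approach above could be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite (via Python's standard `sqlite3` module) stands in for whatever professional database you would actually use, the table name `test_files` and column `directory_name` come from the query above, and the `file_name` and `content` columns, the index name, and all function names are hypothetical.

```python
import sqlite3
from pathlib import Path


def create_store(conn: sqlite3.Connection) -> None:
    """Create the table and the index on directory_name (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_files ("
        " directory_name TEXT NOT NULL,"
        " file_name TEXT NOT NULL,"
        " content BLOB NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_test_files_directory"
        " ON test_files (directory_name)"
    )


def add_file(conn: sqlite3.Connection, directory_name: str,
             file_name: str, content: bytes) -> None:
    """Store one small file as a row."""
    conn.execute(
        "INSERT INTO test_files (directory_name, file_name, content)"
        " VALUES (?, ?, ?)",
        (directory_name, file_name, content),
    )


def count_files_in_store(conn: sqlite3.Connection, directory_name: str) -> int:
    """Run the count query from the answer, with a bound parameter."""
    row = conn.execute(
        "SELECT COUNT(*) FROM test_files WHERE directory_name = ?",
        (directory_name,),
    ).fetchone()
    return row[0]


def extract_files(conn: sqlite3.Connection, directory_name: str,
                  target_dir: str) -> int:
    """Write the matching records out as files; return how many were written."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = 0
    rows = conn.execute(
        "SELECT file_name, content FROM test_files WHERE directory_name = ?",
        (directory_name,),
    )
    for file_name, content in rows:
        (target / file_name).write_bytes(content)
        written += 1
    return written
```

With this layout, the count is answered from the index rather than by walking a directory, and `extract_files` covers the "extract the relevant records onto the file system" step before a test run.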
