Opening many small files on NTFS is way too slow

我寻月下人不归 2021-02-14 14:53

I am writing a program that should process many small files, say thousands or even millions. I've been testing that part on 500k files, and the first step was just to iterate a

5 Answers
  •  遥遥无期
    2021-02-14 15:40

    You might try doing one pass to enumerate the files into a data structure, then open and close them in a second pass, to see whether interleaving the operations is causing contention.
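    A minimal sketch of that two-pass idea in Python (the root path is a hypothetical stand-in for wherever your test files live):

    ```python
    import os

    root = r"C:\data\smallfiles"  # hypothetical location of the 500k test files

    # Pass 1: enumerate every file path into a list; nothing is opened yet.
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))

    # Pass 2: open and read each file, now that enumeration is complete.
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())

    print(f"{len(paths)} files, {total} bytes read")
    ```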

    As I posted in the comments, there are well-known performance concerns with having huge numbers of entries in a single NTFS directory. So if you have control over how those files are distributed across directories, you might want to take advantage of that.
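    One common way to spread files across directories is to bucket them by a hash of the file name, so no single directory ends up with hundreds of thousands of entries. A sketch (the root path, file name, and 256-bucket scheme are assumptions, not something from the question):

    ```python
    import hashlib
    import os

    def bucketed_path(root: str, filename: str) -> str:
        # The first two hex digits of the name's hash pick one of 256 subdirectories.
        bucket = hashlib.sha1(filename.encode("utf-8")).hexdigest()[:2]
        return os.path.join(root, bucket, filename)

    path = bucketed_path(r"C:\data\smallfiles", "record-000123.dat")  # hypothetical names
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(b"example payload")
    ```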

    Also check for anti-malware software on your system. Some of it will slow down every file access by scanning the entire file each time you try to access it. Using Sysinternals Procmon can help you spot this kind of problem.

    When trying to improve performance, it's a good idea to set a goal. How fast is fast enough?

    EDIT: This part of the original answer doesn't apply unless you're using Windows XP or earlier:

    Opening and closing each file will, by default, update the last-access time in the index. You could try an experiment where you turn that feature off via the registry or the command line and see how much of a difference it makes. I'm not sure whether that's feasible in your actual product, since it's a global setting.
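    For reference, the usual command-line route is fsutil's disablelastaccess behavior setting (the matching registry value is NtfsDisableLastAccessUpdate under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem). A small sketch that only queries the current setting, wrapped in Python to match the other examples; fsutil needs an elevated prompt, and changing the value affects the whole machine:

    ```python
    import subprocess

    # Show whether NTFS last-access updates are currently disabled (1) or enabled (0).
    subprocess.run(["fsutil", "behavior", "query", "disablelastaccess"], check=True)

    # To disable the updates for an experiment (run elevated; machine-wide setting):
    # subprocess.run(["fsutil", "behavior", "set", "disablelastaccess", "1"], check=True)
    ```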
