NTFS directory has 100K entries. How much performance boost if spread over 100 subdirectories?

长发绾君心 2021-01-19 00:10

Context: We have a homegrown filesystem-backed caching library. We currently have performance problems with one installation due to the large number of entries (100K) in a single cache directory.

4 Answers
  •  一生所求
    2021-01-19 00:55

    If you never need to stat or list the cache directory, and only ever stat and open files within it by full path, it should not really matter (at least not at the 100k files level) how many files are in the directory.

    Many caching frameworks and filesystem-heavy storage engines create subdirectories based on the first character or two of the filename in such scenarios, so that if you are storing a file "abcdefgh.png" in your cache, it would go into "cache/a/b/cdefgh.png" instead of just "cache/abcdefgh.png". This assumes that the distribution of the first two letters of your file names is roughly uniform across the character space.
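
    A minimal sketch of that layout in Python, assuming a hypothetical "cache" root directory (the names CACHE_ROOT, sharded_path, and store are illustrative, not part of any particular library):

        import os

        CACHE_ROOT = "cache"  # hypothetical root; substitute your cache directory

        def sharded_path(filename: str) -> str:
            # "abcdefgh.png" -> "cache/a/b/cdefgh.png"
            if len(filename) < 3:
                # Names too short to split stay directly under the root.
                return os.path.join(CACHE_ROOT, filename)
            return os.path.join(CACHE_ROOT, filename[0], filename[1], filename[2:])

        def store(filename: str, data: bytes) -> str:
            path = sharded_path(filename)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
            return path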

    As you mentioned, since your primary task that involves listing or traversing the directories is deleting outdated files, I would recommend that you create directories based on the date and/or time the file was cached, e.g. "cache/2010/12/04/22/abcdefgh.png", and, wherever you index the cache, be sure to index it by filename AND date (especially if it's in a database) so that you can quickly remove items by date from the index and then remove the corresponding directory.
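
    A sketch of the date-based layout and the matching expiry pass, with the same assumptions as above (hypothetical CACHE_ROOT, dated_path, and prune_older_than; a year/month/day/hour hierarchy is one reasonable choice, not the only one):

        import os
        import shutil
        from datetime import datetime, timedelta

        CACHE_ROOT = "cache"  # hypothetical root, as in the sketch above

        def dated_path(filename: str, when: datetime) -> str:
            # ("abcdefgh.png", 2010-12-04 22:xx) -> "cache/2010/12/04/22/abcdefgh.png"
            return os.path.join(CACHE_ROOT, when.strftime("%Y"), when.strftime("%m"),
                                when.strftime("%d"), when.strftime("%H"), filename)

        def prune_older_than(cutoff: datetime) -> None:
            # Remove whole hour-level directories older than the cutoff, so
            # expiry never has to list or stat individual cache files.
            for root, _dirs, _files in os.walk(CACHE_ROOT, topdown=False):
                parts = os.path.relpath(root, CACHE_ROOT).split(os.sep)
                if len(parts) != 4:
                    continue  # only act on year/month/day/hour leaves
                try:
                    y, m, d, h = (int(p) for p in parts)
                except ValueError:
                    continue  # directory does not match the layout
                if datetime(y, m, d, h) < cutoff:
                    shutil.rmtree(root)

        # e.g. drop everything cached more than a week ago:
        # prune_older_than(datetime.now() - timedelta(days=7))

    Deleting an entire hour directory is one rmtree call, which is exactly the win over scanning 100K entries for expired ones.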
