How to retrieve a list of directories QUICKLY in Java?

轮回少年 2020-12-01 12:45

Suppose you have a very simple program that lists all the subdirectories of a given directory. Sounds simple enough? Except the only way to list all subdirectories in Java is to u

14 Answers
  • 2020-12-01 12:46

    In that case you might try a JNA solution: a platform-dependent directory traverser (FindFirstFile/FindNextFile on Windows), exposed through some iteration pattern. Java 7 will also have much better file system support, so it is worth checking the specs (I don't remember any specifics).

    Edit: An idea: one option is to hide the slowness of the directory listing from the user. In a client-side app, you could show an animation while the listing runs to distract the user. It really depends on what else your application does besides the listing.
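
    For reference, a minimal sketch of what the newer file system API makes possible, assuming Java 8 or later (`Files.find` arrived after Java 7 itself). The class and method names are hypothetical. The matcher receives each entry's `BasicFileAttributes` during the traversal, which may avoid a separate `isDirectory()` lookup per entry; whether that is actually faster depends on the platform and file system, so measure before relying on it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
final class SubdirLister {
    /** Returns the immediate subdirectories of dir, sorted by path. */
    static List<Path> listSubdirectories(Path dir) throws IOException {
        // maxDepth = 1 visits dir itself plus its direct children only,
        // so the starting directory has to be filtered out explicitly.
        try (Stream<Path> entries = Files.find(dir, 1,
                (path, attrs) -> attrs.isDirectory() && !path.equals(dir))) {
            return entries.sorted().collect(Collectors.toList());
        }
    }
}
```

    Symbolic links are not followed by default, so a link that points at a directory is not reported by this sketch.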

  • 2020-12-01 12:47

    You could hack it if the 150k files all (or a significant number of them) had a similar naming convention like:

    *.jpg
    *Out.txt
    

    and only create File objects for the names that you are unsure about being a folder.
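
    A rough sketch of that idea, assuming the two patterns above are the only file names you can rule out by name. The class name and the suffix list are hypothetical. `File.list()` returns plain strings, so no `File` object is created and no `isDirectory()` call is made for names that match a known file suffix.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class SuffixFilteredLister {
    // Assumed suffixes, taken from the example patterns; adjust to your data.
    private static final String[] KNOWN_FILE_SUFFIXES = {".jpg", "Out.txt"};

    static boolean looksLikeFile(String name) {
        for (String suffix : KNOWN_FILE_SUFFIXES) {
            if (name.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    static List<File> listSubdirectories(File dir) {
        List<File> result = new ArrayList<>();
        // list() returns names only; it is null if dir is not a directory
        // or an I/O error occurs.
        String[] names = dir.list();
        if (names == null) {
            return result;
        }
        for (String name : names) {
            if (looksLikeFile(name)) {
                continue; // ruled out by name, no file system call needed
            }
            File candidate = new File(dir, name);
            if (candidate.isDirectory()) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

    The obvious trade-off: a directory whose name happens to end in one of the suffixes is silently skipped.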

  • 2020-12-01 12:48

    The key problem could be File.isDirectory() being called in a loop.

    File.isDirectory() can be extremely slow. I have seen NFS take 10 seconds to process a directory of 200 files.

    If you can avoid the File.isDirectory() calls altogether (e.g. by testing for an extension, where no extension means directory), you could improve performance drastically.

    Otherwise I would suggest using JNA or JNI, or writing a native script that does this for you.

    The jCifs library lets you manipulate Windows network shares more efficiently. I am not aware of a library that does this for other network file systems.

  • 2020-12-01 12:48

    Well, either use JNI or, if your deployment platform is fixed, just run "dir" on Windows or "ls" on *nix with the appropriate flags to list only directories (via Runtime.exec()).
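
    A sketch of the external-command route. I have swapped in `ProcessBuilder` for `Runtime.exec()` and `find` for `ls`, since `find` can filter on directories without shell globbing; the class and method names are hypothetical. It assumes `cmd` is available on Windows and a `find` that supports `-mindepth`/`-maxdepth` elsewhere (GNU and BSD `find` do, but these flags are not in POSIX).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library.
final class ShellDirLister {
    /** Builds the platform-specific command; kept separate so it can be inspected. */
    static List<String> buildCommand(String dir, boolean windows) {
        if (windows) {
            // dir /b = bare names only, /ad = entries with the directory attribute
            return Arrays.asList("cmd", "/c", "dir", "/b", "/ad", dir);
        }
        // Depth limits restrict the output to direct children; -type d keeps directories.
        return Arrays.asList("find", dir, "-mindepth", "1", "-maxdepth", "1", "-type", "d");
    }

    /** Runs the command and returns its output lines (names on Windows, paths elsewhere). */
    static List<String> listSubdirectories(String dir) throws IOException, InterruptedException {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        Process process = new ProcessBuilder(buildCommand(dir, windows))
                .redirectErrorStream(true) // merge stderr so the pipe cannot fill up unread
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("listing command exited with code " + exitCode);
        }
        return lines;
    }
}
```

    Note that `dir /b /ad` reports an error when there are no subdirectories at all, so on Windows an exception from this sketch can simply mean "empty".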

  • 2020-12-01 12:49

    Maybe you could write a directory searching program in C#/C/C++ and use JNI to call it from Java. I don't know whether this would improve performance.

  • 2020-12-01 12:51

    As has already been mentioned, this is basically a hardware problem. Disk access is always slow, and most file systems aren't really designed to handle directories with that many files.

    If for some reason you have to store all the files in the same directory, I think you'll have to maintain your own cache. This could be done with a local embedded database such as SQLite or HSQLDB. If you want extreme performance, use a Java TreeSet and keep it in memory. At the very least this means you'll read the directory less often, and the reads could possibly be done in the background. You could reduce the need to refresh the list even further by using your system's native file update notification API (inotify on Linux) to subscribe to changes to the directory.
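
    A minimal sketch of the in-memory variant, assuming Java 8 or later. The class name is hypothetical, and the refresh is triggered explicitly here rather than by a change notification; on Java 7+ a `java.nio.file.WatchService` could be one way to call `refresh()` only when the directory actually changes.

```java
import java.io.File;
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class SubdirCache {
    private final File root;
    // volatile: readers see either the old snapshot or the new one, never a partial set
    private volatile SortedSet<String> names = Collections.emptySortedSet();

    SubdirCache(File root) {
        this.root = root;
    }

    /** Re-reads the directory (the slow part); meant to run on a background thread. */
    void refresh() {
        TreeSet<String> fresh = new TreeSet<>();
        File[] children = root.listFiles(File::isDirectory);
        if (children != null) { // null if root is not a directory or on I/O error
            for (File child : children) {
                fresh.add(child.getName());
            }
        }
        names = Collections.unmodifiableSortedSet(fresh);
    }

    /** Returns the last snapshot without touching the disk. */
    SortedSet<String> subdirectories() {
        return names;
    }
}
```

    Readers never wait for the disk: they get the previous snapshot until `refresh()` swaps in the new one.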

    This doesn't seem to be possible for you, but I once solved a similar problem by "hashing" the files into subdirectories. In my case, the challenge was to store a couple of million images with numeric ids. I constructed the directory structure as follows:

    images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg
    

    This has worked well for us, and it's the solution that I would recommend. You could do something similar for alphanumeric filenames by simply taking the first two letters of the filename, and then the next two letters. I've done this once as well, and it did the job.
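
    Both layouts can be sketched in a few lines. The class and method names are hypothetical; the numeric version assumes non-negative ids, and the alphanumeric version assumes names of at least four characters.

```java
// Hypothetical helper, not part of any library.
final class ShardedPaths {
    /** images/[id - (id % 1000000)]/[id - (id % 1000)]/[id].jpg */
    static String numericPath(long id) {
        long millionBucket = id - (id % 1_000_000);
        long thousandBucket = id - (id % 1_000);
        return "images/" + millionBucket + "/" + thousandBucket + "/" + id + ".jpg";
    }

    /** [first two letters]/[next two letters]/[fileName] */
    static String alphaPath(String fileName) {
        if (fileName.length() < 4) {
            throw new IllegalArgumentException("name too short to shard: " + fileName);
        }
        return fileName.substring(0, 2) + "/" + fileName.substring(2, 4) + "/" + fileName;
    }
}
```

    For example, `numericPath(1234567)` gives `images/1000000/1234000/1234567.jpg`, and `alphaPath("holiday.jpg")` gives `ho/li/holiday.jpg`.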
