I have a directory with a large number of files (~1mil). I need to choose a random file from this directory. Since there are so many files, os.listdir
naturally tak
I have a similar need to the OP.
I think I will adopt a precaching approach: store the list of all the files in a .txt file, one name per line, then seek to a random line in that listing (without even having to load it into memory), and you're done!
Of course, you still have to update the cache and, more importantly, decide when to update it, but depending on your needs that may be easy (right after a specific action, when something has changed, etc.).
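For what it's worth, here is a minimal sketch of how the cache could be built. The function name `write_listing` and both path arguments are my own placeholders. It assumes file names contain no newline characters and that the cache file is stored outside the directory being listed (otherwise it would list itself).

```python
import os

def write_listing(directory, cache_path):
    """Write one file name per line to cache_path and return how many were written."""
    count = 0
    with open(cache_path, "w", encoding="utf-8", newline="\n") as cache:
        # os.scandir yields entries lazily, so the full listing
        # is never held in memory as one big list.
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file():
                    cache.write(entry.name + "\n")
                    count += 1
    return count
```

Rebuilding still has to walk the whole directory, so it only pays off if you pick random files much more often than the directory changes.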
Code that cleverly reads a random line from a file, in Python, by Jonathan Kupferman:
http://www.regexprn.com/2008/11/read-random-line-in-large-file-in.html
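
In case the link goes away, here is a rough sketch of the seek-based idea as I understand it. It is not necessarily the code from that article, and `random_line` is just a name I picked. It jumps to a random byte offset, throws away the partial line it landed in, and returns the next full line, wrapping around to the first line when it lands in the last one.

```python
import os
import random

def random_line(path):
    """Return one line from path, found by seeking to a random byte offset."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("listing is empty")
    with open(path, "rb") as fh:
        fh.seek(random.randrange(size))
        fh.readline()        # discard the (probably partial) line we landed in
        line = fh.readline()
        if not line:         # we landed in the last line: wrap to the first one
            fh.seek(0)
            line = fh.readline()
    # Assumes the listing was written as UTF-8, one name per line.
    return line.rstrip(b"\r\n").decode("utf-8")
```

Be aware that this is not perfectly uniform: a line is picked with probability proportional to the length of the line before it, so names following long names come up more often. If that matters, padding every line to a fixed width, or keeping a separate index of line offsets, should remove the bias.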