I am writing a little program that creates an index of all files in my directories. It basically iterates over each file on the disk and stores it in a searchable database.
How about something like this:
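// Requires: import java.io.IOException; import java.io.InputStream;
// Runs a command through the Windows shell and returns everything it writes to stdout.
private static String execute( String command ) throws IOException {
    Process p = Runtime.getRuntime().exec( "cmd /c " + command );
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the stream even if read() throws
    try ( InputStream i = p.getInputStream() ) {
        // Reads one byte at a time and treats it as a char (only safe for ASCII output)
        for( int c ; ( c = i.read() ) > -1 ; ) {
            sb.append( ( char ) c );
        }
    }
    return sb.toString();
}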
( There is a lot of room for improvement there, since that version reads one char at a time. You can pick a better version from here to read the stream faster. )
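One way to read the stream faster is to pull it in chunks instead of single bytes. This is only a minimal sketch: the class name StreamReading, the helper readAll and the 8 KB buffer size are my own choices for illustration, and decoding with the platform default charset is an assumption about what cmd prints on your machine.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

class StreamReading {

    // Reads the whole stream in 8 KB chunks instead of one byte at a time,
    // then decodes the collected bytes once with the given charset.
    static String readAll( InputStream in, Charset charset ) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[ 8192 ]; // buffer size is an arbitrary choice
        int n;
        while ( ( n = in.read( buffer ) ) != -1 ) {
            out.write( buffer, 0, n );
        }
        return new String( out.toByteArray(), charset );
    }

    // Same contract as the execute() method above, but with chunked reads.
    // Assumption: the command output uses the platform default charset.
    static String execute( String command ) throws IOException {
        Process p = Runtime.getRuntime().exec( "cmd /c " + command );
        try ( InputStream in = p.getInputStream() ) {
            return readAll( in, Charset.defaultCharset() );
        }
    }
}
```

Collecting bytes first and decoding once also avoids the ( char ) cast, which breaks on any non-ASCII file name.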
And you pass this as the argument ( the backslashes are doubled because it is a Java string literal ):
"dir /b /s M:\\tests\\"
If this is going to be used in a running app ( rather than being a standalone app ) you can discount the "warm-up" time of the JVM, which is about 1 - 2 seconds depending on your hardware.
You could give it a try to see what the impact is.
Can you jump out of Java?
You could simply use
dir /b /s /on M:\tests\
The /on switch sorts by name.
If you redirect that output to a file ( new.txt in the script below ), you can then diff it against the output from the last run, either in Java or in a batch file. You'd need to get a diff tool, either diff in Cygwin or the excellent http://gnuwin32.sourceforge.net/packages/diffutils.htm
Something like this in DOS:
dir /b /s /on m:\tests >new.txt
diff new.txt archive.txt >diffoutput.txt
del archive.txt
ren new.txt archive.txt
Obviously you could use a Java diff class as well, but I think the thing to accept is that a shell command is nearly always going to beat Java at a file list operation.
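If you do want the comparison on the Java side, here is a minimal sketch. It assumes two listing files like the ones produced by dir above, with one path per line. ListingDiff, added, removed and readListing are hypothetical names, and a plain set difference stands in for a real diff class, which is enough here because each path appears at most once per listing. The charset is something you would have to match to whatever dir wrote on your system.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ListingDiff {

    // Paths that appear in the current listing but not in the previous one.
    static Set<String> added( List<String> previous, List<String> current ) {
        Set<String> result = new TreeSet<>( current );   // sorted for stable output
        result.removeAll( new HashSet<>( previous ) );   // HashSet keeps lookups cheap
        return result;
    }

    // Paths that were in the previous listing but are gone now.
    static Set<String> removed( List<String> previous, List<String> current ) {
        return added( current, previous );
    }

    // Loads a listing file, one path per line.
    // Assumption: the caller knows which charset the listing was written in.
    static List<String> readListing( Path file, Charset charset ) throws IOException {
        return Files.readAllLines( file, charset );
    }
}
```

You would call readListing on archive.txt and new.txt and feed both lists to added and removed to find out which index entries to insert or delete.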
I have heard that this task is very hard to do efficiently. I'm sure MS would have built a similar tool into Windows if it were easy, especially nowadays, since hard drives keep growing.
I haven't checked the implementation or the performance, but commons-io has a listFiles() method in FileUtils. It might be worth a try.
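If you would rather not add a dependency just to try it, the same kind of recursive listing can be sketched with the JDK alone. To be clear, this uses java.nio.file.Files.walk and not the commons-io method, and I have not measured how it compares. FileLister is a made-up class name, and its listFiles method is named after the commons-io one only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FileLister {

    // Recursively collects the absolute paths of all regular files under root,
    // sorted by name (roughly what "dir /b /s /on" prints, minus the directories).
    static List<String> listFiles( Path root ) throws IOException {
        // Files.walk keeps directory handles open, so the stream must be closed.
        try ( Stream<Path> paths = Files.walk( root ) ) {
            return paths.filter( Files::isRegularFile )
                        .map( p -> p.toAbsolutePath().toString() )
                        .sorted()
                        .collect( Collectors.toList() );
        }
    }
}
```

Note that Files.walk throws an UncheckedIOException if it hits a directory it cannot read, so on a whole disk you would probably need Files.walkFileTree with a visitor that skips such directories instead.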