When a directory contains many files (around 4000), the dir() function is very slow. My guess is that it creates a structure and fills in the values in an inefficient way.
Are there any fast and elegant alternatives to using dir()?
Update: Tested on 64-bit Windows 7 with MATLAB R2011a.
Update 2: It takes around 2 seconds to complete.
Which CPU / OS are you using? I just tried it on my machine with a directory with 5000 files and it's pretty quick:
>> d=dir;
>> tic; d=dir; toc;
Elapsed time is 0.062197 seconds.
>> tic; d=ls; toc;
Elapsed time is 0.139762 seconds.
>> tic; d=dir; toc;
Elapsed time is 0.058590 seconds.
>> tic; d=ls; toc;
Elapsed time is 0.063663 seconds.
>> length(d)
ans =
5002
Another alternative to MATLAB's ls and dir functions is to use Java's java.io.File directly in MATLAB:
>> f0=java.io.File('.');
>> tic; x=f0.listFiles(); toc;
Elapsed time is 0.006441 seconds.
>> length(x)
ans =
5000
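A single tic/toc measurement can be skewed by filesystem caching, so a slightly more careful comparison repeats each call and looks at the median. The following is only a sketch: the names folder and nRuns are illustrative and not from the thread. It passes an absolute path to java.io.File because, as far as I know, a relative path such as '.' is resolved against the JVM's working directory, which is not necessarily MATLAB's current folder.

```matlab
% Sketch: compare dir() with java.io.File.list() over several runs.
% 'folder' and 'nRuns' are illustrative choices, not values from the thread.
folder = pwd;               % absolute path, so Java and MATLAB agree on the location
nRuns  = 5;
tDir   = zeros(1, nRuns);   % elapsed seconds for each dir() call
tJava  = zeros(1, nRuns);   % elapsed seconds for each Java listing

for k = 1:nRuns
    t0 = tic;
    d = dir(folder);
    tDir(k) = toc(t0);

    t0 = tic;
    jf = java.io.File(folder);
    names = jf.list();      % java.lang.String[] of entry names
    tJava(k) = toc(t0);
end

fprintf('dir:  median %.4f s (%d entries)\n', median(tDir), length(d));
fprintf('Java: median %.4f s (%d entries)\n', median(tJava), length(names));
```

The entry counts can differ by two, as in the transcript above, because dir usually reports . and .. while the Java listing does not.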
I can confirm Jason S's suggestion on a networked drive, for a directory containing 363 files (Win7 64-bit, MATLAB 2011a).
Both foo and bar below yield the same cell array of filenames (verified using MD5 hashing of the data), but bar, which uses Java, takes significantly less time. I see similar results if I generate bar first and then foo, so this isn't a network caching phenomenon.
>> tic; foo=dir('U:\mydir'); foo={foo(3:end).name}; toc
Elapsed time is 20.503934 seconds.
>> tic;bar=cellf(@(f) char(f.toString()), java.io.File('U:\mydir').list())';toc
Elapsed time is 0.833696 seconds.
>> DataHash(foo)
ans =
84c7b70ee60ca162f5bc0a061e731446
>> DataHash(bar)
ans =
84c7b70ee60ca162f5bc0a061e731446
Here cellf = @(fun, arr) cellfun(fun, num2cell(arr), 'uniformoutput',0); and DataHash is from http://www.mathworks.com/matlabcentral/fileexchange/31272. I skip the first two elements of the array returned by dir because they correspond to the . and .. entries.
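Skipping the first two elements works for the directory above, but as far as I know dir does not promise that . and .. come first: names starting with characters that sort before '.' (for example '!' or '#') can precede them. A minimal sketch that removes the two entries by name instead of by position, with folder as an illustrative placeholder:

```matlab
% Sketch: drop '.' and '..' by name rather than by position.
% 'folder' is an illustrative placeholder, not a path from the thread.
folder = pwd;
d      = dir(folder);
names  = {d.name};                        % 1-by-N cell array of entry names
keep   = ~ismember(names, {'.', '..'});   % true for real files and folders
foo    = names(keep);
```

When comparing the result against the Java listing, it is probably safer to sort both lists first, since java.io.File.list() makes no guarantee about ordering.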
You can try LS. It returns only the file names, as a character array. I haven't tested whether it is faster than DIR.
UPDATE:
I checked on a directory with over 4000 files. Both dir and ls show similar results: about 0.34 sec, which is not bad, I think. (MATLAB 2011a, Windows 7 64-bit)
Is your directory located on a local hard drive or on a network? Maybe defragmenting the hard drive will help?
%Example: list files and folders
Folder = 'C:\'; %can be a relative path
jFile = java.io.File(Folder); %java file object
Names_Only = cellstr(char(jFile.list)) %cellstr
Full_Paths = arrayfun(@char,jFile.listFiles,'UniformOutput',false) %cellstr
%Example: list files (skip folders)
Folder = 'C:\';
jFile = java.io.File(Folder); %java file object
jPaths = jFile.listFiles; %java.io.File objects
jNames = jFile.list; %java.lang.String objects
isFolder = arrayfun(@isDirectory,jPaths); %boolean
File_Names_Only = cellstr(char(jNames(~isFolder))) %cellstr
%Example: simple filter
Folder = 'C:\';
jFile = java.io.File(Folder); %java file object
jNames = jFile.list; %java string objects
Match = arrayfun(@(f)f.startsWith('page')&f.endsWith('.sys'),jNames); %boolean
cellstr(char(jNames(Match))) %cellstr
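The same filter can also be written on the MATLAB side, after the names have been converted to a cell array, using regexp instead of the Java string methods. This is only a sketch of an equivalent pattern; the variable names mirror the example above, and the regular expression is my own translation of the startsWith/endsWith test.

```matlab
% Sketch: filter names with regexp after converting to a cellstr.
Folder = 'C:\';                        % same example folder as above
jFile  = java.io.File(Folder);         % java file object
Names  = cellstr(char(jFile.list));    % cellstr of entry names
% '^page.*\.sys$' : starts with 'page' and ends with '.sys' (case-sensitive)
Match  = ~cellfun('isempty', regexp(Names, '^page.*\.sys$', 'once'));
Names(Match)                           % cellstr of matching names
```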
%Example: list all class methods
methods(handle(jPaths(1)))
methods(handle(jNames(1)))
Source: https://stackoverflow.com/questions/6385531/very-slow-dir