I would like to read a (fairly big) log file into a MATLAB string cell in one step. I have used the usual:
s={};
fid = fopen('test.txt');
tline = fgetl(fid);
w
Use the fgetl function instead of fread. For more info, see its documentation.
I tend to use urlread for this, e.g.:
filename = 'test.txt';
urlname = ['file:///' fullfile(pwd,filename)];
try
str = urlread(urlname);
catch err
disp(err.message)
end
The variable str then contains the whole file as one big character array (ready for regexp to operate on).
The following method is based on what Jonas proposed in his answer, which I like very much. However, what we get is a cell array s rather than a single string.
I found that with one more line of code, we can get a single string variable, as below:
% original code, thanks to Jonas
fid = fopen('test.txt');
s = textscan(fid,'%s','Delimiter','\n');
s = s{1};
% the one additional line that turns s into a single string (line breaks are not kept)
s = cell2mat(reshape(s, 1, []));
I found it useful to prepare text for jsondecode(text). :)
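Note that cell2mat glues the lines together without any separator. That is harmless for JSON, but the line structure is lost. If the line breaks matter, one possible variant is to join the cell with an explicit newline using strjoin. This is only a sketch: the sample cell s below is made up for illustration and stands in for what textscan would return.

```matlab
% Hypothetical cell array of lines, shaped like textscan's output (N-by-1).
s = {'{'; '"name": "test",'; '"value": 42'; '}'};

% Join the lines into one char row vector, keeping a newline between them.
% strjoin expects a row cell array, hence the transpose.
txt = strjoin(s', sprintf('\n'));

% The joined text can then be handed to jsondecode.
data = jsondecode(txt);
disp(data.value);
```

The same strjoin call works on the s produced by the textscan code above, as long as it is transposed to a row first.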
s = regexp(fileread('test.txt'), '(\r\n|\n|\r)', 'split');
The seashells example in MATLAB's regexp documentation is directly on point.
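To see the one-liner in action without a real log file, here is a small self-contained sketch. It writes a temporary file with mixed line endings and splits it back into lines. The file contents and the names tmp, txt and lines are made up for the example.

```matlab
% Write a small sample file (hypothetical contents) with mixed line endings.
tmp = [tempname '.txt'];
fid = fopen(tmp, 'w');
fprintf(fid, 'first line\r\nsecond line\nthird line');
fclose(fid);

% Read the whole file as one char vector, then split on any line ending.
% '\r\n' is listed first so a Windows line ending counts as one break.
txt = fileread(tmp);
lines = regexp(txt, '\r\n|\n|\r', 'split');   % 1-by-N cell array of char vectors

disp(numel(lines));   % number of lines found
delete(tmp);          % clean up the temporary file
```

One thing to keep in mind: if the file ends with a line break, the split leaves an empty string in the last cell, which you may want to drop.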
The main reason your first example is slow is that s grows in every iteration. This means recreating a new array, copying the old lines, and adding the new one, which adds unnecessary overhead.
To speed things up, you can preallocate s:
%# preallocate s as some large cell array
s = cell(10000,1);
sizS = 10000;
lineCt = 1;
fid = fopen('test.txt');
tline = fgetl(fid);
while ischar(tline)
    s{lineCt} = tline;
    lineCt = lineCt + 1;
    %# grow s in chunks if necessary
    if lineCt > sizS
        s = [s;cell(10000,1)];
        sizS = sizS + 10000;
    end
    tline = fgetl(fid);
end
fclose(fid);
%# remove empty entries in s
s(lineCt:end) = [];
Here's a little example of what preallocation can do for you:
>> tic,for i=1:100000,c{i}=i;end,toc
Elapsed time is 10.513190 seconds.
>> d = cell(100000,1);
>> tic,for i=1:100000,d{i}=i;end,toc
Elapsed time is 0.046177 seconds.
>>
EDIT
As an alternative to fgetl, you could use TEXTSCAN:
fid = fopen('test.txt');
s = textscan(fid,'%s','Delimiter','\n');
s = s{1};
This reads the lines of test.txt as strings into the cell array s in one go.
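For completeness, here is a slightly fuller sketch of the TEXTSCAN approach that checks the result of fopen and closes the file afterwards. The sample file and the 'Whitespace' option are additions for illustration. As far as I know, textscan strips leading blanks from each %s field by default, and setting 'Whitespace' to '' is one way to keep the indentation of log lines intact.

```matlab
% Create a small sample file (hypothetical contents) to read back.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'alpha\n  indented beta\ngamma\n');
fclose(fid);

fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
% Read every line as one string; an empty 'Whitespace' keeps leading blanks.
c = textscan(fid, '%s', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);          % release the file handle
s = c{1};             % N-by-1 cell array, one line per cell

disp(numel(s));
delete(fname);        % clean up the temporary file
```

For a real log, replace fname with the path to your file and skip the part that writes the sample.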