I am trying to speed up a script that I have written in Matlab that dynamically allocates memory to a matrix (basicallly reads a line of data from a file and writes it into
To solve your error, simply use this syntax when allocating
data = [data; zeroes(1000, size(data,2))];
You might want to read the first line outside the loop so you'll know the number of columns and make the first allocation for data
.
What I recommend doing, if you know the number of lines and can just guess a large enough number of acceptable columns, use a sparse matrix
.
% create a sparse matrix
mat = sparse(numRows,numCols)
A sparse matrix will not store all of the zero elements, it only stores pointers to indices that are non-zero. This can help save a lot of space. They are used and accessed the same as any other matrix. That is only if you really need it in a matrix format from the beginning.
If not, you can just do everything as a cell
. Preallocate a cell array
with as many elements as lines in your file.
data = cell(1,numLines);
% get matrix from line
for i = 1:numLines
% get matrix from line
data{i} = lineData;
end
data = cell2mat(data);
This method will put everything into a cell array, which can store "dynamically" and then be converted to a regular matrix.
Addition
If you are doing the sparse matrix method, to trim up your matrix once you are done, because your matrix will likely be larger than necessary, you can trim this down easily, and then cast it to a regular matrix.
[val,~] = max(sum(mat ~= 0,2));
mat(:,val:size(mat,2)) = [];
mat = full(mat); % use this only if you really need the full matrix
This will remove any unnecessary columns and then cast it to a full matrix that includes the 0 elements. I would not recommend casting it to a full matrix, as this requires a ton more space, but if you truly need it, use it.
UPDATE
To get the number of lines in a file easily, use MATLAB's perl interpretter
create a file called countlines.pl
and paste in the two lines below
while (<>) {};
print $.,"\n";
Then you can run this script on your file as follows
numLines = str2double(perl('countlines.pl','data.csv'));
Problem solved.
From MATLAB forum thread here
remember it is always best to preallocate everything before hand, because technically when doing shai's method you are reallocating large amounts a lot, especially if it is a large file.
If you want to stick to your code as written I would substitute your initialization of data, data = []
to
data = zeros(1,1000);
Keep in mind though the warning from @MZimmerman6: zeros(1000)
generates a 1000 x 1000 array. You may want to change all of your zeros
statements to zeros( ... ,Nc)
, where Nc = length of line in characters.