A = [1,4,2,5,10
2,4,5,6,2
2,1,5,6,10
2,3,5,4,2]
I want to split it into two matrices, B and C, according to the value in the last column:
B
Use accumarray in combination with histc:
% Example data (from Mohsen Nosratinia)
A = [...
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 1
1 0 4 5 9
1 2 4 3 1];
% Get the proper indices to the specific rows
B = sort(A(:,end));
[~,b] = histc(A(:,end), B([diff(B)>0;true]));
% Collect all specific rows in their specific groups
C = accumarray(b, (1:size(A,1))', [], @(r) {A(r,:)} );
Results:
>> C{:}
ans =
1 3 4 5 1
1 2 4 3 1
ans =
2 3 5 4 2
2 4 5 6 2
ans =
0 3 1 4 9
1 0 4 5 9
ans =
2 1 5 6 10
1 4 2 5 10
Note that
B = sort(A(:,end));
[~,b] = histc(A(:,end), B([diff(B)>0;true]));
can also be written as
[~,b] = histc(A(:,end), unique(A(:,end)));
but unique is not a built-in (it is implemented as an M-file) and is therefore likely to be slower, especially when this is all used in a loop.
Note also that the order of the rows within each group has changed relative to the order they had in the original matrix. If the order matters, you'll have to throw in another sort:
C = accumarray(b, (1:size(A,1))', [], @(r) {A(sort(r),:)} );
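As an alternative sketch (not part of the answer above), the group index can also be taken directly from the third output of unique, which avoids histc altogether; recent MATLAB documentation marks histc as not recommended. The variable name keys is introduced here for illustration only, and the small 4-row matrix from the question is assumed as input.

```matlab
% Example data from the question
A = [1 4 2 5 10
     2 4 5 6 2
     2 1 5 6 10
     2 3 5 4 2];

% keys: sorted distinct values of the last column
% b:    for each row of A, the index of its value within keys
[keys, ~, b] = unique(A(:,end));

% Collect the rows of each group; sort(r) keeps the original row order
C = accumarray(b, (1:size(A,1))', [], @(r) {A(sort(r),:)});
```

Here C{k} holds the rows whose last column equals keys(k), so the cell array and the key vector can be used together as a simple lookup.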
Here is a general approach which will work for any number of distinct values in the last column, on a matrix of any size:
A = [1,4,2,5,10
2,4,5,6,2
1,1,1,1,1
2,1,5,6,10
2,3,5,4,2
0,0,0,0,2];
First sort by the last column (there are many ways to do this; I don't know whether this is the best one):
[~, order] = sort(A(:,end));
As = A(order,:);
Then create a vector holding the number of rows that share each value in the last column (i.e. how many rows there are per group):
rowDist = diff(find([1; diff(As(:, end)); 1]));
Note that for my example data rowDist will equal [1 3 2], as there is one 1, three 2s and two 10s.
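To see why the diff/find expression yields the group sizes, it may help to unpack it one step at a time. The sketch below assumes the sorted last column of the example data; the intermediate names lastCol, changes and starts are introduced here for illustration only.

```matlab
lastCol = [1; 2; 2; 2; 10; 10];    % As(:, end) for the example data

% Nonzero wherever a new group starts, padded with a 1 at both ends
changes = [1; diff(lastCol); 1];   % [1; 1; 0; 0; 8; 0; 1]

% Row index at which each group starts, plus one past the last row
starts = find(changes);            % [1; 2; 5; 7]

% Distance between consecutive starts = number of rows per group
rowDist = diff(starts);            % [1; 3; 2]
```

Note that rowDist comes out as a column vector, which mat2cell accepts as the row-distribution argument.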
Now use mat2cell to split by these row groupings:
Ac = mat2cell(As, rowDist);
If you really want to, you can now split it into separate matrices (but I doubt you would):
Ac{:}
results in
ans =
1 1 1 1 1
ans =
0 0 0 0 2
2 3 5 4 2
2 4 5 6 2
ans =
1 4 2 5 10
2 1 5 6 10
But I think you would find Ac itself more useful.
EDIT:
There are many solutions now, so we might as well do a time comparison:
A = [...
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 3
1 0 4 5 9
1 2 4 3 1];
A = repmat(A, 1000, 1);
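% Method 1: sort + mat2cell
tic
for l = 1:100
    [~, y] = sort(A(:,end));
    As = A(y,:);
    rowDist = diff(find([1; diff(As(:, end)); 1]));
    Ac = mat2cell(As, rowDist);
end
toc
% Method 2: arrayfun + logical indexing
tic
for l = 1:100
    D=arrayfun(@(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
end
toc
% Method 3: explicit loop + logical indexing
% e must exist before the loop runs; it is assumed here to be the unique
% last-column values, as in the logical-indexing answer. It is computed
% outside the timed region.
e = unique(A(:,end));
B = cell(size(e));
tic
for l = 1:100
    for k = 1:numel(e)
        B{k} = A(A(:,end)==e(k),:);
    end
end
toc
% Method 4: histc + accumarray
tic
for l = 1:100
    Bb = sort(A(:,end));
    [~,b] = histc(A(:,end), Bb([diff(Bb)>0;true]));
    C = accumarray(b, (1:size(A,1))', [], @(r) {A(r,:)} );
end
toc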
resulted in
Elapsed time is 0.053452 seconds.
Elapsed time is 0.17017 seconds.
Elapsed time is 0.004081 seconds.
Elapsed time is 0.22069 seconds.
So even for a large matrix, the loop method is still the fastest.
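A single tic/toc run can be noisy, so as a hedged alternative the timings could be repeated with timeit, which calls a function handle several times and returns a typical run time. The sketch below is only an illustration: it uses a repeated copy of the 4-row matrix from the question rather than the data above, it covers only the two methods that fit in an anonymous function, and the names fArrayfun, fAccum, tArrayfun and tAccum are made up for this example.

```matlab
% Larger test matrix built from the question's data
A = repmat([1 4 2 5 10; 2 4 5 6 2; 2 1 5 6 10; 2 3 5 4 2], 1000, 1);

% Group index for the accumarray variant, computed outside the timed call
[~, ~, b] = unique(A(:,end));

% Wrap each method in a zero-argument function handle
fArrayfun = @() arrayfun(@(x) A(A(:,end)==x,:), unique(A(:,end)), ...
                         'UniformOutput', false);
fAccum    = @() accumarray(b, (1:size(A,1))', [], @(r) {A(r,:)});

% timeit runs each handle repeatedly and returns a time in seconds
tArrayfun = timeit(fArrayfun);
tAccum    = timeit(fAccum);

fprintf('arrayfun:   %g s\naccumarray: %g s\n', tArrayfun, tAccum);
```

Because the accumarray handle excludes the cost of computing b, the two numbers are not a like-for-like comparison; the point is only to show how timeit can replace the manual loop around tic and toc.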
Use logical indexing
B=A(A(:,end)==10,:);
C=A(A(:,end)==2,:);
returns
>> B
B =
1 4 2 5 10
2 1 5 6 10
>> C
C =
2 4 5 6 2
2 3 5 4 2
EDIT: In reply to Dan's comment, here is the extension to the general case:
e = unique(A(:,end));
B = cell(size(e));
for k = 1:numel(e)
    B{k} = A(A(:,end)==e(k),:);
end
or, more compactly:
B=arrayfun(@(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
so for
A =
1 4 2 5 10
2 4 5 6 2
2 1 5 6 10
2 3 5 4 2
0 3 1 4 9
1 3 4 5 1
1 0 4 5 9
1 2 4 3 1
you get the matrices as the elements of the cell array B:
>> B{1}
ans =
1 3 4 5 1
1 2 4 3 1
>> B{2}
ans =
2 4 5 6 2
2 3 5 4 2
>> B{3}
ans =
0 3 1 4 9
1 0 4 5 9
>> B{4}
ans =
1 4 2 5 10
2 1 5 6 10
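Since e and B line up element by element, the group for a particular value can be picked out by comparing against e. This is a small sketch on the 4-row matrix from the question; the name rows10 is introduced here for illustration only.

```matlab
% Example data from the question
A = [1 4 2 5 10
     2 4 5 6 2
     2 1 5 6 10
     2 3 5 4 2];

e = unique(A(:,end));              % distinct last-column values: [2; 10]
B = cell(size(e));                 % one cell per distinct value
for k = 1:numel(e)
    B{k} = A(A(:,end) == e(k), :); % rows whose last column equals e(k)
end

% Look up the group for the value 10 via a logical index into B
rows10 = B{e == 10};
```

The lookup B{e == 10} assumes the value is actually present in e; for a value that does not occur, the logical index is all false and nothing is returned.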