Split a matrix according to a column in MATLAB

别那么骄傲 2021-01-14 13:57
A = [1,4,2,5,10
     2,4,5,6,2
     2,1,5,6,10
     2,3,5,4,2]

I want to split it into two matrices, B and C, according to the value in the last column.

B         


        
3 answers
  • 2021-01-14 14:36

    Use accumarray in combination with histc:

    % Example data (from Mohsen Nosratinia)
    A = [...
         1     4     2     5    10
         2     4     5     6     2
         2     1     5     6    10
         2     3     5     4     2
         0     3     1     4     9
         1     3     4     5     1
         1     0     4     5     9
         1     2     4     3     1];
    
    % Get the proper indices to the specific rows
    B = sort(A(:,end)); 
    [~,b] = histc(A(:,end), B([diff(B)>0;true]));
    
    % Collect all specific rows in their specific groups
    C = accumarray(b, (1:size(A,1))', [], @(r) {A(r,:)} );
    

    Results:

    >> C{:}
    ans =
         1     3     4     5     1
         1     2     4     3     1
    ans =
         2     3     5     4     2
         2     4     5     6     2
    ans =
         0     3     1     4     9
         1     0     4     5     9
    ans =
         2     1     5     6    10
         1     4     2     5    10
    

    Note that

    B = sort(A(:,end)); 
    [~,b] = histc(A(:,end), B([diff(B)>0;true]));
    

    can also be written as

    [~,b] = histc(A(:,end), unique(A(:,end)));
    

    but unique is not built-in and is therefore likely to be slower, especially when this is all used in a loop.

    Note also that the order of the rows has changed w.r.t. the order they had in the original matrix. If the order matters, you'll have to throw in another sort:

    C = accumarray(b, (1:size(A,1))', [], @(r) {A(sort(r),:)} );
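
    Recent MATLAB documentation lists histc as not recommended. As a minimal sketch of the same grouping without it, assuming the three-output form of unique is acceptable despite the speed concern noted above, the group index can be taken directly from unique (the name vals is illustrative and not part of the original answer):

    ```matlab
    % Small example matrix (the one from the question)
    A = [1 4 2 5 10
         2 4 5 6  2
         2 1 5 6 10
         2 3 5 4  2];

    % The third output of unique maps each row to the index of its group value
    [vals, ~, b] = unique(A(:,end));

    % Collect the rows of each group; sort(r) keeps their original relative order
    C = accumarray(b, (1:size(A,1))', [], @(r) {A(sort(r),:)});
    ```

    Here vals holds the distinct values of the last column in ascending order, so C{k} contains the rows whose last entry equals vals(k).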
    
  • 2021-01-14 14:38

    Here is a general approach that works for any number of distinct values in the last column and for a matrix of any size:

    A = [1,4,2,5,10
         2,4,5,6,2
         1,1,1,1,1
         2,1,5,6,10
         2,3,5,4,2
         0,0,0,0,2];
    

    First, sort by the last column (there are many ways to do this; I don't know whether this is the best one):

    [~, order] = sort(A(:,end));
    As = A(order,:);
    

    Then create a vector holding the number of rows in each group, i.e. how many consecutive rows share the same value in the last column:

    rowDist = diff(find([1; diff(As(:, end)); 1]));
    

    Note that for my example data rowDist will equal [1; 3; 2], as there is one 1, three 2s and two 10s. Now use mat2cell to split by these row groupings:

    Ac = mat2cell(As, rowDist);
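
    To see why the diff/find expression yields the group sizes, the intermediate values can be traced on the sorted last column of the example data. This is only an illustrative sketch; the names lastCol, boundaries and starts are not part of the original answer:

    ```matlab
    % Sorted last column of the example matrix
    lastCol = [1; 2; 2; 2; 10; 10];

    % Nonzero wherever a new group starts, with a leading 1 for the first
    % group and a trailing 1 as a sentinel one past the last row
    boundaries = [1; diff(lastCol); 1];   % [1; 1; 0; 0; 8; 0; 1]

    % Row index at which each group starts, plus the sentinel position
    starts = find(boundaries);            % [1; 2; 5; 7]

    % Distance between consecutive starts = number of rows per group
    rowDist = diff(starts);               % [1; 3; 2]
    ```

    Because the sizes sum to the number of rows, mat2cell can use rowDist directly as its row-partition argument.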
    

    If you really want to, you can now split it into separate matrices (but I doubt you would):

    Ac{:}
    

    results in

    ans =
    
       1   1   1   1   1
    
    ans =
    
       2   4   5   6   2
       2   3   5   4   2
       0   0   0   0   2
    
    ans =
    
        1    4    2    5   10
        2    1    5    6   10
    

    But I think you would find Ac itself more useful.

    EDIT:

    With this many solutions, we might as well do a time comparison:

    A = [...
         1     4     2     5    10
         2     4     5     6     2
         2     1     5     6    10
         2     3     5     4     2
         0     3     1     4     9
         1     3     4     5     3
         1     0     4     5     9
         1     2     4     3     1];
    
    A = repmat(A, 1000, 1);
    
    tic
    for l = 1:100
      [~, y] = sort(A(:,end));
      As = A(y,:);
      rowDist = diff(find([1; diff(As(:, end)); 1]));
      Ac = mat2cell(As, rowDist);
    end
    toc
    
    tic
    for l = 1:100
      D=arrayfun(@(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
    end
    toc
    
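    e = unique(A(:,end));   % group values; computed outside the timed region
    B = cell(size(e));      % preallocate the output cell array
    tic
    for l = 1:100
      for k = 1:numel(e)
          B{k} = A(A(:,end)==e(k),:);
      end
    end
    toc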
    
    tic
    for l = 1:100
      Bb = sort(A(:,end)); 
      [~,b] = histc(A(:,end), Bb([diff(Bb)>0;true]));
      C = accumarray(b, (1:size(A,1))', [], @(r) {A(r,:)} );
    end
    toc
    

    resulted in

    Elapsed time is 0.053452 seconds.
    Elapsed time is 0.17017 seconds.
    Elapsed time is 0.004081 seconds.
    Elapsed time is 0.22069 seconds.
    

    So even for a large matrix the loop method is still the fastest, although the loop as timed above does not include the call to unique.
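
    Single tic/toc runs can be noisy. As a hedged sketch, the same comparison could be made with timeit, which calls a function handle several times and returns a typical run time. This assumes a MATLAB release that allows local functions in scripts (R2016b or later); the helper names splitLoop and splitMat2cell are hypothetical wrappers around two of the approaches above, and here unique is included in the loop timing:

    ```matlab
    % Test matrix: a small example replicated to 4000 rows
    A = repmat([1 4 2 5 10; 2 4 5 6 2; 2 1 5 6 10; 2 3 5 4 2], 1000, 1);

    % timeit runs each handle repeatedly and reports a representative time
    tLoop     = timeit(@() splitLoop(A));
    tMat2cell = timeit(@() splitMat2cell(A));
    fprintf('loop: %.6f s, mat2cell: %.6f s\n', tLoop, tMat2cell);

    function B = splitLoop(A)
        % Logical indexing in a loop, one pass per distinct value
        e = unique(A(:,end));
        B = cell(size(e));
        for k = 1:numel(e)
            B{k} = A(A(:,end)==e(k),:);
        end
    end

    function Ac = splitMat2cell(A)
        % Sort by the last column, then cut into blocks of equal value
        [~, order] = sort(A(:,end));
        As = A(order,:);
        rowDist = diff(find([1; diff(As(:,end)); 1]));
        Ac = mat2cell(As, rowDist);
    end
    ```

    The absolute numbers will depend on the machine and MATLAB version, so no timings are quoted here.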

  • 2021-01-14 14:46

    Use logical indexing:

    B=A(A(:,end)==10,:);
    C=A(A(:,end)==2,:);
    

    returns

    >> B
    B =
         1     4     2     5    10
         2     1     5     6    10
    
    >> C
    C =
         2     4     5     6     2
         2     3     5     4     2
    

    EDIT: In reply to Dan's comment, here is the extension to the general case:

    e = unique(A(:,end));
    B = cell(size(e));
    for k = 1:numel(e)
        B{k} = A(A(:,end)==e(k),:);
    end
    

    or, more compactly:

    B=arrayfun(@(x) A(A(:,end)==x,:), unique(A(:,end)), 'UniformOutput', false);
    

    so for

    A =
         1     4     2     5    10
         2     4     5     6     2
         2     1     5     6    10
         2     3     5     4     2
         0     3     1     4     9
         1     3     4     5     1
         1     0     4     5     9
         1     2     4     3     1
    

    you get the matrices as the elements of cell array B:

    >> B{1}
    ans =
         1     3     4     5     1
         1     2     4     3     1
    
    >> B{2}
    ans =
         2     4     5     6     2
         2     3     5     4     2
    
    >> B{3}
    ans =
         0     3     1     4     9
         1     0     4     5     9
    
    >> B{4}
    ans =
         1     4     2     5    10
         2     1     5     6    10
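
    For readers on newer releases, the same grouping can probably also be expressed with findgroups and splitapply. This is a sketch under the assumption that R2015b or later is available (both functions were introduced there); it is an alternative idiom, not the method used in this answer:

    ```matlab
    % Small example matrix (the one from the question)
    A = [1 4 2 5 10
         2 4 5 6  2
         2 1 5 6 10
         2 3 5 4  2];

    % G(i) is the group number of row i; vals lists the value for each group
    [G, vals] = findgroups(A(:,end));

    % Wrap each group's rows in a 1x1 cell so the results stack into a cell array
    B = splitapply(@(rows) {rows}, A, G);
    ```

    B{k} then holds the rows whose last entry equals vals(k), much like the cell array built by the loop above.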
    