Finding all indices by ismember

野性不改 2021-02-04 11:18

This is what is described in one of the examples for ismember:

Define two vectors with values in common.

A = [5 3 4 2]; B = [2 4 4

5 answers
  •  傲寒 (original poster)
    2021-02-04 11:55

    IMHO, the most elegant solutions (i.e. those that avoid repeated calls to find) either swap the inputs to ismember and group like indexes with accumarray, as in Eitan's answer, or vectorize the find with bsxfun, as in Luis Mendo's answer.
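
    For reference, the first idea can be sketched with documented functions only. This is my reading of the approach, not a copy of Eitan's answer. It assumes A has no repeated values (ismember reports only one matching index in A), and the names tf, loc, subs, vals and allInds are illustrative:

    ```matlab
    A = [5 3 4 2];
    B = [2 4 4 4 6 8];

    % Swap the inputs: for each element of B, loc holds its position in A (0 if absent).
    [tf, loc] = ismember(B, A);

    % Group the positions in B by the element of A they matched.
    subs = loc(tf);           % index into A for every matching element of B
    vals = find(tf);          % the corresponding index into B
    allInds = accumarray(subs(:), vals(:), [numel(A) 1], @(v) {sort(v).'});
    ```

    Elements of A with no match in B are left as empty cells, and allInds{3} holds the three positions of the value 4.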

    However, for those interested in a solution that relies on undocumented functionality, and an admittedly hackish approach, here is another way to do it (i.e. for each element of A, find the indexes of all corresponding elements in B). The thinking goes as follows: in a sorted B, what if you had the first and last indexes of each matching element? It turns out that two helper functions used by ismember (if you have R2012b+, I think) give you exactly these indexes: _ismemberfirst (a builtin) returns the first and ismembc2 returns the last.

    For the example data A = [5 3 4 2]; B = [2 4 4 4 6 8]; in the question, here is the implementation:

    [Bs,sortInds] = sort(B); % nop for this B, but required in general
    firstInds = builtin('_ismemberfirst',A,Bs) % newish version required
    firstInds =
         0     0     2     1
    lastInds = ismembc2(A,Bs)
    lastInds =
         0     0     4     1
    

    The heavy lifting is now done: we have the first and last indexes in B for each element in A without having to do any looping. There is no occurrence of A(1) or A(2) (5 or 3) in B, so those indexes are 0. The value 4 (A(3)) occurs at locations 2:4 (i.e. all(B(2:4)==A(3))). Similarly, A(4) is found at B(1:1).
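
    If the undocumented helpers are unavailable, the same first and last indexes can probably be recovered with two ordinary ismember calls. This sketch assumes row vectors and a release (R2013a or later, as far as I know) where ismember returns the lowest matching index; tf, locRev and the other names are illustrative:

    ```matlab
    A = [5 3 4 2];
    Bs = sort([2 4 4 4 6 8]);

    % Lowest index of each element of A in the sorted Bs (0 if absent).
    [tf, firstInds] = ismember(A, Bs);

    % Search the reversed array, then map positions back to get the highest index.
    [~, locRev] = ismember(A, fliplr(Bs));
    lastInds = (numel(Bs) + 1 - locRev) .* tf;   % multiply by tf to keep 0 for misses
    ```

    This costs two full ismember calls, so it is unlikely to be as fast as the helper functions.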

    We are able to ignore sortInds in the above example since B is already sorted, but an unsorted B is handled by simply looking up the locations in the unsorted array. We can quickly do this lookup and package each range of indexes with arrayfun, keeping in mind that the computationally intensive task of actually finding the indexes is already done. For an element with no match, both indexes are 0, and the -(x==0) term turns the range into 0:-1, which is empty:

    allInds = arrayfun(@(x,y)sortInds(x:y-(x==0)),firstInds,lastInds,'uni',0)
    allInds = 
        [1x0 double]    [1x0 double]    [1x3 double]    [1]
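
    The anonymous function packs a fair amount into one line. An equivalent explicit loop, as a sketch with firstInds, lastInds and sortInds entered by hand from the values above:

    ```matlab
    firstInds = [0 0 2 1];
    lastInds  = [0 0 4 1];
    sortInds  = 1:6;                      % identity, because B was already sorted

    allInds = cell(size(firstInds));
    for k = 1:numel(firstInds)
        if firstInds(k) == 0
            allInds{k} = zeros(1, 0);     % no match: same result as the empty range 0:-1
        else
            allInds{k} = sortInds(firstInds(k):lastInds(k));
        end
    end
    ```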
    

    Each cell has the indexes in B (if any) of each element of A. The first two cells are empty arrays, as expected. Looking closer at the third element:

    >> allInds{3}
    ans =
         2     3     4
    >> A(3)
    ans =
         4
    >> B(allInds{3})
    ans =
         4     4     4
    

    Testing operation with unsorted B:

    B(4:5) = B([5 4])
    B =
         2     4     4     6     4     8
    
    [Bs,sortInds] = sort(B);
    firstInds = builtin('_ismemberfirst',A,Bs);
    lastInds = ismembc2(A,Bs);
    allInds = arrayfun(@(x,y)sortInds(x:y-(x==0)),firstInds,lastInds,'uni',0);
    
    allInds{3} % of A(3) in B
    ans =
         2     3     5
    
    B(allInds{3})
    ans =
         4     4     4
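
    A quick way to check such a result is to compare it with the brute-force answer, one find per element of A. In this sketch allInds is typed in from the outputs above rather than recomputed, so that it runs without the undocumented functions:

    ```matlab
    A = [5 3 4 2];
    B = [2 4 4 6 4 8];
    allInds = {zeros(1,0), zeros(1,0), [2 3 5], 1};   % result for the unsorted B

    % Reference answer: indexes in B of every occurrence of each A(k).
    expected = arrayfun(@(a) find(B == a), A, 'UniformOutput', false);

    % Sort before comparing, since the order within a cell is not the point.
    isSame = cellfun(@(got, ref) isequal(sort(got), ref), allInds, expected);
    assert(all(isSame), 'allInds disagrees with the brute-force result');
    ```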
    

    Is it worth doing it this way, with a penalty for sort and two effective ismember calls? Maybe not, but I think it's an interesting solution. If you have a sorted B, it's even faster since the two built-in functions assume the second argument (Bs) is sorted and waste no time with checks. Try and see what works for you.
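
    One way to try it is a rough timing comparison. The sketch below uses only documented functions and made-up sizes, and it approximates the cost of the sorted approach as one sort plus two ismember calls on the sorted array. timeit needs R2013b or later, and the numbers will vary by machine and release:

    ```matlab
    rng(0);                                 % fixed seed for repeatable data
    A = randperm(1000, 200);                % hypothetical query values
    B = randi(1000, 1, 20000);              % hypothetical unsorted data

    % Baseline: one find per element of A.
    tLoop = timeit(@() arrayfun(@(a) find(B == a), A, 'UniformOutput', false));

    % Sorted approach, approximated as sort + two lookups on the sorted array.
    tSort = timeit(@() sort(B));
    Bs = sort(B);
    tMember = timeit(@() ismember(A, Bs));

    fprintf('find loop: %.4g s, sort + 2 lookups: %.4g s\n', tLoop, tSort + 2*tMember);
    ```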
