Fastest way of finding repeated values in different cell arrays of different size

Question

The problem is the following:

I have a cell array indx, where each indx{jj} is a 1xNjj array, meaning the cells all have different sizes. In my case max(jj)==3, but let's consider the general case for the sake of it.

How would you find the value(s) that appear in every indx{jj} in the fastest way?

I can guess how to do it with several for loops, but is there a "one (three?) liner"?

Simple example:

indx{1}=[ 1 3 5 7 9];
indx{2}=[ 2 3 4 1];
indx{3}=[ 1 2 5 3 3 5 4];


ans=[1 3];

Answer 1:

Almost no-loop approach ("almost" because cellfun essentially uses loop(s) inside it, but its effect here is minimal, as we use it only to find the number of elements in each cell) -

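lens = cellfun(@numel,indx);                 %// number of elements in each cell

val_ind = bsxfun(@ge,lens,[1:max(lens)]');   %// mask of valid positions, one column per cell
vals = horzcat(indx{:});

mat1 = nan(max(lens),numel(lens));           %// NaN padding can never equal a real value
mat1(val_ind) = vals;

unqvals = unique(vals);
%// a value is kept if it occurs in any row of every column
out = unqvals(all(any(bsxfun(@eq,mat1,permute(unqvals,[1 3 2])),1),2));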
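To make the padding step concrete, here is a small trace on the sample data from the question. It is only a sketch: it pads with NaN (an assumption made here so that padding can never match a real value), assumes indx is a 1xN cell of row vectors, and replaces the final 3-D bsxfun comparison with a plainer per-value check via arrayfun, which is easier to read but not necessarily as fast.

```matlab
% Sample data from the question
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

lens = cellfun(@numel, indx);                     % [5 4 7]
val_ind = bsxfun(@ge, lens, (1:max(lens)).');     % 7x3 mask, true where a value exists

mat1 = nan(max(lens), numel(lens));               % one column per cell, NaN-padded
mat1(val_ind) = horzcat(indx{:});
% mat1 =
%      1     2     1
%      3     3     2
%      5     4     5
%      7     1     3
%      9   NaN     3
%    NaN   NaN     5
%    NaN   NaN     4

unqvals = unique(horzcat(indx{:}));               % [1 2 3 4 5 7 9]

% Keep a value only if it appears somewhere in every column
found = arrayfun(@(v) all(any(mat1 == v, 1)), unqvals);
out = unqvals(found);                             % [1 3]
```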

Answer 2:

One possibility is to use a for loop with intersect:

result = indx{1}; %// will be changed
for n = 2:numel(indx)
    result = intersect(result, indx{n});
end
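As a quick check, the loop can be run on the sample data from the question. The sketch below adds two small things that are not in the original answer: a call to unique on the first cell, so the result is deduplicated even when indx has only one cell, and an early exit once the running intersection is empty. The variable name common is just illustrative.

```matlab
% Sample data from the question
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

common = unique(indx{1});          % start from the first cell, deduplicated
for n = 2:numel(indx)
    common = intersect(common, indx{n});
    if isempty(common)
        break;                     % nothing can be common to all cells any more
    end
end

disp(common)                       % displays the values 1 and 3
```

Since intersect returns sorted, unique values, repeated entries such as the two 3s in indx{3} do not affect the result.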

Answer 3:

Another possibility that I could suggest, though Luis Mendo's answer is very good, is to take all of the vectors in your cell array and remove the duplicates within each one. This can be done through cellfun, specifying unique as the function to apply. You have to set the UniformOutput flag to false, because the output for each cell is a vector rather than a scalar. You also have to be careful, as the vectors in the cells are assumed to be all row vectors or all column vectors. You can't mix the way the arrays are shaped or this method won't work.

Once you do this, concatenate all of the vectors into a single array through cell2mat, then compute a histogram through histc. Specify the edges to be all of the unique numbers in that single array, which requires one additional call to unique on the concatenated array before proceeding. Once you have the histogram, any entry whose bin count equals the total number of cells in your cell array (which is 3 in your case) is a value that appears in all of your cells. As such:

A = cell2mat(cellfun(@unique, indx, 'uni', 0));
edge_values = unique(A);
h = histc(A, edge_values);
result = edge_values(h == numel(indx));

With the unique call for each cell array, if a number appears in every single cell, then the total number of times you see this number should equal the total number of cells you have.
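The counting idea can be traced on the sample data from the question. The sketch below is a variant, not the answer's exact code: it counts with accumarray and the third output of unique instead of histc, which the MATLAB documentation lists as not recommended. It assumes every cell holds a row vector.

```matlab
% Sample data from the question
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

% Deduplicate within each cell, then concatenate (row vectors assumed)
A = cell2mat(cellfun(@unique, indx, 'UniformOutput', false));
% A = [1 3 5 7 9 1 2 3 4 1 2 3 4 5]

% idx(k) is the position of A(k) within edge_values
[edge_values, ~, idx] = unique(A);
% edge_values = [1 2 3 4 5 7 9]

% Count how many cells each distinct value appears in
counts = accumarray(idx(:), 1).';
% counts = [3 2 3 2 2 1 1]

% Values seen in every cell have a count equal to the number of cells
result = edge_values(counts == numel(indx));
% result = [1 3]
```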

Source: https://stackoverflow.com/questions/25851305/fastest-way-of-finding-repeated-values-in-different-cell-arrays-of-different-siz

Tags

matlab

cell-array