Fastest way of finding repeated values in different cell arrays of different size

Submitted by 走远了吗 on 2019-12-18 06:59:18

Question


The problem is the following:

I have a cell array of the form indx{jj}, where each indx{jj} is a 1xNjj array, meaning they all have different sizes. In my case max(jj)==3, but let's consider the general case for the sake of it.

How would you find the value(s) repeated in all the indx{jj} in the fastest way?

I can guess how to do it with several for loops, but is there a "one (three?) liner"?

Simple example:

indx{1}=[ 1 3 5 7 9];
indx{2}=[ 2 3 4 1];
indx{3}=[ 1 2 5 3 3 5 4];


ans=[1 3];

Answer 1:


Almost no-loop approach (almost, because cellfun essentially loops internally, but its effect here is minimal since we use it only to find the number of elements in each cell):

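lens = cellfun(@numel,indx);                 %// number of elements in each cell

val_ind = bsxfun(@ge,lens,(1:max(lens))');   %// mask of the valid positions in a padded matrix
vals = horzcat(indx{:});                     %// all values in one row (cells must hold row vectors)

mat1 = NaN(max(lens),numel(lens));           %// pad with NaN so that padding never matches a value
mat1(val_ind) = vals;                        %// one column per cell

unqvals = unique(vals);
out = unqvals(all(any(bsxfun(@eq,mat1,permute(unqvals,[1 3 2])),1),2));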
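
To see what the padded matrix looks like, here is a minimal sketch that runs the same steps on the example from the question. It pads with NaN, which never compares equal to any value. The names hits and in_all are introduced here for illustration only.

```matlab
% Example data from the question (every cell holds a row vector)
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

lens = cellfun(@numel, indx);                   % [5 4 7]
val_ind = bsxfun(@ge, lens, (1:max(lens))');    % 7x3 mask, true where a value exists
vals = horzcat(indx{:});                        % all 16 values in one row

mat1 = NaN(max(lens), numel(lens));             % one column per cell, NaN-padded
mat1(val_ind) = vals;
% mat1 =
%      1     2     1
%      3     3     2
%      5     4     5
%      7     1     3
%      9   NaN     3
%    NaN   NaN     5
%    NaN   NaN     4

unqvals = unique(vals);                                  % [1 2 3 4 5 7 9]
hits = bsxfun(@eq, mat1, permute(unqvals, [1 3 2]));     % 7x3x7: rows x cells x candidates
in_all = all(any(hits, 1), 2);                           % candidate found in every column?
out = unqvals(in_all(:).');                              % [1 3]
```

Each page of hits along the third dimension answers the question "where does this candidate value occur?", so any over the rows followed by all over the columns keeps only the candidates present in every cell.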



Answer 2:


One possibility is to use a for loop with intersect:

result = indx{1}; %// will be changed
for n = 2:numel(indx)
    result = intersect(result, indx{n});
end
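
As a quick check, the loop can be run on the example from the question. This sketch relies on the default behaviour of intersect, which returns sorted values without repetitions, so the repeated 3 and 5 in the third cell do not matter.

```matlab
% Example data from the question
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

result = indx{1};                        % start with the first cell
for n = 2:numel(indx)
    % keep only the values that also occur in cell n
    result = intersect(result, indx{n});
end

disp(result)   % displays the values 1 and 3
```

Since the running result can only shrink, one could also leave the loop early once result is empty.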



Answer 3:


Another possibility that I could suggest, though Luis Mendo's answer is very good, is to take all of the vectors in your cell array and remove the duplicates from each one. This can be done with cellfun, specifying unique as the function to apply. You have to set the UniformOutput flag to false, because the output for each cell is itself an array. You also have to be careful: the vectors in the cells are assumed to be all row vectors or all column vectors. You can't mix orientations, or the concatenation in the next step won't work.

Once you do this, concatenate all of the vectors into a single array with cell2mat, then compute a histogram with histc. The bin edges are all of the unique numbers in the concatenated array, which requires one more call to unique on that array. Once you have the histogram, any entry whose bin count equals the total number of cells in your cell array (3 in your case) is a value that appears in all of your cells. As such:

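A = cell2mat(cellfun(@unique, indx, 'UniformOutput', false)); %// de-duplicate each cell, then concatenate
edge_values = unique(A);                                      %// candidate values, used as bin edges
h = histc(A, edge_values);                                    %// number of cells containing each candidate
result = edge_values(h == numel(indx));                       %// keep the values found in every cell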

Because unique is applied to each cell first, every value is counted at most once per cell. A number that appears in every single cell is therefore counted exactly as many times as there are cells.
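
The counting argument can be traced on the example from the question. This is a minimal sketch with the intermediate values written out as comments:

```matlab
% Example data from the question
indx = {[1 3 5 7 9], [2 3 4 1], [1 2 5 3 3 5 4]};

% Remove duplicates inside each cell, then concatenate
A = cell2mat(cellfun(@unique, indx, 'UniformOutput', false));
% A = [1 3 5 7 9   1 2 3 4   1 2 3 4 5]

edge_values = unique(A);          % [1 2 3 4 5 7 9]
h = histc(A, edge_values);        % [3 2 3 2 2 1 1]

result = edge_values(h == numel(indx));   % [1 3]
```

Only 1 and 3 reach a count of 3, the number of cells, so they are the values shared by all cells.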



Source: https://stackoverflow.com/questions/25851305/fastest-way-of-finding-repeated-values-in-different-cell-arrays-of-different-siz
