a faster way to achieve what intersect() is giving me?

匿名 (未验证) 提交于 2019-12-03 03:06:01

问题:

I am finding that a lot of time spent in my matlab function is in this code:

intersect(freq_bins, our_bins); 

Both can be rather large vectors, and are comprised of only integers. I just need to know which integers are in both. This is truly the primitive purpose of intersect(), so I suspect that the answer is: it doesn't get any better. But maybe someone has some suggestions.

回答1:

intersect calls ismember. In your case, you don't need all the complicated checks that intersect does, so you can save some overhead and call ismember directly (note: I made sure to call both functions before timing them):

a = randi(1000,100,1); b = randi(1000,100,1);  >> tic,intersect(a,b),toc ans =     76    338    490    548    550    801    914    930 Elapsed time is 0.027104 seconds.  >> tic,a(ismember(a,b)),toc ans =    914    801    490    548    930    550     76    338 Elapsed time is 0.000613 seconds. 

You can make this even faster by calling ismembc, the function that does the actual testing, directly. Note that ismembc requires sorted arrays (so you can drop the sort if your input is sorted already!)

tic,a=sort(a);b=sort(b);a(ismembc(a,b)),toc ans =     76    338    490    548    550    801    914    930 Elapsed time is 0.000473 seconds. 


回答2:

If you can assume that your inputs contain sorted lists of unique integers, then you can do this in linear time with a very simple algorithm:

function c = intersect_sorted(a,b)   ia = 1;   na = length(a);   ib = 1;   nb = length(b);   ic = 0;   cn = min(na,nb);   c = zeros(1,cn);    while (ia <= na && ib <= nb)     if (a(ia) > b(ib))       ib = ib + 1;     elseif a(ia) < b(ib)       ia = ia + 1;     else % a(ia) == b(ib)       ic = ic + 1;       c(ic) = a(ia);       ib = ib + 1;       ia = ia + 1;     end   end   c = c(1:ic); end 

The max runtime for lists of length n and m will be O(n+m).

>>a = unique(randi(1000,100,1)); >>b = unique(randi(1000,100,1)); >>tic;for i = 1:10000, intersect(a,b); end,toc Elapsed time is 1.224514 seconds. >> tic;for i = 1:10000, intersect_sorted(a,b); end,toc Elapsed time is 0.289075 seconds. 


标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!