Imagine you have a very long sequence. What is the most efficient way of finding the intervals where the sequence is all zeros (or more precisely the sequence drops to near-
I think the most MATLAB-like ("vectorized") way of doing it is to convolve your signal with a filter like [-1 1]. Take a look at the documentation for the function conv. Then use find on the output of conv to get the relevant indices.
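A minimal sketch of that idea, assuming a row vector and a tolerance of 0.1 (the sample data, the tolerance, and the variable names are all made up for illustration). Padding the near-zero mask with zeros and convolving with [-1 1] gives -1 where a zero run starts and +1 just past where it ends:

```matlab
sig = [1 1 0 0 0 0 1 1 0 1];          % sample data (assumed)
tol = 0.1;                            % "near zero" tolerance (assumed)
isZero = double(abs(sig) < tol);      % 1 where the signal is near zero

% 'valid' keeps only the fully overlapping part, so edges equals
% -diff([0 isZero 0]) and has numel(sig)+1 entries.
edges = conv([0 isZero 0], [-1 1], 'valid');

startIdx = find(edges < 0);           % first index of each zero run
endIdx   = find(edges > 0) - 1;       % last index of each zero run
```

Here the runs come out as 3..6 and 9..9; filtering by run length is then a matter of comparing endIdx-startIdx+1 against a minimum.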
These are the steps I would take to solve your problem in a vectorized way, starting with a given vector sig:
First, threshold the vector to get a vector tsig of zeros and ones (zeros where the absolute value of the signal drops close enough to zero, ones elsewhere):
tsig = (abs(sig) >= eps); %# Using eps as the threshold
Next, find the starting indices, ending indices, and duration of each string of zeroes using the functions DIFF and FIND:
dsig = diff([1 tsig 1]);
startIndex = find(dsig < 0);
endIndex = find(dsig > 0)-1;
duration = endIndex-startIndex+1;
Then, find the strings of zeroes with a duration greater than or equal to some value (such as 3, from your example):
stringIndex = (duration >= 3);
startIndex = startIndex(stringIndex);
endIndex = endIndex(stringIndex);
Finally, use the method from my answer to the linked question to generate your final set of indices:
indices = zeros(1,max(endIndex)+1);
indices(startIndex) = 1;
indices(endIndex+1) = indices(endIndex+1)-1;
indices = find(cumsum(indices));
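As a quick sanity check, here are the steps above collected into one script and run on a sample vector. The vector and the minimum run length of 3 are just assumptions for illustration:

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
minLen = 3;                              % minimum run length (assumed)

tsig = (abs(sig) >= eps);                % 0 where the signal is (near) zero
dsig = diff([1 tsig 1]);                 % pad with ones so edge runs are caught
startIndex = find(dsig < 0);             % first index of each zero run
endIndex   = find(dsig > 0) - 1;         % last index of each zero run
duration   = endIndex - startIndex + 1;

keep = (duration >= minLen);             % drop the short runs
startIndex = startIndex(keep);
endIndex   = endIndex(keep);

indices = zeros(1, max(endIndex) + 1);   % note: assumes at least one run is kept
indices(startIndex) = 1;
indices(endIndex+1) = indices(endIndex+1) - 1;
indices = find(cumsum(indices));
```

For this input the kept runs are 3..6 and 14..16, so indices lists those seven positions.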
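function indice=sigvec(sig,thresh)
%SIGVEC indices of sig that lie in runs of at least thresh zeros
%assumes sig is a non-negative row vector (e.g. already thresholded),
%so a window sums to 0 only when every entry in it is 0
%pad head and tail with 1 so partial windows at the edges never sum to 0
exsig=[1,sig,1];
%moving sum over windows of length thresh
cvexsig=conv(exsig,ones(1,thresh));
%mark the windows that are entirely zero
tempsig=double(cvexsig==0);
%spread each mark back over its window, then shift to positions in sig
indice=find(conv(tempsig,ones(1,thresh)))-thresh;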
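To see what the two convolutions are doing, here is the same sequence of steps traced inline on a short vector. The input and thresh = 3 are made up for illustration, and the input is assumed to be non-negative:

```matlab
sig = [1 0 0 0 1];                     % sample data (assumed)
thresh = 3;                            % minimum run length (assumed)

exsig   = [1, sig, 1];                 % padded signal: [1 1 0 0 0 1 1]
cvexsig = conv(exsig, ones(1,thresh)); % moving sums: [1 2 2 1 0 1 2 2 1]
tempsig = double(cvexsig == 0);        % single 1 at position 5 (the all-zero window)
spread  = conv(tempsig, ones(1,thresh)); % that 1 widened to positions 5..7
indice  = find(spread) - thresh;       % shifted back to positions in sig
```

The only all-zero window of length 3 covers sig(2:4), which is what indice ends up holding.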
The above answer by gnovice can be modified to find the start and end indices of the runs of non-zero elements in a vector:
tsig = (abs(sig) >= eps);
dsig = diff([0 tsig 0]);
startIndex = find(dsig > 0);
endIndex = find(dsig < 0)-1;
duration = endIndex-startIndex+1;
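A small usage sketch for the non-zero variant, on a made-up vector. Picking out the longest run at the end is an extra step added here for illustration, not part of the answer:

```matlab
sig = [0 2 3 0 0 5 0];                 % sample data (assumed)

tsig = (abs(sig) >= eps);              % 1 where the signal is non-zero
dsig = diff([0 tsig 0]);               % pad with zeros so edge runs are caught
startIndex = find(dsig > 0);           % first index of each non-zero run
endIndex   = find(dsig < 0) - 1;       % last index of each non-zero run
duration   = endIndex - startIndex + 1;

[~, k] = max(duration);                % position of the longest run
longestRun = sig(startIndex(k):endIndex(k));
```

Here the runs are 2..3 and 6..6, so longestRun picks out sig(2:3).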
As gnovice showed, we'll do a threshold test to make "near zero" really zero:
logcl = abs(sig(:)) >= zero_tolerance;
Then find regions where the cumulative sum isn't increasing (the leading 0 makes entry i of islands refer to the window that starts at index i of sig):
cs = [0; cumsum(logcl)];
islands = cs(1+thresh:end) == cs(1:end-thresh);
Remembering gnovice's great method for filling in ranges of indexes
v = zeros(1,max(endInd)+1); %# An array of zeroes
v(startInd) = 1; %# Place 1 at the starts of the intervals
v(endInd+1) = v(endInd+1)-1; %# Add -1 one index after the ends of the intervals
indices = find(cumsum(v)); %# Perform a cumulative sum and find the nonzero entries
We note that our islands vector already has ones in the startInd locations, and for our purposes endInd+1 always comes thresh spots later (longer runs have runs of ones in islands)
endcap = zeros(thresh,1);
indices = find(cumsum([islands ; endcap] - [endcap ; islands]))
For example, with thresh = 3:
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
thresh = 3;
logcl = abs(sig(:)) >= .1;
cs = [0; cumsum(logcl)];
islands = cs(1+thresh:end) == cs(1:end-thresh);
endcap = zeros(thresh,1);
indices = find(cumsum([islands ; endcap] - [endcap ; islands]))
indices = 3 4 5 6 14 15 16
You can solve this as a string search task, by finding strings of zeros of length thresh (the STRFIND function is very fast):
startIndex = strfind(sig, zeros(1,thresh));
Note that longer runs will get matched at multiple overlapping locations, but these are joined once we fill in the locations of each interval, from startIndex to startIndex+thresh-1:
indices = unique( bsxfun(@plus, startIndex', 0:thresh-1) )';
Note that you can always swap this last step with the CUMSUM/FIND solution by @gnovice from the linked question.
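Here is a sketch of that swap on a sample vector, with two assumptions that go beyond the answer: the signal is thresholded first (tolerance 0.1) so near-zero values count too, and the mask is converted to a character string so that strfind is called on text, which as far as I know is the input type its documentation describes. Variable names other than startIndex, thresh, and indices are hypothetical:

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
thresh = 3;                                 % minimum run length (assumed)
tol = 0.1;                                  % "near zero" tolerance (assumed)

mask = char('0' + (abs(sig) < tol));        % '1' where near zero, '0' elsewhere
startIndex = strfind(mask, repmat('1', 1, thresh));  % overlapping matches

% CUMSUM/FIND fill: +1 at each match start, -1 just past each match end.
v = zeros(1, numel(sig) + 1);
v(startIndex) = 1;
v(startIndex + thresh) = v(startIndex + thresh) - 1;
indices = find(cumsum(v));

% Same result via the BSXFUN/UNIQUE step, for comparison.
indices2 = unique(bsxfun(@plus, startIndex', 0:thresh-1))';
```

Overlapping matches just stack up in the cumulative sum, so anything covered by at least one match stays nonzero and gets picked up by find.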