Given a sequence such as S = {1,8,2,1,4,1,2,9,1,8,4}, I need to find the minimal-length subsequence that contains all elements of S (no duplicates, order does n
I've got an O(N*M) algorithm, where N is the length of S and M is the number of distinct elements. It tends to work better for small values of M; if there are very few duplicates, M approaches N and the cost can become quadratic. Edit: it seems that, in fact, it's much closer to O(N) in practice. You get O(N*M) only in worst-case scenarios.
Start by going through the sequence and recording all the distinct elements of S. Let's call this set E.
We're going to work with a dynamic subsequence of S. Create an empty map M that associates each element with the number of times it is present in the subsequence.
For example, if subSequence = {1,8,2,1,4} and E = {1, 2, 4, 8, 9}:
M[9]==0
M[2]==M[4]==M[8]==1
M[1]==2
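As a quick illustration of this map, here is a minimal Python sketch. Using `collections.Counter` for M is my own choice, not something the algorithm requires; the names `sub_sequence` and `counts` are just for illustration.

```python
from collections import Counter

sub_sequence = [1, 8, 2, 1, 4]
E = {1, 2, 4, 8, 9}

# Count how often each value occurs in the current subsequence.
# A Counter returns 0 for keys it has never seen, which matches M[9]==0.
M = Counter(sub_sequence)

# Look up every element of E, including the ones absent from the subsequence.
counts = {e: M[e] for e in sorted(E)}
print(counts)  # {1: 2, 2: 1, 4: 1, 8: 1, 9: 0}
```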
You'll need two indexes, each pointing to an element of S. One of them is called L because it is at the left end of the subsequence delimited by the two indexes. The other one is called R, as it marks the right end of the subsequence.
Begin by initializing L=0, R=0 and M[S[0]]++.
The algorithm is (the two inner loops alternate until R can no longer advance):
Repeat
{
    While(M does not contain all the elements of E)
    {
        if(R is the last index of S)
            stop
        R++
        M[S[R]]++
    }
    While(M contains all the elements of E)
    {
        if(the subsequence S[L->R] is the shortest one seen so far)
            Record it
        M[S[L]]--
        L++
    }
}
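The two-pointer procedure can be sketched in Python roughly as follows. This is only a sketch under a few assumptions of mine: the function name `shortest_window` is hypothetical, the result is returned as inclusive indices (L, R), the grow and shrink phases are repeated until R reaches the end of S, and the "contains all elements" test is the naive O(M) scan.

```python
from collections import Counter

def shortest_window(S):
    """Return inclusive indices (L, R) of a shortest slice of S that
    contains every distinct element of S, or None if S is empty."""
    if not S:
        return None
    E = set(S)     # all distinct elements
    M = Counter()  # occurrences of each element inside S[L..R]

    def covers():
        # Naive check: scans all of E, hence the O(N*M) worst case.
        return all(M[e] > 0 for e in E)

    L = R = 0
    M[S[0]] += 1
    best = None
    while True:
        # Grow the window to the right until it contains everything.
        while not covers():
            if R == len(S) - 1:
                return best  # cannot grow any further
            R += 1
            M[S[R]] += 1
        # Shrink from the left while the window is still complete.
        while covers():
            if best is None or R - L < best[1] - best[0]:
                best = (L, R)
            M[S[L]] -= 1
            L += 1

S = [1, 8, 2, 1, 4, 1, 2, 9, 1, 8, 4]
L, R = shortest_window(S)
print(S[L:R + 1])  # [2, 9, 1, 8, 4]
```

On ties the sketch keeps the first shortest window it encounters, since it only replaces the recorded window when a strictly shorter one is found.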
To check whether M contains all the elements of E, you can keep a vector of booleans V, where V[i]==true if M[E[i]]>0 and V[i]==false if M[E[i]]==0. Begin by setting all the values of V to false. Each time you do M[S[R]]++, set V of that element to true; each time you do M[S[L]]-- and M[S[L]]==0 afterwards, set V of that element to false.
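The bookkeeping for V might look like the Python sketch below. The class name `Coverage` and its methods are hypothetical. The `present` counter is an addition of mine, not part of the description above: it tracks how many entries of V are true, so the completeness test does not have to scan V.

```python
from collections import Counter

class Coverage:
    """Tracks which elements of E occur in the current window."""

    def __init__(self, E):
        self.M = Counter()              # occurrences inside the window
        self.V = {e: False for e in E}  # V[e] is True iff M[e] > 0
        self.present = 0                # assumption: number of True entries in V

    def add(self, x):
        # Corresponds to M[S[R]]++ (and to the initial M[S[0]]++).
        self.M[x] += 1
        if not self.V[x]:
            self.V[x] = True
            self.present += 1

    def remove(self, x):
        # Corresponds to M[S[L]]--.
        self.M[x] -= 1
        if self.M[x] == 0:
            self.V[x] = False
            self.present -= 1

    def complete(self):
        # True when every element of E is in the window.
        return self.present == len(self.V)

cov = Coverage({1, 2, 4, 8, 9})
for x in [1, 8, 2, 1, 4]:
    cov.add(x)
print(cov.complete())  # False, 9 is still missing
cov.add(9)
print(cov.complete())  # True
```

With such a counter, each step of L or R costs constant time, which may be why the running time looks close to O(N) in practice.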