I have a two strings:
string1 - hello how are you
,
String2 - olo
(including space character)
Output: lo ho
( hello
I would calculate positions of characters from string2
within string1
and then pick the permutation with minimum distance between lowest and highest character position:
# positions are:
# 01234567890123456
string1 = 'hello how are you'
string2 = 'olo'
# get string1 positions for each character from set(string2)
positions = {'o': [4, 7, 15],
'l': [2, 3]}
# get all permutations of positions (don't repeat the same element)
# then pick the permutation with minimum distance between min and max position
# (obviously, this part can be optimized, this is just an illustration)
permutations = positions['o'] * positions['l'] * positions['o']
permutations = [[4,2,7], [4,3,7], [4,2,15], ...]
the_permutation = [4,3,7]
# voilà
output = string1_without_spaces[3:7]
There is this algorithm which does this in O(N).
Idea: Have 2 arrays, viz. isRequired[256] and isFound[256] which tells the frequency of each character in S and while parsing the string S, the frequency of each character that has found yet. Also, keep a counter which tells when a valid window is found. Once a valid window is found, we can shift the window (towards right) maintaining the given invariant of the question.
Program in C++:
void findMinWindow(const char *text, const char *pattern, int &start, int &end){
//Calcuate lengths of text and pattern
int textLen = strlen(text);
int patternLen = strlen(pattern);
// Declare 2 arrays which keep tab of required & found frequency of each char in pattern
int isRequired[256] ; //Assuming the character set is in ASCII
int isFound[256];
int count = 0; //For ascertaining whether a valid window is found
// Keep a tab of minimum window
int minimumWindow = INT_MAX;
//Prepare the isRequired[] array by parsing the pattern
for(int i=0;i<patternLen;i++){
isRequired[pattern[i]]++;
}
//Let's start parsing the text now
// Have 2 pointers: i and j - both starting at 0
int i=0;
int j=0;
//Keep moving j forward, keep i fixed till we get a valid window
for(c=j;c<textLen;c++){
//Check if the character read appears in pattern or not
if(isRequired[text[c]] == 0){
//This character does not appear in the pattern; skip this
continue;
}
//We have this character in the pattern, lets increment isFound for this char
isFound[text[c]]++;
//increment the count if this character satisfies the invariant
if(isFound[text[c]] <= isRequired[text[c]]){
count++;
}
//Did we find a valid window yet?
if(count == patternLen){
//A valid window is found..lets see if we can do better from here on
//better means: increasing i to reduce window length while maintaining invariant
while(isRequired[s[i]] == 0 || isFound[s[i]] > isRequired[s[i]]){
//Either of the above 2 conditions means we should increment i; however we
// must decrease isFound for this char as well.
//Hence do a check again
if(isFound[s[i]] > isRequired[s[i]]){
isFound[s[i]]--;
}
i++;
}
// Note that after the while loop, the invariant is still maintained
// Lets check if we did better
int winLength = j-i+1;
if(winLength < minimumWindow){
//update the references we got
begin = i;
end = j;
//Update new minimum window lenght
minimumWindow = winLength;
}
} //End of if(count == patternLen)
} //End of for loop
}
Keep two pointer l
and r
, and a hash table M = character -> count
for characters in string2 that do not occur in s[l..r]
.
Initially set l = 0
and r
so that string1[l..r]
contains all the characters of string2
(if possible). You do that by removing characters from M until it is empty.
Then proceed by incrementing r
by one in each step and then incrementing l
as much as possible while still keeping M empty. The minimum over all r - l + 1
(the length of the substring s[l..r]
) is the solution.
Pythonish pseudocode:
n = len(string1)
M = {} # let's say M is empty if it contains no positive values
for c in string2:
M[c]++
l = 0
r = -1
while r + 1 < n and M not empty:
r++
M[string1[r]]--
if M not empty:
return "no solution"
answer_l, answer_r = l, r
while True:
while M[string1[l]] < 0:
M[string1[l]]++
l++
if r - l + 1 < answer_r - anwer_l + 1:
answer_l, answer_r = l, r
r++
if r == n:
break
M[string1[r]]--
return s[answer_l..answer_r]
The "is empty" checks can be implemented in O(1) if you maintain the number of positive entries when performing the increment and decrement operations.
Let n
be the length of string1
and m
be the length of string2
.
Note that l
and r
are only ever incremented, so there are at most O(n) increments and thus at most O(n) instructions are executed in the last outer loop.
If M
is implemented as an array (I assume the alphabet is constant size), you get runtime
O(n + m), which is optimal. If the alphabet is too large, you can use a hash table to get expected O(n + m).
Example execution:
string1 = "abbabcdbcb"
string2 = "cbb"
# after first loop
M = { 'a': 0, 'b': 2, 'c': 1, 'd': 0 }
# after second loop
l = 0
r = 5
M = { 'a': -2, 'b': -1, 'c': 0, 'd': 0 }
# increment l as much as possible:
l = 2
r = 5
M = { 'a': -1, 'b': 0, 'c': 0, 'd': 0 }
# increment r by one and then l as much as possible
l = 2
r = 6
M = { 'a': -1, 'b': 0, 'c': 0, 'd': -1 }
# increment r by one and then l as much as possible
l = 4
r = 7
M = { 'a': 0, 'b': 0, 'c': 0, 'd': -1 }
# increment r by one and then l as much as possible
l = 4
r = 8
M = { 'a': 0, 'b': 0, 'c': -1, 'd': -1 }
# increment r by one and then l as much as possible
l = 7
r = 9
M = { 'a': 0, 'b': 0, 'c': 0, 'd': 0 }
The best solution is s[7..9].
This is a example of implementation with JavaScript. The logic is similar as @Aprillion wrote above.
DEMO : http://jsfiddle.net/ZB6vm/4/
var s1 = "hello how are you";
var s2 = "olo";
var left, right;
var min_distance;
var answer = "";
// make permutation recursively
function permutate(ar, arrs, k) {
// check if the end of recursive call
if (k == arrs.length) {
var r = Math.max.apply(Math, ar);
var l = Math.min.apply(Math, ar);
var dist = r - l + 1;
if (dist <= min_distance) {
min_distance = dist;
left = l;
right = r;
}
return;
}
for (var i in arrs[k]) {
var v = arrs[k][i];
if ($.inArray(v, ar) < 0) {
var ar2 = ar.slice();
ar2.push(v);
// recursive call
permutate(ar2, arrs, k + 1);
}
}
}
function solve() {
var ar = []; // 1-demension array to store character position
var arrs = []; // 2-demension array to store character position
for (var i = 0; i < s2.length; i++) {
arrs[i] = [];
var c = s2.charAt(i);
for (var k = 0; k < s1.length; k++) { // loop by s1
if (s1.charAt(k) == c) {
if ($.inArray(k, arrs[i]) < 0) {
arrs[i].push(k); // save position found
}
}
}
}
// call permutate
permutate(ar, arrs, 0);
answer = s1.substring(left, right + 1);
alert(answer);
}
solve();
Hope this helps.