Algorithm to find the minimum length of substring having all characters of other string

前端 未结 4 932
我寻月下人不归
我寻月下人不归 2021-02-04 18:15

I have a two strings:
string1 - hello how are you,
String2 - olo (including space character)

Output: lo ho ( hello

4条回答
  •  栀梦
    栀梦 (楼主)
    2021-02-04 19:12

    Keep two pointer l and r, and a hash table M = character -> count for characters in string2 that do not occur in s[l..r].

    Initially set l = 0 and r so that string1[l..r] contains all the characters of string2 (if possible). You do that by removing characters from M until it is empty.

    Then proceed by incrementing r by one in each step and then incrementing l as much as possible while still keeping M empty. The minimum over all r - l + 1 (the length of the substring s[l..r]) is the solution.

    Pythonish pseudocode:

    n = len(string1)
    M = {}   # let's say M is empty if it contains no positive values
    for c in string2:
        M[c]++
    l = 0
    r = -1
    while r + 1 < n and M not empty:
        r++
        M[string1[r]]--
    if M not empty: 
        return "no solution"
    answer_l, answer_r = l, r
    while True:
        while M[string1[l]] < 0:
            M[string1[l]]++
            l++
        if r - l + 1 < answer_r - anwer_l + 1:
            answer_l, answer_r = l, r
        r++
        if r == n:
            break
        M[string1[r]]--
    return s[answer_l..answer_r]
    

    The "is empty" checks can be implemented in O(1) if you maintain the number of positive entries when performing the increment and decrement operations.

    Let n be the length of string1 and m be the length of string2.

    Note that l and r are only ever incremented, so there are at most O(n) increments and thus at most O(n) instructions are executed in the last outer loop.

    If M is implemented as an array (I assume the alphabet is constant size), you get runtime O(n + m), which is optimal. If the alphabet is too large, you can use a hash table to get expected O(n + m).

    Example execution:

    string1 = "abbabcdbcb"
    string2 = "cbb"
    
    # after first loop
    M = { 'a': 0, 'b': 2, 'c': 1, 'd': 0 }
    
    # after second loop
    l = 0
    r = 5
    M = { 'a': -2, 'b': -1, 'c': 0, 'd': 0 }
    
    # increment l as much as possible:
    l = 2
    r = 5
    M = { 'a': -1, 'b': 0, 'c': 0, 'd': 0 }
    
    # increment r by one and then l as much as possible
    l = 2
    r = 6
    M = { 'a': -1, 'b': 0, 'c': 0, 'd': -1 }
    
    # increment r by one and then l as much as possible
    l = 4
    r = 7
    M = { 'a': 0, 'b': 0, 'c': 0, 'd': -1 }
    
    # increment r by one and then l as much as possible
    l = 4
    r = 8
    M = { 'a': 0, 'b': 0, 'c': -1, 'd': -1 }
    
    # increment r by one and then l as much as possible
    l = 7
    r = 9
    M = { 'a': 0, 'b': 0, 'c': 0, 'd': 0 }
    

    The best solution is s[7..9].

提交回复
热议问题