Algorithm to find the smallest snippet from searching a document?

太阳男子 2021-01-30 07:57

I've been going through Skiena's excellent "The Algorithm Design Manual" and got hung up on one of the exercises.

The question is: "Given a search string of three w

7 answers
  •  轻奢々 (original poster)
    2021-01-30 08:05

    The other answers are alright, but like me, if you're having trouble understanding the question in the first place, those aren't really helpful. Let's rephrase the question:

    Given three sets of integers (call them A, B, and C), find the minimum contiguous range that contains one element from each set.

    There is some confusion about what the three sets are. The 2nd edition of the book states them as {1, 4, 5}, {4, 9, 10}, and {5, 6, 15}. However, another version, given in a comment above, is {1, 4, 5}, {3, 9, 10}, and {2, 6, 15}. Version 1 places two different words at positions 4 and 5, which isn't possible unless one word is a suffix/prefix of another, so let's go with the second one.
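To make the objection to version 1 concrete, here is a small Python sketch that lists the positions claimed by more than one word. The helper name `shared_positions` is made up for this illustration and is not from the book.

```python
from itertools import combinations

def shared_positions(*position_sets):
    """Return the positions that appear in more than one set."""
    shared = set()
    for a, b in combinations(position_sets, 2):
        shared |= set(a) & set(b)
    return shared

version_1 = ({1, 4, 5}, {4, 9, 10}, {5, 6, 15})
version_2 = ({1, 4, 5}, {3, 9, 10}, {2, 6, 15})

print(sorted(shared_positions(*version_1)))  # [4, 5]
print(sorted(shared_positions(*version_2)))  # []
```

Version 1 has two words sitting at position 4 and two at position 5, while version 2 has no clashes.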

    Since a picture is worth a thousand words, let's plot the points:

    Simply inspecting the above visually, we can see that there are two answers to this question: [1,3] and [2,4], both of size 3 (three points in each range).
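If you would rather not trust the eyeball test, a brute-force check over all 27 triples gives the same two ranges. This is only a sanity check on the example, not the intended algorithm; the function name is an assumption of this sketch.

```python
from itertools import product

def smallest_ranges_brute_force(a, b, c):
    """Try every triple (one element per set) and keep the tightest ranges."""
    best_size = None
    best_ranges = set()
    for triple in product(a, b, c):
        lo, hi = min(triple), max(triple)
        size = hi - lo + 1  # number of positions covered, inclusive
        if best_size is None or size < best_size:
            best_size, best_ranges = size, {(lo, hi)}
        elif size == best_size:
            best_ranges.add((lo, hi))
    return best_size, sorted(best_ranges)

print(smallest_ranges_brute_force([1, 4, 5], [3, 9, 10], [2, 6, 15]))
# (3, [(1, 3), (2, 4)])
```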

    Now, the algorithm. The idea is to start with the range spanned by the first (smallest) element of each array, and incrementally try to shrink it by moving the left boundary inwards. We will use zero-based indexing and assume each array is sorted in ascending order.

    MIN-RANGE(A, B, C)
      i = j = k = 0
      minSize = +∞
    
      while i, j, and k are valid indices of their respective arrays, do
        ans = (A[i], B[j], C[k])
        size = max(ans) - min(ans) + 1
        minSize = min(size, minSize)
        x = argmin(ans)
        increment the x-th index (i, j, or k) by 1
      done
    
      return minSize
    

    where argmin is the index of the smallest element in ans.
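For anyone who wants to run it, here is one possible Python rendering of the pseudocode above. It is a sketch under two assumptions that I am adding: each list is already sorted in ascending order, and an empty list yields infinity because no valid range exists.

```python
def min_range(a, b, c):
    """Size of the smallest range holding one element from each sorted list."""
    lists = (a, b, c)
    idx = [0, 0, 0]  # plays the role of i, j, k in the pseudocode
    min_size = float("inf")
    while all(idx[n] < len(lists[n]) for n in range(3)):
        current = [lists[n][idx[n]] for n in range(3)]
        min_size = min(min_size, max(current) - min(current) + 1)
        # argmin: advance the pointer that sits on the smallest value
        smallest = current.index(min(current))
        idx[smallest] += 1
    return min_size

print(min_range([1, 4, 5], [3, 9, 10], [2, 6, 15]))  # 3
```

Advancing the pointer at the smallest value is the only move that can shrink the range: moving either of the other two can only keep the maximum the same or push it further right.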

    +---+---+---+---+--------------------+------+---------+
    | n | i | j | k | (A[i], B[j], C[k]) | size | minSize |
    +---+---+---+---+--------------------+------+---------+
    | 1 | 0 | 0 | 0 | (1, 3, 2)          | 3    | 3       |
    +---+---+---+---+--------------------+------+---------+
    | 2 | 1 | 0 | 0 | (4, 3, 2)          | 3    | 3       |
    +---+---+---+---+--------------------+------+---------+
    | 3 | 1 | 0 | 1 | (4, 3, 6)          | 4    | 3       |
    +---+---+---+---+--------------------+------+---------+
    | 4 | 1 | 1 | 1 | (4, 9, 6)          | 6    | 3       |
    +---+---+---+---+--------------------+------+---------+
    | 5 | 2 | 1 | 1 | (5, 9, 6)          | 5    | 3       |
    +---+---+---+---+--------------------+------+---------+
    | 6 | 3 | 1 | 1 | (i out of range)   |      |         |
    +---+---+---+---+--------------------+------+---------+
    

    n = iteration, size = size of the range at that iteration, minSize = smallest size seen so far

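    At each step, exactly one of the three indices is incremented and none ever moves back, so the algorithm is guaranteed to terminate. In the worst case, i, j, and k are incremented in turn until all three reach their last element, and the loop runs |A| + |B| + |C| - 2 times. Each iteration does constant work, so the running time is O(n), where n is the total number of elements (9 in this case, so at most 7 iterations). For the given example, it terminates after 5 iterations.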
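To see the iteration counts for yourself, here is a small Python sketch that runs the same loop but only counts passes. Because each pass advances exactly one index and indices never move back, the count cannot exceed len(a) + len(b) + len(c) - 2. The second input is a worst-case example I made up, where the three pointers advance in round-robin order.

```python
def count_iterations(a, b, c):
    """Count how many times the MIN-RANGE loop body runs on sorted lists."""
    lists = (a, b, c)
    idx = [0, 0, 0]
    steps = 0
    while all(i < len(lst) for i, lst in zip(idx, lists)):
        current = [lst[i] for i, lst in zip(idx, lists)]
        idx[current.index(min(current))] += 1  # advance the smallest pointer
        steps += 1
    return steps

print(count_iterations([1, 4, 5], [3, 9, 10], [2, 6, 15]))  # 5
print(count_iterations([1, 4, 7], [2, 5, 8], [3, 6, 9]))    # 7
```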
