Algorithm to find the smallest snippet from searching a document?


太阳男子 2021-01-30 07:57

I've been going through Skiena's excellent "The Algorithm Design Manual" and got hung up on one of the exercises.

The question is: "Given a search string of three w

7 answers

轻奢々 (original poster)

2021-01-30 08:05
The other answers are fine, but if, like me, you had trouble understanding the question in the first place, they aren't much help. Let's rephrase the question:

Given three sets of integers (call them A, B, and C), find the smallest contiguous range that contains at least one element from each set.

There is some confusion about what the three sets are. The 2nd edition of the book gives them as {1, 4, 5}, {4, 9, 10}, and {5, 6, 15}. However, a comment above gives another version: {1, 4, 5}, {3, 9, 10}, and {2, 6, 15}. In version 1, position 4 appears in two sets and so does position 5, which is only possible if one word is a suffix or prefix of another. Assuming that is not the case, let's go with the second version.

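Since a picture is worth a thousand words, let's plot the points:

[Figure: the nine positions on a number line, labelled by set: 1 (A), 2 (C), 3 (B), 4 (A), 5 (A), 6 (C), 9 (B), 10 (B), 15 (C)]

Inspecting the plot, we can see that there are two answers to this question: [1,3] and [2,4], both of size 3 (three points in each range).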
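The visual inspection can be double-checked with a brute-force search over every triple. This is only a sketch for verification, not the algorithm discussed here; the function name `smallest_ranges_bruteforce` is made up for illustration, and size is taken to be `max - min + 1`.

```python
from itertools import product


def smallest_ranges_bruteforce(a, b, c):
    """Try every (x, y, z) triple and collect all ranges of minimum size."""
    best_size = None
    best_ranges = set()
    for triple in product(a, b, c):
        lo, hi = min(triple), max(triple)
        size = hi - lo + 1  # number of positions spanned, inclusive
        if best_size is None or size < best_size:
            best_size, best_ranges = size, {(lo, hi)}
        elif size == best_size:
            best_ranges.add((lo, hi))
    return best_size, sorted(best_ranges)


A, B, C = [1, 4, 5], [3, 9, 10], [2, 6, 15]
print(smallest_ranges_bruteforce(A, B, C))  # (3, [(1, 3), (2, 4)])
```

It examines all |A| * |B| * |C| triples, so it is only useful as a reference for small inputs.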

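Now, the algorithm. Assume each array is sorted in ascending order. The idea is to start with the range spanned by the first element of each array, and then repeatedly move the left boundary inwards by advancing the index that points at the smallest element; advancing either of the other two indices could never make the current range smaller. We will use zero-based indexing.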
```
MIN-RANGE(A, B, C)
  i = j = k = 0
  minSize = +∞

  while i < |A| and j < |B| and k < |C|, do
    ans = (A[i], B[j], C[k])
    size = max(ans) - min(ans) + 1
    minSize = min(size, minSize)
    x = argmin(ans)
    increment the x-th of (i, j, k) by 1
  done

  return minSize
```
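where argmin(ans) is the position of the smallest element in ans (0 for A[i], 1 for B[j], 2 for C[k]); the corresponding index i, j, or k is the one that gets incremented.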
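A direct Python translation of the pseudocode might look like the following. It is a minimal sketch that assumes each list is sorted in ascending order; the name `min_range` and the use of a small index list in place of `i`, `j`, `k` are my own choices.

```python
def min_range(a, b, c):
    """Size of the smallest range covering one element of each sorted list."""
    lists = (a, b, c)
    idx = [0, 0, 0]  # plays the role of i, j, k
    min_size = float("inf")
    # Stop as soon as any list has been exhausted.
    while all(idx[m] < len(lists[m]) for m in range(3)):
        ans = [lists[m][idx[m]] for m in range(3)]
        min_size = min(min_size, max(ans) - min(ans) + 1)
        x = ans.index(min(ans))  # argmin: which list holds the smallest value
        idx[x] += 1              # advance that list's index
    return min_size


print(min_range([1, 4, 5], [3, 9, 10], [2, 6, 15]))  # 3
```

If any list is empty the loop never runs and the function returns infinity, which mirrors the pseudocode's initial value of minSize.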
```
+---+---+---+---+--------------------+------+---------+
| n | i | j | k | (A[i], B[j], C[k]) | size | minSize |
+---+---+---+---+--------------------+------+---------+
| 1 | 0 | 0 | 0 | (1, 3, 2)          | 3    | 3       |
+---+---+---+---+--------------------+------+---------+
| 2 | 1 | 0 | 0 | (4, 3, 2)          | 3    | 3       |
+---+---+---+---+--------------------+------+---------+
| 3 | 1 | 0 | 1 | (4, 3, 6)          | 4    | 3       |
+---+---+---+---+--------------------+------+---------+
| 4 | 1 | 1 | 1 | (4, 9, 6)          | 6    | 3       |
+---+---+---+---+--------------------+------+---------+
| 5 | 2 | 1 | 1 | (5, 9, 6)          | 5    | 3       |
+---+---+---+---+--------------------+------+---------+
| 6 | 3 | 1 | 1 | (i out of range)   |      |         |
+---+---+---+---+--------------------+------+---------+
```
n = iteration, size = size of the current range, minSize = smallest size seen so far
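The walkthrough can be reproduced mechanically. The sketch below records, for each iteration, the indices, the current triple, the size of the current range, and the running minimum as separate values; `trace_min_range` is a hypothetical helper name and the lists are assumed sorted.

```python
def trace_min_range(a, b, c):
    """Return one row (n, i, j, k, triple, size, min_size) per loop iteration."""
    lists = (a, b, c)
    idx = [0, 0, 0]
    min_size = float("inf")
    rows = []
    n = 0
    while all(idx[m] < len(lists[m]) for m in range(3)):
        n += 1
        ans = tuple(lists[m][idx[m]] for m in range(3))
        size = max(ans) - min(ans) + 1
        min_size = min(min_size, size)
        rows.append((n, *idx, ans, size, min_size))
        idx[ans.index(min(ans))] += 1  # advance the index of the smallest element
    return rows


for row in trace_min_range([1, 4, 5], [3, 9, 10], [2, 6, 15]):
    print(row)
```

For the example sets this prints five rows; the running minimum stays at 3 throughout, while the size of the current range goes 3, 3, 4, 6, 5.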

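At each step, exactly one of the three indices is incremented, so the algorithm is guaranteed to terminate. The loop stops as soon as one index runs off the end of its array, so it executes at most |A| + |B| + |C| - 2 iterations (7 for three arrays of three elements each). Each iteration does a constant amount of work, so the running time is O(n), where n is the total number of positions. For the given example, it terminates after 5 iterations.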
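The iteration count is easy to check empirically. Since every pass advances exactly one index and the loop ends once any index leaves its list, the number of passes cannot exceed len(A) + len(B) + len(C) - 2. The sketch below counts passes; `count_iterations` and the interleaved test lists are made up for illustration.

```python
def count_iterations(a, b, c):
    """Count how many times the main loop body runs for sorted lists a, b, c."""
    lists = (a, b, c)
    idx = [0, 0, 0]
    steps = 0
    while all(idx[m] < len(lists[m]) for m in range(3)):
        steps += 1
        ans = [lists[m][idx[m]] for m in range(3)]
        idx[ans.index(min(ans))] += 1
    return steps


print(count_iterations([1, 4, 5], [3, 9, 10], [2, 6, 15]))  # 5
print(count_iterations([1, 4, 7], [2, 5, 8], [3, 6, 9]))    # 7
```

The second call uses fully interleaved lists, which force the indices to advance in round-robin order and reach the bound of 3 + 3 + 3 - 2 = 7.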