A range intersection algorithm better than O(n)?

悲&欢浪女 2020-12-04 18:27

Range intersection is a simple but non-trivial problem.

It has been answered twice already:

  • Find number range intersection
  • Comparing date ra
9 Answers
  • 2020-12-04 19:10

    Non Overlapping Ranges:

    Prep O(n log n):

    1. Make an array / vector of the ranges.
    2. Sort the vector by the end of the range (break ties by sorting by the start of the range)

    Search:

    1. Use binary search to find the first range with an End value >= TestRange.Start.
    2. Iterate forward from that position until you find a range with Start > TestRange.End:

      2a. If the current range is within the TestRange, add it to your result.
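
A minimal Python sketch of the prep and search steps above, assuming closed `(start, end)` tuples and stored ranges that do not overlap each other. The names `build_index` and `query` are made up for illustration. It reports every stored range that overlaps the test range; a stricter containment check could be added inside the loop if that is what "within" should mean.

```python
from bisect import bisect_left


def build_index(ranges):
    """Prep, O(n log n): sort (start, end) pairs by end, ties by start."""
    ordered = sorted(ranges, key=lambda r: (r[1], r[0]))
    ends = [end for _, end in ordered]  # parallel list for binary search
    return ordered, ends


def query(index, test_start, test_end):
    """Return the stored ranges that overlap [test_start, test_end]."""
    ordered, ends = index
    # Step 1: first range whose end is >= test_start.
    i = bisect_left(ends, test_start)
    hits = []
    # Step 2: walk forward until a range starts after test_end.
    # This early stop is only valid because the stored ranges do not
    # overlap, so sorting by end also sorts them by start.
    while i < len(ordered) and ordered[i][0] <= test_end:
        hits.append(ordered[i])
        i += 1
    return hits


index = build_index([(10, 12), (1, 3), (15, 20), (5, 8)])
print(query(index, 4, 11))
```

The search costs O(log n + m) for m reported ranges, but only under the non-overlapping assumption. With overlapping ranges the early stop in step 2 can miss results.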

  • 2020-12-04 19:13

    The standard approach is to use an interval tree.

    In computer science, an interval tree is a tree data structure to hold intervals. Specifically, it allows one to efficiently find all intervals that overlap with any given interval or point. It is often used for windowing queries, for instance, to find all roads on a computerized map inside a rectangular viewport, or to find all visible elements inside a three-dimensional scene. A similar data structure is the segment tree.

    The trivial solution is to visit each interval and test whether it intersects the given point or interval, which requires O(n) time, where n is the number of intervals in the collection. Since a query may return all intervals, for example if the query is a large interval intersecting all intervals in the collection, this is asymptotically optimal; however, we can do better by considering output-sensitive algorithms, where the runtime is expressed in terms of m, the number of intervals produced by the query. Interval trees have a query time of O(log n + m) and an initial creation time of O(n log n), while limiting memory consumption to O(n). After creation, interval trees may be dynamic, allowing efficient insertion and deletion of an interval in O(log n). If the endpoints of intervals are within a small integer range (e.g., in the range [1,...,O(n)]), faster data structures exist[1] with preprocessing time O(n) and query time O(1+m) for reporting m intervals containing a given query point.
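
To make the quoted description concrete, here is a rough Python sketch of a static centered interval tree over closed `(start, end)` tuples. The class and method names (`_Node`, `IntervalIndex`, `containing`, `overlapping`) are my own, not from any library. The range query is assembled from a point query at the query start plus a binary search over a list sorted by start, which is one common way to get O(log n + m). This simple build re-sorts at every level, so it is closer to O(n log² n) than the O(n log n) quoted above, and it does not support insertion or deletion.

```python
from bisect import bisect_right


class _Node:
    """One node of a centered interval tree (closed intervals)."""

    def __init__(self, intervals):
        endpoints = sorted(p for iv in intervals for p in iv)
        self.center = endpoints[len(endpoints) // 2]  # median endpoint
        here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        self.by_start = sorted(here)  # ascending start
        self.by_end = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = _Node(left) if left else None
        self.right = _Node(right) if right else None

    def stab(self, x, out):
        """Append every interval containing the point x to out."""
        if x < self.center:
            # Intervals here end at or after center, so only start matters.
            for iv in self.by_start:
                if iv[0] > x:
                    break
                out.append(iv)
            child = self.left
        elif x > self.center:
            # Intervals here start at or before center, so only end matters.
            for iv in self.by_end:
                if iv[1] < x:
                    break
                out.append(iv)
            child = self.right
        else:
            out.extend(self.by_start)
            child = None
        if child is not None:
            child.stab(x, out)


class IntervalIndex:
    def __init__(self, intervals):
        intervals = list(intervals)
        self.root = _Node(intervals) if intervals else None
        self.sorted_by_start = sorted(intervals)
        self.starts = [s for s, _ in self.sorted_by_start]

    def containing(self, x):
        """All intervals that contain the point x."""
        out = []
        if self.root is not None:
            self.root.stab(x, out)
        return out

    def overlapping(self, q_start, q_end):
        """All intervals that overlap [q_start, q_end]."""
        # Intervals starting at or before q_start overlap iff they contain it.
        out = self.containing(q_start)
        # Intervals starting in (q_start, q_end] always overlap.
        lo = bisect_right(self.starts, q_start)
        hi = bisect_right(self.starts, q_end)
        out.extend(self.sorted_by_start[lo:hi])
        return out
```

For example, `IntervalIndex([(1, 10), (5, 8), (15, 20)]).overlapping(9, 16)` should report `(1, 10)` and `(15, 20)` without looking at `(5, 8)` more than once.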

  • 2020-12-04 19:16

    Sounds like you need a class that implements the SortedSet interface. TreeSet is the implementation that ships with the core API.

    Have one set holding the ranges sorted by lowest value, and one sorted by highest value.

    You can then implement the equivalent of the database algorithm using the in-memory sets.

    As for whether this is actually faster than O(n), I couldn't say.
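
A rough sketch of the two-sorted-sets idea, written in Python with sorted lists and `bisect` standing in for Java's `TreeSet` (roughly what `headSet` and `tailSet` views would give you). I am assuming the "database algorithm" means filtering on `low <= query.high AND high >= query.low`. The names `build` and `overlapping` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build(ranges):
    """Keep the ranges twice: once sorted by low value, once by high value."""
    by_low = sorted(ranges, key=lambda r: r[0])
    by_high = sorted(ranges, key=lambda r: r[1])
    lows = [r[0] for r in by_low]
    highs = [r[1] for r in by_high]
    return by_low, lows, by_high, highs


def overlapping(index, q_low, q_high):
    """Return the ranges with low <= q_high and high >= q_low."""
    by_low, lows, by_high, highs = index
    n_early = bisect_right(lows, q_high)     # ranges with low <= q_high
    first_late = bisect_left(highs, q_low)   # first range with high >= q_low
    # Scan whichever candidate list is shorter and check the other condition.
    if n_early <= len(by_high) - first_late:
        return [r for r in by_low[:n_early] if r[1] >= q_low]
    return [r for r in by_high[first_late:] if r[0] <= q_high]
```

Each index only narrows one side of the condition, so the scan can still touch most of the ranges when the query sits near the middle. In the worst case this remains O(n).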
