Find number range intersection

别那么骄傲 2020-12-10 05:43

What is the best way to find out whether two number ranges intersect?

My number range is 3023-7430, now I want to test which of the following number

5 Answers
  • 2020-12-10 06:20

    If you're using Java, the Range class in Apache Commons Lang has an overlapsRange(Range range) method. (In Commons Lang 3 the equivalent method is Range.isOverlappedBy(Range).)

  • 2020-12-10 06:22

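    In Python:

    class nrange(object):
        def __init__(self, lower=None, upper=None):
            self.lower = lower
            self.upper = upper

        def intersection(self, aRange):
            # Bounds are inclusive: the ranges are disjoint only if one
            # ends strictly before the other begins.
            if self.upper < aRange.lower or aRange.upper < self.lower:
                return None
            # Otherwise the overlap runs from the larger lower bound
            # to the smaller upper bound.
            return nrange(max(self.lower, aRange.lower),
                          min(self.upper, aRange.upper))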
    
  • 2020-12-10 06:24

    I would make a Range class and give it a method boolean intersects(Range). Then you can do:

    for (Range r : rangeset) { if (range.intersects(r)) res.add(r); }
    

    Or, if you use some Java 8 style functional programming for clarity:

    rangeset.stream().filter(range::intersects).collect(Collectors.toSet());
    

    The intersection test itself (for inclusive bounds) is something like

    this.start <= other.end && this.end >= other.start
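
For comparison, here is a minimal Python sketch of the same overlap test applied to the range from the question. The `Range` tuple, the `intersecting` helper and the candidate ranges are all made up for illustration (the question's own list of ranges is not shown), and the bounds are assumed to be inclusive.

```python
from typing import List, NamedTuple


class Range(NamedTuple):
    """Closed integer range [start, end]; an illustrative type, not a library class."""
    start: int
    end: int

    def intersects(self, other: "Range") -> bool:
        # Two closed ranges overlap exactly when each one starts
        # no later than the other one ends.
        return self.start <= other.end and self.end >= other.start


def intersecting(target: Range, candidates: List[Range]) -> List[Range]:
    """Return the candidates that overlap target, preserving their order."""
    return [r for r in candidates if target.intersects(r)]


target = Range(3023, 7430)  # the range from the question

# Hypothetical candidates chosen to exercise the boundary cases.
candidates = [
    Range(100, 3022),   # ends just below the target
    Range(3000, 3023),  # touches the lower bound
    Range(5000, 6000),  # fully inside
    Range(7430, 9000),  # touches the upper bound
    Range(7431, 8000),  # starts just above the target
]

hits = intersecting(target, candidates)
```

With inclusive bounds, ranges that merely touch an endpoint count as intersecting; switch `<=` and `>=` to `<` and `>` if the bounds should be exclusive.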
    
  • 2020-12-10 06:28

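    Just a rough guess, written as Java:

    // Needs: import java.util.HashSet; import java.util.Set;
    // Assumes a Range type with numeric start and end fields (inclusive bounds).
    Set<Range> determineIntersectedRanges(Range range, Set<Range> setOfRangesToTest)
    {
      Set<Range> results = new HashSet<>();
      for (Range rangeToTest : setOfRangesToTest)
      {
        if (rangeToTest.end < range.start) continue; // skip this one, it's below our range
        if (rangeToTest.start > range.end) continue; // skip this one, it's above our range
        results.add(rangeToTest);                    // everything else overlaps our range
      }
      return results;
    }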
    
  • 2020-12-10 06:32

    This heavily depends on your ranges. A range can be big or small, and clustered or not clustered. If you have large, clustered ranges (think of "all positive 32-bit integers that are divisible by 2"), the simple approach with Range(lower, upper) will not succeed.

    I guess I can say the following: if you have small ranges (clustered or not does not matter here), consider bit vectors. These little critters are blazing fast with respect to union, intersection and membership testing, even though iteration over all elements might take a while, depending on the size. Furthermore, because they use just a single bit for each element, they are pretty small, unless you throw huge ranges at them.
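
As a rough illustration of the bit vector idea, the following sketch uses a plain Python int as the bit vector, with bit n set when n is in the set. The helper names `to_bits` and `from_bits` are made up for this example, and it assumes small non-negative integers.

```python
def to_bits(values):
    """Pack non-negative integers into a Python int used as a bit vector."""
    bits = 0
    for v in values:
        bits |= 1 << v  # set bit v
    return bits


def from_bits(bits):
    """List the members of the bit vector in ascending order."""
    members = []
    position = 0
    while bits:
        if bits & 1:
            members.append(position)
        bits >>= 1
        position += 1
    return members


a = to_bits(range(3, 9))    # the set {3, ..., 8}
b = to_bits(range(6, 12))   # the set {6, ..., 11}

intersection = a & b              # one bitwise AND
union = a | b                     # one bitwise OR
contains_7 = bool((a >> 7) & 1)   # membership test for 7 in a
```

Note that iterating in `from_bits` walks every bit position up to the largest member, which is the "might take a while" part mentioned above.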

    If you have fewer, larger ranges, then a Range class as described by others will suffice. This class has the attributes lower and upper, and the overlap test for a and b is basically a.lower <= b.upper and b.lower <= a.upper. Union and intersection can be implemented in constant time for single ranges; for composite ranges, the time grows with the number of sub-ranges (thus you do not want too many small ranges).
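
To make the composite case concrete, here is a small sketch that intersects two composite ranges in time proportional to the total number of sub-ranges. It assumes each composite range is a sorted list of disjoint, inclusive (lower, upper) tuples; the function name `intersect_range_lists` is made up for this example.

```python
def intersect_range_lists(a, b):
    """Intersect two sorted lists of disjoint closed ranges (lower, upper)."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Overlap of the two current sub-ranges, if there is one.
        lower = max(a[i][0], b[j][0])
        upper = min(a[i][1], b[j][1])
        if lower <= upper:
            result.append((lower, upper))
        # Advance whichever sub-range ends first; the other one
        # may still overlap the next sub-range on the opposite side.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


first = [(1, 5), (10, 20), (30, 40)]
second = [(4, 12), (18, 35)]
overlap = intersect_range_lists(first, second)
```

Each loop iteration advances one of the two indices, so the work grows with the number of sub-ranges rather than with the size of the numbers involved.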

    If you have a huge space where your numbers can be, and the ranges are distributed in a nasty fashion, you should take a look at binary decision diagrams (BDDs). These nifty diagrams have two terminal nodes, True and False, and decision nodes for each bit of the input. A decision node has a bit it looks at and two following graph nodes -- one for "bit is one" and one for "bit is zero". Given these conditions, you can encode large ranges in tiny space. All even positive integers, for arbitrarily large numbers, can be encoded in 3 nodes in the graph -- basically a single decision node for the least significant bit which goes to False on 1 and to True on 0.

    Intersection and union are pretty elegant recursive algorithms. For example, intersection basically takes two corresponding nodes, one in each BDD, traverses the 1-edge until some result pops up, and checks: if one of the results is the False terminal, create a 1-branch to the False terminal in the result BDD. If both are the True terminal, create a 1-branch to the True terminal in the result BDD. If it is something else, create a 1-branch to this something else in the result BDD. The 0-edge is handled the same way. After that, some minimization kicks in (if the 0-branch and the 1-branch of a node go to the same following BDD or terminal, remove the node and pull the incoming transitions to the target) and you are golden. We even went further than that: we worked on simulating addition of sets of integers on BDDs in order to enhance value prediction for optimizing conditions.
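
The recursive intersection can be sketched in a few lines of Python. This is a toy version under stated assumptions, not a real BDD library: terminals are the Python values True and False, a decision node is a tuple (bit, low, high) where low is followed when the bit is 0, higher bits are tested first, and bits that no node tests are "don't care". The names `mk`, `intersect` and `contains` are made up for this example.

```python
from functools import lru_cache


def mk(bit, low, high):
    """Build a decision node, dropping it when both branches agree (minimization)."""
    return low if low == high else (bit, low, high)


@lru_cache(maxsize=None)  # memoize so shared sub-diagrams are processed once
def intersect(a, b):
    """Return the BDD of the intersection of the sets encoded by a and b."""
    # Terminal cases: False absorbs everything, True is neutral.
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    # Both are decision nodes: split on the highest bit either one tests.
    top = max(a[0], b[0])
    a0, a1 = (a[1], a[2]) if a[0] == top else (a, a)
    b0, b1 = (b[1], b[2]) if b[0] == top else (b, b)
    return mk(top, intersect(a0, b0), intersect(a1, b1))


def contains(node, n):
    """Membership test: follow one branch per decision node."""
    while isinstance(node, tuple):
        bit, low, high = node
        node = high if (n >> bit) & 1 else low
    return node


# Even numbers: one decision node on bit 0, True on 0 and False on 1.
evens = mk(0, True, False)
# Numbers whose bit 3 is clear (within 4 bits: 0..7).
small = mk(3, True, False)

both = intersect(evens, small)
members = [n for n in range(16) if contains(both, n)]
```

Because nodes are plain tuples, structurally equal sub-diagrams compare equal, which stands in for the unique table a real implementation would keep.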

    These considerations imply that the depth of your operations is bounded by the number of bits in your number range, that is, by log_2(MAX_NUMBER). Just think of it: as long as the diagrams stay small, you can intersect sets of 64-bit integers in almost constant time.

    More information can be found, for example, on Wikipedia and in the papers referenced there.

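    Further, if false positives are bearable and you need an existence check only, you can look at Bloom filters. A Bloom filter uses a bit vector and several hash functions to check whether an element is contained in the represented set. The union of two filters with the same size and hash functions is a bitwise OR of their bit vectors, so its cost does not depend on the number of stored elements; a bitwise AND gives an approximation of the intersection. The major problem here is that you get an increasing false-positive rate if you fill up the Bloom filter too much. More information, again, can be found on Wikipedia, for example.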
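
A minimal Bloom filter sketch, to show the moving parts. The class name, the default sizes and the way hash positions are derived from SHA-256 with a seed prefix are all choices made for this example, not a standard scheme.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a Python int used as a bit vector."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def union(self, other):
        # Only meaningful when both filters share size and hash scheme.
        if (self.size_bits, self.num_hashes) != (other.size_bits, other.num_hashes):
            raise ValueError("filters must share size_bits and num_hashes")
        merged = BloomFilter(self.size_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


low = BloomFilter()
for n in range(3023, 3050):
    low.add(n)

high = BloomFilter()
for n in range(7400, 7431):
    high.add(n)

combined = low.union(high)
```

Every added element is reported as possibly present (there are no false negatives), while an element that was never added may still be reported, and more often so as the filter fills up.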

    Ah, set representation is a fun field. :)
