The trick is to think of the items as a set instead of a list. This lets you identify items that are at the start or end of a contiguous range, because a hash set lets you check in expected constant time whether item-1 or item+1 is present. With that, you can solve the problem in linear time and space.
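As a small illustration of those two checks, here is a sketch that lists which values start a range and which end one. The names RangeEdgesSketch, Starts and Ends are made up for this example, and the OrderBy is only there to make the output readable; it is not part of the linear-time idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RangeEdgesSketch
{
    // A value starts a contiguous range when its predecessor is absent.
    // Assumes no value equals int.MinValue, so x - 1 cannot wrap around.
    public static List<int> Starts(IEnumerable<int> items)
    {
        var set = new HashSet<int>(items);
        return set.Where(x => !set.Contains(x - 1))
                  .OrderBy(x => x) // sorting is for display only
                  .ToList();
    }

    // A value ends a contiguous range when its successor is absent.
    // Assumes no value equals int.MaxValue, so x + 1 cannot wrap around.
    public static List<int> Ends(IEnumerable<int> items)
    {
        var set = new HashSet<int>(items);
        return set.Where(x => !set.Contains(x + 1))
                  .OrderBy(x => x) // sorting is for display only
                  .ToList();
    }
}
```

For the input {3,6,4,1,8,5}, Starts gives 1, 3, 8 and Ends gives 1, 6, 8, which pair up into the ranges (1, 1), (3, 6) and (8, 8).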
Pseudo-Code:
- Enumerate the items in the set, looking for ones that are at the start of a range (x starts a range when x-1 is not in the set).
- For each value that is the start of a range, scan upwards until you find the corresponding end of range value (x ends a range when x+1 is not in the set). This gives you all the relevant contiguous ranges.
- Return the contiguous range whose end is farthest from its start.
C# Code:
    static Tuple<int, int> FindLargestContiguousRange(this IEnumerable<int> items) {
        var itemSet = new HashSet<int>(items);

        // find contiguous ranges by identifying their starts and scanning for ends
        var ranges = from item in itemSet

                     // is the item at the start of a contiguous range?
                     where !itemSet.Contains(item - 1)

                     // find the end by scanning upward as long as we stay in the set
                     // (caveat: Enumerable.Range throws if item + itemSet.Count - 1 exceeds int.MaxValue)
                     let end = Enumerable.Range(item, itemSet.Count)
                               .TakeWhile(itemSet.Contains)
                               .Last()

                     // represent the contiguous range as a tuple
                     select Tuple.Create(item, end);

        // return the widest contiguous range that was found
        return ranges.MaxBy(e => e.Item2 - e.Item1);
    }
Note: MaxBy is from MoreLinq. .NET 6 and later also ship a built-in Enumerable.MaxBy.
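If you would rather not depend on MoreLinq, the same idea can be written as a plain loop. This is only a sketch: the name FindLargestContiguousRangeLoop and the decision to return null for empty input are assumptions made for this example, and the timings in the Testing section were not measured with it. It also guards the int.MinValue and int.MaxValue edges, which the LINQ version does not.

```csharp
using System;
using System.Collections.Generic;

static class ContiguousRangeSketch
{
    // Hypothetical loop-based variant; returns null when items is empty.
    public static (int Start, int End)? FindLargestContiguousRangeLoop(IEnumerable<int> items)
    {
        var itemSet = new HashSet<int>(items);
        (int Start, int End)? best = null;
        long bestWidth = -1;

        foreach (var start in itemSet)
        {
            // Skip values that are not the start of a range.
            // The int.MinValue check avoids wrapping around on start - 1.
            if (start != int.MinValue && itemSet.Contains(start - 1))
                continue;

            // Scan upward while the next value is present, stopping at int.MaxValue.
            var end = start;
            while (end != int.MaxValue && itemSet.Contains(end + 1))
                end++;

            // Compute the width as a long so very wide ranges cannot overflow.
            long width = (long)end - start;
            if (width > bestWidth)
            {
                bestWidth = width;
                best = (start, end);
            }
        }

        return best;
    }
}
```

On the input {3,6,4,1,8,5} this returns (3, 6). When several ranges tie for widest, which one is returned depends on the set's enumeration order.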
Testing
Small sanity check:
    new[] {3,6,4,1,8,5}.FindLargestContiguousRange().Dump();
    // prints (3, 6)
Big contiguous list:
    var zeroToTenMillion = Enumerable.Range(0, (int)Math.Pow(10, 7) + 1);
    zeroToTenMillion.FindLargestContiguousRange().Dump();
    // prints (0, 10000000) after ~1 second
Big fragmented list:
    var tenMillionEvens = Enumerable.Range(0, (int)Math.Pow(10, 7)).Select(e => e * 2);
    var evensWithAFewOdds = tenMillionEvens.Concat(new[] {501, 503, 505});
    evensWithAFewOdds.FindLargestContiguousRange().Dump();
    // prints (500, 506) after ~3 seconds
Complexity
This algorithm requires O(N) time and O(N) space, where N is the number of items in the list, assuming the set operations are constant time.
Note that if the set were given as an input, instead of being built by the algorithm, we would only need O(1) additional space.
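(Some comments say this is quadratic time. I think they assumed that every item triggers a scan, instead of just the items at the starts of ranges. That would indeed be quadratic, if the algorithm worked that way. But scans begin only at range starts, and the ranges are disjoint, so each item is visited by at most one scan.)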
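One way to convince yourself of the linear bound is to count the set lookups directly. The sketch below uses a plain-loop version of the start-then-scan procedure (not the LINQ code above) and a made-up helper name, CountLookups. It assumes the values stay away from int.MinValue and int.MaxValue, so item - 1 and end + 1 cannot wrap around.

```csharp
using System;
using System.Collections.Generic;

static class LookupCountSketch
{
    // Hypothetical helper: runs the start-then-scan procedure and returns
    // how many set membership tests it performed.
    // Assumes no value is near int.MinValue or int.MaxValue.
    public static long CountLookups(IEnumerable<int> items)
    {
        var itemSet = new HashSet<int>(items);
        long lookups = 0;

        foreach (var item in itemSet)
        {
            lookups++; // the "is item - 1 present?" test, once per item
            if (itemSet.Contains(item - 1))
                continue; // not a range start, so no scan is triggered

            var end = item;
            while (true)
            {
                lookups++; // the "is end + 1 present?" test
                if (!itemSet.Contains(end + 1))
                    break;
                end++;
            }
        }

        return lookups;
    }
}
```

Under those assumptions a range of length k costs k scan lookups (k - 1 hits and one miss), so the scans total N lookups across all ranges and the start checks add another N. For {3,6,4,1,8,5} that is 12 lookups, and for Enumerable.Range(0, 10) it is 20: twice the number of distinct items in both the fragmented and the fully contiguous case.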