The common interview problem of determining the missing value in a range from 1 to N has been done a thousand times over. Variations include 2 missing values up to K missing val
I already answered it HERE
You can also create an array of booleans of size last_element_in_the_existing_array + 1.
In one for loop, mark as true every element that is present in the existing array.
In another for loop, print the indices of the elements that still contain false, i.e. the missing ones.
Time Complexity: O(last_element_in_the_existing_array)
Space Complexity: O(last_element_in_the_existing_array), since the boolean array has that many entries
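The two loops described above could look roughly like this in Python. This is only a sketch: the function name `find_missing` is made up for illustration, it assumes the values are positive integers starting at 1, and it uses `max()` in place of "the last element" so the input does not have to be sorted. Note that it cannot report values missing beyond the largest element present.

```python
def find_missing(existing):
    """Return the values in 1..max(existing) that do not occur in `existing`."""
    if not existing:
        return []
    last = max(existing)  # stands in for "last element" of a sorted array
    # Index 0 is unused because the range is assumed to start at 1.
    present = [False] * (last + 1)
    # First loop: mark every value that is present.
    for value in existing:
        present[value] = True
    # Second loop: every index still False is a missing value.
    return [i for i in range(1, last + 1) if not present[i]]


print(find_missing([1, 2, 4, 7]))
```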
Whether the given solution is theoretically better than the sorting one depends on N and K. While your solution has complexity of O(N*log(N)), the given solution is O(N*K). I think that the given solution is, like the sorting solution, able to solve any range [A, B] just by transforming the range [A, B] to [1, N].
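To make the range transformation concrete, here is a small sketch. Both function names are hypothetical, and `missing_in_1_to_n` is just a simple stand-in for whichever [1, N] solver you prefer, not the solution being discussed. The idea is to subtract A - 1 from every value, solve the [1, N] problem with N = B - A + 1, and add the offset back to the results.

```python
def missing_in_1_to_n(values, n):
    """Stand-in solver for the range [1, n]; any other solver could be used."""
    present = set(values)
    return [x for x in range(1, n + 1) if x not in present]


def missing_in_range(values, a, b):
    """Find the values missing from [a, b] by mapping the range onto [1, n]."""
    offset = a - 1      # a maps to 1, b maps to n
    n = b - a + 1
    shifted = [v - offset for v in values]
    # Solve in [1, n], then shift the answers back into [a, b].
    return [m + offset for m in missing_in_1_to_n(shifted, n)]


print(missing_in_range([10, 12, 13, 15], 10, 15))
```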
If there are N elements in total, where each number x satisfies 1 <= x <= N, then we can solve this in O(nlogn) time complexity and O(1) space complexity.
Time complexity is O(nlogn)+O(n), which simplifies to O(nlogn).
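A sort-then-scan version might look like the sketch below (the name `missing_by_sorting` is made up for illustration). The sort is the O(nlogn) part and the single pass afterwards is the O(n) part. One caveat: Python's built-in `list.sort()` may use extra memory internally, so the O(1) space figure assumes an in-place sort such as heapsort, and the output list is not counted.

```python
def missing_by_sorting(values, n):
    """Return the numbers in 1..n that are absent from `values` (sorted in place)."""
    values.sort()       # O(n log n)
    missing = []
    expected = 1        # next number we expect to see
    for v in values:    # O(n) scan over the sorted values
        # Everything between `expected` and `v` was skipped, so it is missing.
        while expected < v:
            missing.append(expected)
            expected += 1
        if v == expected:
            expected += 1
        # v < expected means v is a duplicate; ignore it.
    # Anything after the largest value up to n is missing as well.
    missing.extend(range(expected, n + 1))
    return missing


print(missing_by_sorting([3, 1, 6, 2], 6))
```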
What about this?
What's left in your set are the missing numbers.
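The code this answer referred to is not shown here, so the following is only one plausible reading of the set-based idea: start with a set holding every number from 1 to N, remove each number you see in the input, and whatever is left is missing. The name `missing_by_set` is an assumption for illustration.

```python
def missing_by_set(values, n):
    """Return the numbers in 1..n that never appear in `values`."""
    remaining = set(range(1, n + 1))
    for v in values:
        # discard() does not raise if v was already removed (duplicates)
        # or was never in the range.
        remaining.discard(v)
    # Sorted only to make the output deterministic.
    return sorted(remaining)


print(missing_by_set([1, 2, 4, 6, 6], 7))
```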
Because the numbers are taken from a small, finite range, they can be 'sorted' in linear time.
All we do is initialize an array of 100 booleans, and for each input, set the boolean corresponding to each number in the input, and then step through reporting the unset booleans.
My question is that seeing as the [...] cases converge at roughly something larger than O(nlogn) [...]
In 2011 (after you posted this question) Caf posted a simple answer that solves the problem in O(n) time and O(k) space [where the array size is n - k].
Importantly, unlike in other solutions, Caf's answer has no hidden memory requirements (using bit arrays, adding numbers to elements, or multiplying elements by -1 would all require O(log(n)) space).
Note: The question here (and the original question) didn't ask about the streaming version of the problem, and the answer here doesn't handle that case.
Regarding the other answers: I agree that many of the proposed "solutions" to this problem have dubious complexity claims, and if their time complexities aren't better in some way than either O(n) time and space, or O(n*log(n)) time and O(1) space, then you may as well just solve the problem by sorting.
However, we can get better complexities (and more importantly, genuinely faster solutions):