Given an array of integers and a sequence of query operations of two types:
1. Update the value at index i to x.
2. Given two integers i and j, find the k-th minimum in the range [i, j].
Here is an O(polylog n) per-query solution that does not assume a constant k, so k can vary between queries. The main idea is to use a segment tree in which every node represents an interval of array indices and contains a multiset (balanced binary search tree) of the values in the represented array segment. The update operation is pretty straightforward:
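At each of the O(log n) nodes whose interval contains the updated index, remove the old value from the node's multiset and insert the new one. Each multiset operation costs O(log n), so the complexity is O(log^2 n).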
We notice that every array element will be in O(log n) multisets, so the total space usage is O(n log n). With linear-time merging of multisets we can build the initial segment tree in O(n log n) time as well (there is O(n) work per level).
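The build and update steps can be sketched as follows. This is a minimal illustration, not a tuned implementation: the class name `MultisetSegmentTree` and its fields are made up for the example, and a sorted Python list stands in for the balanced binary search tree, so insertions and deletions here cost O(n) in the worst case rather than O(log n). The array is padded to a power of two so that every node covers a contiguous index interval.

```python
from bisect import bisect_left, insort
from heapq import merge


class MultisetSegmentTree:
    """Segment tree over array indices; each node keeps its values sorted.

    Assumption: a sorted list replaces the balanced BST of the text.
    Lookups are O(log n), but insert/delete shift elements (O(n) worst case).
    """

    def __init__(self, a):
        self.a = list(a)
        self.size = 1
        while self.size < len(self.a):
            self.size *= 2
        # Node v has children 2v and 2v + 1; leaves live at size..2*size-1.
        self.node = [[] for _ in range(2 * self.size)]
        for i, x in enumerate(self.a):
            self.node[self.size + i] = [x]
        # Linear-time merge of the two children, O(n) work per level.
        for v in range(self.size - 1, 0, -1):
            self.node[v] = list(merge(self.node[2 * v], self.node[2 * v + 1]))

    def update(self, i, x):
        """Set a[i] = x by fixing every node on the leaf-to-root path."""
        old = self.a[i]
        v = self.size + i
        while v >= 1:
            s = self.node[v]
            del s[bisect_left(s, old)]  # remove one copy of the old value
            insort(s, x)                # insert the new value, keeping order
            v //= 2
        self.a[i] = x
```

For example, after `t = MultisetSegmentTree([5, 1, 4, 2, 3])` and `t.update(2, 0)`, the root list `t.node[1]` is `[0, 1, 2, 3, 5]`.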
What about queries? We are given a range [i, j] and a rank k and want to find the k-th smallest element in a[i..j]. How do we do that?
1. Decompose the query range with the standard segment tree query procedure. This gives O(log n) disjoint nodes, the union of whose multisets is exactly the multiset of values in the query range. Let's call those multisets s_1, ..., s_m (with m <= 2 ceil(log_2 n) for n >= 2). Finding the s_i takes O(log n) time.
2. Run a selection query for rank k on the union of s_1, ..., s_m. See below.

So how does the selection algorithm work? There is one really simple algorithm to do this.
We have s_1, ..., s_m and k given and want to find the largest x in a such that s_1.rank(x) + ... + s_m.rank(x) <= k - 1, where rank returns the number of elements smaller than x in the respective BBST (this can be implemented in O(log n) if we store subtree sizes). This x is the answer: the k-th smallest element of the range has at most k - 1 smaller elements in the range, while every larger value has at least k.

Let's just use binary search to find x! We walk through the BBST of the root, do a couple of rank queries and check whether their sum is at most k - 1. It's a predicate monotone in x, so binary search works.
Complexity: O(n log n) preprocessing and O(log^3 n) per query.
So in total we get a runtime of O(n log n + q log^3 n) for q queries. I'm sure we could get it down to O(q log^2 n) with a cleverer selection algorithm.
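A rough sketch of the query, under a few assumptions: the function names `build`, `cover` and `kth_smallest` are invented for this example, sorted lists again stand in for the BBSTs (so `bisect_left` plays the role of rank), and k is 1-based with 1 <= k <= j - i + 1. The selection step binary-searches the sorted root list for the largest value that has at most k - 1 smaller elements in the range, a formulation that also handles duplicate values.

```python
from bisect import bisect_left
from heapq import merge


def build(a):
    """Return (node, size): node[v] is the sorted list of values under v."""
    size = 1
    while size < len(a):
        size *= 2
    node = [[] for _ in range(2 * size)]
    for i, x in enumerate(a):
        node[size + i] = [x]
    for v in range(size - 1, 0, -1):
        node[v] = list(merge(node[2 * v], node[2 * v + 1]))
    return node, size


def cover(node, size, i, j):
    """Return the O(log n) disjoint node lists that exactly cover a[i..j]."""
    parts = []
    lo, hi = i + size, j + size + 1  # half-open leaf range [lo, hi)
    while lo < hi:
        if lo & 1:
            parts.append(node[lo])
            lo += 1
        if hi & 1:
            hi -= 1
            parts.append(node[hi])
        lo //= 2
        hi //= 2
    return parts


def kth_smallest(node, size, i, j, k):
    """k-th smallest (1-based) value in a[i..j], both ends inclusive."""
    parts = cover(node, size, i, j)
    root = node[1]  # all values of the array, sorted
    lo, hi = 0, len(root) - 1
    # Find the largest index whose value has at most k - 1 smaller
    # elements in the range; the predicate is monotone along root.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        smaller = sum(bisect_left(s, root[mid]) for s in parts)
        if smaller <= k - 1:
            lo = mid
        else:
            hi = mid - 1
    return root[lo]
```

Each binary search step performs O(log n) rank queries of O(log n) each, and there are O(log n) steps, which matches the O(log^3 n) bound per query.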
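UPDATE: If we are looking for an offline algorithm that can process all queries at once, we can get O((n + q) * log n * log (q + n)) using the following algorithm:
1. Collect every value that ever occurs in the array, including the values written by updates. There are at most q + n of them.
2. Build a segment tree over these values instead of over the array indices. Every node covers a range of values and stores the set of array positions that currently hold a value in that range.
3. To answer a query, start at the root and count how many positions stored in the left child lie in the query range. Let that number be m. If k <= m, recurse into the left child. Otherwise recurse into the right child, with k decremented by m.
4. To perform an update, remove the position from the O(log (q + n)) nodes that cover the old value and insert it into the nodes that cover the new value.

The advantage of this approach is that we don't need subtree sizes, so we can implement this with most standard library implementations of balanced binary search trees (e.g. set in C++).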
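The offline variant might look roughly like this. The class name `ValueTree` and the `future_values` parameter (the values that later updates will write, known in advance because the queries are processed offline) are assumptions made for the example. Position sets are kept as sorted lists, and the number of positions inside [i, j] is taken as the difference of two binary search indices.

```python
from bisect import bisect_left, bisect_right, insort


class ValueTree:
    """Segment tree over the sorted distinct values; nodes store positions."""

    def __init__(self, a, future_values=()):
        self.a = list(a)
        # All values that ever occur: initial ones plus those from updates.
        self.vals = sorted(set(self.a) | set(future_values))
        self.size = 1
        while self.size < len(self.vals):
            self.size *= 2
        self.pos = [[] for _ in range(2 * self.size)]
        for p, x in enumerate(self.a):
            for v in self._path(x):
                self.pos[v].append(p)  # p increases, so lists stay sorted

    def _path(self, x):
        """Yield the nodes, leaf to root, whose value range contains x."""
        v = self.size + bisect_left(self.vals, x)
        while v >= 1:
            yield v
            v //= 2

    def update(self, p, x):
        """Set a[p] = x; x must be one of the values collected up front."""
        for v in self._path(self.a[p]):
            s = self.pos[v]
            del s[bisect_left(s, p)]
        for v in self._path(x):
            insort(self.pos[v], p)
        self.a[p] = x

    def kth_smallest(self, i, j, k):
        """k-th smallest (1-based) value among a[i..j], ends inclusive."""
        v = 1
        while v < self.size:
            left = self.pos[2 * v]
            # m = number of positions in the left child inside [i, j]
            m = bisect_right(left, j) - bisect_left(left, i)
            if k <= m:
                v = 2 * v
            else:
                k -= m
                v = 2 * v + 1
        return self.vals[v - self.size]
```

With `t = ValueTree([5, 1, 4, 2, 2], future_values=[7])`, the call `t.kth_smallest(0, 4, 3)` returns 2, and after `t.update(1, 7)` the call `t.kth_smallest(0, 2, 3)` returns 7.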
We can turn this into an online algorithm by replacing the segment tree with a weight-balanced tree such as a BB[α] tree. It has logarithmic operations like other balanced binary search trees, but allows us to rebuild an entire subtree from scratch when it becomes unbalanced, charging the rebuilding cost to the operations that must have caused the imbalance.