Counting according to query

后端未结

关注

 4  655

Given an array of N positive elements. Lets suppose we list all N × (N+1) / 2 non-empty continuous subarrays of the array A and then replaced all the subarrays with the maxi

Creating the segment tree: O(n)

In practice, what it does is to take an array, let's say:

A = [5,1,7,2,3,7,3,1]

And construct an array-backed tree that looks like this:

In the tree, the first number is the value and the second is the index where it appears in the array. Each node is the maximum of its two children. This tree is backed by an array (pretty much like a heap tree) where the children of the index i are in the indexes i*2+1 and i*2+2.

Then, for each element, it becomes easy to find the nearest greater elements (before and after each element).

To find the nearest greater element to the left, we go up in the tree searching for the first parent where the left node has value greater and the index lesser than the argument. The answer must be a child of this parent, then we go down in the tree looking for the rightmost node that satisfies the same condition.

Similarly, to find the nearest greater element to the right, we do the same, but looking for a right node with an index greater than the argument. And when going down, we look for the leftmost node that satisfies the condition.

Creating the cumulative frequency array: O(n log n)

From this structure, we can compute the frequency array, that tells how many times each element appears as maximum in the subarray list. We just have to count how many lesser elements are on the left and on the right of each element and multiply those values. For the example array ([1, 2, 3]), this would be:

[(1, 1), (2, 2), (3, 3)]

This means that 1 appears only once as maximum, 2 appears twice, etc.

But we need to answer range queries, so it's better to have a cumulative version of this array, that would look like:

[(1, 1), (2, 3), (3, 6)]

The (3, 6) means, for example, that there are 6 subarrays with maxima less than or equal to 3.

Answering `q` queries: O(q log n)

Then, to answer each query, you just have to make binary searches to find the value you want. For example. If you need to find the exact number of 3, you may want to do: query(F, 3) - query(F, 2). If you want to find those lesser than 3, you do: query(F, 2). If you want to find those greater than 3: query(F, float('inf')) - query(F, 3).

Implementation

I've implemented it in Python and it seems to work well.

import sys, random, bisect
from collections import defaultdict
from math import log, ceil

def make_tree(A):
    n = 2**(int(ceil(log(len(A), 2))))
    T = [(None, None)]*(2*n-1)

    for i, x in enumerate(A):
        T[n-1+i] = (x, i)

    for i in reversed(xrange(n-1)):
        T[i] = max(T[i*2+1], T[i*2+2])

    return T

def print_tree(T):
    print 'digraph {'
    for i, x in enumerate(T):
        print '    ' + str(i) + '[label="' + str(x) + '"]'
        if i*2+2 < len(T):
            print '    ' + str(i)+ '->'+ str(i*2+1)
            print '    ' + str(i)+ '->'+ str(i*2+2)

    print '}'

def find_generic(T, i, fallback, check, first, second):
    j = len(T)/2+i
    original = T[j]
    j = (j-1)/2

    #go up in the tree searching for a value that satisfies check
    while j > 0 and not check(T[second(j)], original):
        j = (j-1)/2

    #go down in the tree searching for the left/rightmost node that satisfies check
    while j*2+1<len(T):
        if check(T[first(j)], original):
            j = first(j)
        elif check(T[second(j)], original):
            j = second(j)
        else:
            return fallback

    return j-len(T)/2


def find_left(T, i, fallback):
    return find_generic(T, i, fallback, 
        lambda a, b: a[0]>b[0] and a[1]<b[1],  #value greater, index before
        lambda j: j*2+2,                       #rightmost first
        lambda j: j*2+1                        #leftmost second
    ) 


def find_right(T, i, fallback):
    return find_generic(T, i, fallback,
        lambda a, b: a[0]>=b[0] and a[1]>b[1], #value greater or equal, index after
        lambda j: j*2+1,                       #leftmost first
        lambda j: j*2+2                        #rightmost second
    )       

def make_frequency_array(A):
    T = make_tree(A)

    D = defaultdict(lambda: 0)
    for i, x in enumerate(A):
        left = find_left(T, i, -1)
        right = find_right(T, i, len(A))
        D[x] += (i-left) * (right-i)

    F = sorted(D.items())
    for i in range(1, len(F)):
        F[i] = (F[i][0], F[i-1][1] + F[i][1])

    return F

def query(F, n):
    idx = bisect.bisect(F, (n,))
    if idx>=len(F): return F[-1][1]
    if F[idx][0]!=n: return 0
    return F[idx][1]

F = make_frequency_array([1,2,3])
print query(F, 3)-query(F, 2) #3 3
print query(F, float('inf'))-query(F, 4) #1 4
print query(F, float('inf'))-query(F, 1) #1 1
print query(F, 2) #2 3

0 讨论(0)

栀梦

2021-01-04 18:45
You problem can be divided into several steps:
1. For each element of initial array calculate the number of "subarrays" where current element is maximum. This will involve a bit of combinatorics. First you need for each element to know index of previous and next element that is bigger than current element. Then calculate the number of subarrays as (i - i_prev) * (i_next - i). Finding i_prev and i_next requires two traversals of the initial array: in forward and backward order. For i_prev you need to traverse you array left to right. During the traversal maintain the BST that contains the biggest of the previous elements along with their index. For each element of original array, find the minimal element in BST that is bigger than current. It's index, stored as value, will be i_prev. Then remove from BST all elements that are smaller that current. This operation should be O(logN), as you are removing whole subtrees. This step is required, as current element you are about to add will "override" all element that are less than it. Then add current element to BST with it's index as value. At each point of time, BST will store the descending subsequence of previous elements where each element is bigger than all it's predecessors in array (for previous elements {1,2,44,5,2,6,26,6} BST will store {44,26,6}). The backward traversal to find i_next is similar.
2. After previous step you'll have pairs K→P where K is the value of some element from the initial array and P is the number of subarrays where this element is maxumum. Now you need to group this pairs by K. This means calculating sum of P values of the equal K elements. Be careful about the corner cases when two elements could have share the same subarrays.
3. As Ritesh suggested: Put all grouped K→P into an array, sort it by K and calculate cumulative sum of P for each element in one pass. It this case your queries will be binary searches in this sorted array. Each query will be performed in O(log(N)) time.
0 讨论(0)
发布评论:

提交评论
- 加载中...
滥情空心

2021-01-04 18:47
I believe I have a solution in O(N + Q*log N) (More about time complexity). The trick is to do a lot of preparation with your array before even the first query arrives.
1. For each number, figure out where is the first number on left / right of this number that is strictly bigger.
Example: for array: 1, 8, 2, 3, 3, 5, 1 both 3's left block would be position of 8, right block would be the position of 5.

This can be determined in linear time. This is how: Keep a stack of previous maximums in a stack. If a new maximum appears, remove maximums from the stack until you get to a element bigger than or equal to the current one. Illustration:

In this example, in the stack is: [15, 13, 11, 10, 7, 3] (you will of course keep the indexes, not the values, I will just use value for better readability).

Now we read 8, 8 >= 3 so we remove 3 from stack and repeat. 8 >= 7, remove 7. 8 < 10, so we stop removing. We set 10 as 8's left block, and add 8 to the maximums stack.

Also, whenever you remove from the stack (3 and 7 in this example), set the right block of removed number to the current number. One problem though: right block would be set to the next number bigger or equal, not strictly bigger. You can fix this with simply checking and relinking right blocks.
1. Compute what number is how many times a maximum of some subsequence.
Since for each number you now know where is the next left / right bigger number, I trust you with finding appropriate math formula for this.

Then, store the results in a hashmap, key would be a value of a number, and value would be how many times is that number a maximum of some subsequence. For example, record [4->12] would mean that number 4 is the maximum in 12 subsequences.

Lastly, extract all key-value pairs from the hashmap into an array, and sort that array by the keys. Finally, create a prefix sum for the values of that sorted array.
1. Handle a request
For request "exactly k", just binary search in your array, for more/less thank``, binary search for key k and then use the prefix array.
0 讨论(0)
发布评论:

提交评论
- 加载中...

Counting according to query

Creating the segment tree: O(n)

Creating the cumulative frequency array: O(n log n)

Answering q queries: O(q log n)

Implementation

Answering `q` queries: O(q log n)