What are practical uses for STL's 'partial_sum'?


I used it to reduce memory usage of a simple mark-sweep garbage collector in my toy lambda calculus interpreter.

The GC pool is an array of objects of identical size. The goal is to eliminate objects that aren't linked to other objects, and condense the remaining objects into the beginning of the array. Since the objects are moved in memory, each link needs to be updated. This necessitates an object remapping table.

partial_sum allows the table to be stored in compressed format (as little as one bit per object) until the sweep is complete and memory has been freed. Since the objects are small, this significantly reduces memory use.

  1. Recursively mark used objects and populate the Boolean array.
  2. Use remove_if to condense the marked objects to the beginning of the pool.
  3. Use partial_sum over the Boolean values to generate a table of pointers/indexes into the new pool.
    • This works because the Nth marked object has N preceding 1's in the array and acquires pool index N.
  4. Sweep over the pool again and replace each link using the remap table.

It's especially friendly to the data cache to put the remap table in the just-freed, thus still hot, memory.
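
A minimal sketch of steps 2-4, assuming a hypothetical Obj cell with a single link field (the real interpreter's layout, and the bit-packing of the mark array, will differ):

#include <numeric>
#include <vector>

// Hypothetical cell type standing in for the interpreter's objects:
// one link per object, -1 meaning "no link".
struct Obj { int link; };

// Steps 2-4, given the mark array from step 1 (ints here for
// simplicity; the real table can be packed to one bit per object).
void sweep(std::vector<Obj>& pool, const std::vector<int>& marked) {
    // Step 2: condense marked objects to the beginning of the pool.
    std::size_t n = 0;
    for (std::size_t i = 0; i < pool.size(); ++i)
        if (marked[i]) pool[n++] = pool[i];
    pool.resize(n);

    // Step 3: running count of 1's. The Nth marked object has N
    // preceding 1's, so remap[old] - 1 is its new pool index.
    std::vector<int> remap(marked.size());
    std::partial_sum(marked.begin(), marked.end(), remap.begin());

    // Step 4: rewrite every link through the remap table.
    for (Obj& o : pool)
        if (o.link >= 0) o.link = remap[o.link] - 1;
}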

One thing to note about partial_sum is that it is the operation that undoes adjacent_difference, much like subtraction undoes addition. Or better yet, if you remember calculus, the way differentiation undoes integration. Better because adjacent difference is essentially differentiation and partial sum is integration.

Let's say you have a simulation of a car, and at each time step you need to know the position, velocity, and acceleration. You only need to store one of those values, since you can compute the other two. Say you store the position at each time step: then the adjacent difference of the position gives the velocity, and the adjacent difference of the velocity gives the acceleration. Alternatively, if you store the acceleration, the partial sum gives the velocity, and the partial sum of the velocity gives the position.
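
The round trip is easy to demonstrate (a small sketch with made-up position values):

#include <cassert>
#include <numeric>
#include <vector>

int main() {
    // Stored positions at each time step (illustrative values).
    std::vector<int> position = {0, 1, 4, 9, 16};

    // "Differentiate": velocity[k] = position[k] - position[k-1]
    // (adjacent_difference copies the first element through unchanged).
    std::vector<int> velocity(position.size());
    std::adjacent_difference(position.begin(), position.end(), velocity.begin());

    // "Integrate" back: partial_sum exactly undoes adjacent_difference.
    std::vector<int> recovered(velocity.size());
    std::partial_sum(velocity.begin(), velocity.end(), recovered.begin());

    assert(recovered == position);
}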

Partial sum is one of those functions that doesn't come up too often for most people but is enormously useful when you find the right situation. A lot like calculus.

The last time I (would have) used it was when converting a discrete probability distribution (an array of p(X = k)) into a cumulative distribution (an array of p(X <= k)). To select from the distribution once, you can pick a number from [0, 1) uniformly at random, then binary search into the cumulative distribution.

That code wasn't in C++, though, so I did the partial sum myself.
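
In C++ it might have looked roughly like this (a sketch, not the original code; the function name and the choice of std::mt19937 are mine):

#include <algorithm>
#include <iterator>
#include <numeric>
#include <random>
#include <vector>

// Draw one outcome index from a discrete distribution p[k] = p(X = k).
// Assumes the probabilities sum to 1.
int sample(const std::vector<double>& p, std::mt19937& rng) {
    // Cumulative distribution: cdf[k] = p(X <= k).
    std::vector<double> cdf(p.size());
    std::partial_sum(p.begin(), p.end(), cdf.begin());

    // Uniform draw in [0, 1), then binary search for the first
    // cumulative value that is not less than it.
    std::uniform_real_distribution<double> u(0.0, 1.0);
    auto it = std::lower_bound(cdf.begin(), cdf.end(), u(rng));
    return static_cast<int>(std::distance(cdf.begin(), it));
}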

You can use it to generate a monotonically increasing sequence of numbers. For example, the following generates a vector containing the numbers 1 through 42:

std::vector<int> v(42, 1);
std::partial_sum(v.begin(), v.end(), v.begin());

Is this an everyday use case? Probably not, though I've found it useful on several occasions.

You can also use std::partial_sum to generate a list of factorials. (This is even less useful, though, since the number of factorials that can be represented by a typical integer data type is quite limited. It is fun, though :-D)

std::vector<int> v(10, 1);
std::partial_sum(v.begin(), v.end(), v.begin());                          // 1, 2, ..., 10
std::partial_sum(v.begin(), v.end(), v.begin(), std::multiplies<int>()); // 1!, 2!, ..., 10!

Personal Use Case: Roulette-Wheel-Selection

I'm using partial_sum in a roulette-wheel-selection algorithm. This algorithm chooses elements from a container at random, with probability proportional to a value given beforehand.

Because the elements to choose from carry weights that are not necessarily normalized, I use partial_sum to construct the "roulette wheel" by summing them up. Then I pick a random number in this range (the last partial sum is the total) and use std::lower_bound to find the slot of "the wheel" where my random pick landed. The element returned by lower_bound is the chosen one.

Besides the advantage of clear and expressive code from using partial_sum, I could also gain some speed when experimenting with the GCC parallel mode, which provides parallelized versions of some algorithms, partial_sum among them.

Another use I know of: one of the most important algorithmic primitives in parallel processing (but maybe a little bit away from the STL).

If you're interested in heavily optimized algorithms that use partial_sum (in this context more often found under the synonyms "scan" or "prefix sum"), look at the parallel algorithms community. They need it all the time. You won't find a parallel sorting algorithm based on quicksort or mergesort that doesn't use it. The operation is one of the most important parallel primitives, and I think it is most commonly used to calculate offsets in dynamic algorithms. Think of the partition step in quicksort when it is split and fed to parallel threads: you don't know the number of elements in each slot of the partition before calculating it, so you need offsets for all the threads for later access.
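
For example, turning per-thread (or per-slot) element counts into write offsets is a partial_sum plus a shift (a serial sketch of the idea):

#include <numeric>
#include <vector>

// Each thread has counted how many elements fall into its slot.
// An exclusive scan over the counts gives each thread the offset at
// which it may start writing, with no coordination needed afterwards.
std::vector<std::size_t> write_offsets(const std::vector<std::size_t>& counts) {
    std::vector<std::size_t> offsets(counts.size(), 0);
    // Inclusive partial_sum over all but the last count, shifted right
    // one slot: offsets[i] = counts[0] + ... + counts[i-1].
    if (counts.size() > 1)
        std::partial_sum(counts.begin(), counts.end() - 1, offsets.begin() + 1);
    return offsets;
}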

Maybe you will find more information in the now-hot topic of GPU processing. A short article about Nvidia's CUDA and the scan primitive, with a few application examples, can be found in Chapter 39, Parallel Prefix Sum (Scan) with CUDA, of GPU Gems 3.

You could build a "moving sum" (precursor to a moving average):

#include <numeric>
#include <vector>

// out[i] = sum of the (up to) `num` elements of `in` ending at index i.
// `out` must be at least as large as `in`.
template <class T>
void moving_sum(const std::vector<T>& in, int num, std::vector<T>& out)
{
    // cumulative sum
    std::partial_sum(in.begin(), in.end(), out.begin());

    // shift and subtract
    for (int i = static_cast<int>(out.size()) - 1; i >= 0; i--) {
        int j = i - num;
        if (j >= 0)
            out[i] -= out[j];
    }
}

And then call it with:

std::vector<double> v(10);
// fill in v
std::vector<double> v2(v.size());
moving_sum(v, 3, v2);

You know, I actually did use partial_sum() once... It was this interesting little problem that I was asked on a job interview. I enjoyed it so much, I went home and coded it up.

The problem was: Given a sequence of integers, find the shortest contiguous sub-sequence with the highest sum. E.g., given:

Value: -1  2  3 -1  4 -2 -4  5
Index:  0  1  2  3  4  5  6  7

We would find the subsequence [1,4] (with sum 8).

Now the obvious solution is to run three for loops, iterating over all possible starts and ends, and adding up the value of each possible subsequence in turn. Inefficient, but quick to code up and hard to get wrong. (Especially when the third for loop is just accumulate(start, end, 0).)

The correct solution involves a divide-and-conquer / bottom-up approach. E.g., divide the problem space in half, and for each half compute the largest subsequence contained entirely within that section, the largest subsequence including the starting number, the largest subsequence including the ending number, and the sum of the entire section. Armed with this data we can then combine the two halves without any further evaluation of either one. Obviously the data for each half can be computed by further breaking each half into halves (quarters), each quarter into halves (eighths), and so on until we have trivial singleton cases. It's all quite efficient.

But all that aside, there's a third (somewhat less efficient) option that I wanted to explore. It's similar to the 3-for-loop case, only we add the adjacent numbers to avoid so much work. The idea is that there's no need to add a+b, a+b+c, and a+b+c+d when we can add t1=a+b, t2=t1+c, and t3=t2+d. It's a space/computation tradeoff thing. It works by transforming the sequence:

Index: 0 1 2  3  4
FROM:  1 2 3  4  5
  TO:  1 3 6 10 15

Thereby giving us the sums of all possible subsequences starting at index=0 and ending at indexes=0,1,2,3,4.

Then we iterate over this set subtracting the successive possible "start" points...

FROM:  1 3 6 10 15
  TO:  - 2 5  9 14
  TO:  - - 3  7 12
  TO:  - - -  4  9
  TO:  - - -  -  5

Thereby giving us the values (sums) of all possible subsequences.

We can find the maximum value of each iteration via max_element().

The first step is most easily accomplished via partial_sum().

The remaining steps via a for loop and transform(data+i,data+size,data+i,bind2nd(minus<TYPE>(),data[i-1])).

Clearly O(N^2). But still interesting and fun...
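
Put together, that third approach might look like this (a sketch; I've used a lambda where the text above used the long-deprecated bind2nd):

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> data = {-1, 2, 3, -1, 4, -2, -4, 5};
    const int n = static_cast<int>(data.size());

    // Step 1: data[j] becomes the sum of the subsequence [0, j].
    std::partial_sum(data.begin(), data.end(), data.begin());

    auto it = std::max_element(data.begin(), data.end());
    int bestSum = *it, bestStart = 0, bestEnd = static_cast<int>(it - data.begin());

    // Iteration i: subtract the previous start point so that data[j]
    // becomes the sum of the subsequence [i, j], then scan for a maximum.
    for (int i = 1; i < n; ++i) {
        const int prev = data[i - 1];
        std::transform(data.begin() + i, data.end(), data.begin() + i,
                       [prev](int x) { return x - prev; });
        it = std::max_element(data.begin() + i, data.end());
        if (*it > bestSum) {
            bestSum = *it;
            bestStart = i;
            bestEnd = static_cast<int>(it - data.begin());
        }
    }

    std::cout << "[" << bestStart << "," << bestEnd << "] sum = " << bestSum << "\n";
    // Prints: [1,4] sum = 8
}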

Partial sums are often useful in parallel algorithms. Consider the code

for (int i=0; N>i; ++i) {
  sum += x[i];
  do_something(sum);
}

If you want to parallelise this code, you need to know the partial sums. I am using GNU's parallel version of partial_sum for something very similar.


I often use partial_sum not to sum, but to calculate the current value in a sequence from the previous one.

For example, if you integrate a function: each new step builds on the previous one, as in vt += dvdt or vt = integrate_step(dvdt, t_prev, t_prev + dt);.
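
A small sketch of that pattern, using partial_sum's binary-op overload for a crude Euler step (the values and dt are made up):

#include <numeric>
#include <vector>

int main() {
    const double dt = 0.1;
    // First element is the initial velocity; the rest are acceleration
    // samples dvdt at each subsequent time step.
    std::vector<double> samples = {0.0, 9.8, 9.8, 9.8};

    // partial_sum's binary-op overload computes each value from the
    // previous one: v[k] = v[k-1] + dvdt[k] * dt.
    std::vector<double> v(samples.size());
    std::partial_sum(samples.begin(), samples.end(), v.begin(),
                     [dt](double v_prev, double dvdt) { return v_prev + dvdt * dt; });
    // v = {0.0, 0.98, 1.96, 2.94}
}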

Personal Use Case: intermediate step in counting sort from CLRS:

COUNTING_SORT (A, B, k)

for i ← 1 to k do
    c[i] ← 0
for j ← 1 to n do
    c[A[j]] ← c[A[j]] + 1
// c[i] now contains the number of elements equal to i

// std::partial_sum here
for i ← 2 to k do
    c[i] ← c[i] + c[i-1]

// c[i] now contains the number of elements ≤ i
for j ← n downto 1 do
    B[c[A[j]]] ← A[j]
    c[A[j]] ← c[A[j]] - 1
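
In C++ the cumulative step is a single std::partial_sum call (a sketch; the function name and the assumption that values lie in [0, k) are mine):

#include <numeric>
#include <vector>

// Counting sort for values in [0, k), following the CLRS pseudocode.
std::vector<int> counting_sort(const std::vector<int>& a, int k) {
    std::vector<int> c(k, 0);
    for (int x : a) ++c[x];                          // histogram
    std::partial_sum(c.begin(), c.end(), c.begin()); // c[i] = #elements <= i
    std::vector<int> b(a.size());
    for (auto j = a.size(); j-- > 0; )               // backwards keeps it stable
        b[--c[a[j]]] = a[j];
    return b;
}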

In nonparametric Bayesian methods there is a Metropolis-Hastings step (per observation) that decides whether to sample a new or an existing cluster. If an existing cluster has to be sampled, this needs to be done with different weights. These weighted likelihoods are simulated in the following example code.

#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

int main() {

    std::default_random_engine generator(std::random_device{}());
    std::uniform_real_distribution<double> distribution(0.0, 1.0);

    int K = 8;

    std::vector<double> weighted_likelihood(K);
    for (int i = 0; i < K; ++i) {
        weighted_likelihood[i] = i * 10;
    }
    std::cout << "Weighted likelihood: ";
    for (auto i : weighted_likelihood) std::cout << i << ' ';
    std::cout << std::endl;

    std::vector<double> cumsum_likelihood(K);
    std::partial_sum(weighted_likelihood.begin(), weighted_likelihood.end(), cumsum_likelihood.begin());

    std::cout << "Cumulative sum of weighted likelihood: ";
    for (auto i : cumsum_likelihood) std::cout << i << ' ';
    std::cout << std::endl;

    std::vector<int> frequency(K);

    int N = 280000;
    for (int i = 0; i < N; ++i) {
        double pick = distribution(generator) * cumsum_likelihood.back();

        auto lower = std::lower_bound(cumsum_likelihood.begin(), cumsum_likelihood.end(), pick);
        int index = std::distance(cumsum_likelihood.begin(), lower);
        frequency[index]++;
    }

    std::cout << "Frequencies: ";
    for (auto i : frequency) std::cout << i << ' ';
    std::cout << std::endl;
}

Note that this is not different from the answer by https://stackoverflow.com/users/13005/steve-jessop. It's added to give a bit more context about a particular situation (nonparametric Bayesian methods, e.g. the algorithms by Neal using the Dirichlet process as prior) and to show actual code which uses partial_sum in combination with lower_bound.
