Calculating moving average in C++

情深已故 2021-02-05 15:43

I am trying to calculate the moving average of a signal. The signal value (a double) is updated at random times. I am looking for an efficient way to calculate its time wei

4 Answers
  • 2021-02-05 16:02

    The trick is the following: you get updates at random times via void update(int time, float value). However, you also need to track when an update falls off the time window, so you set an "alarm" that is called at time + N and removes the previous update from ever being considered again in the computation.

    If this happens in real time, you can ask the operating system to call a method void drop_off_oldest_update(int time) at time + N.

    If this is a simulation, you cannot get help from the operating system and you need to do it manually. In a simulation you would call methods with the time supplied as an argument (which does not correlate with real time). However, a reasonable assumption is that the calls are guaranteed to arrive with increasing time arguments. In this case you need to maintain a sorted list of alarm time values, and for each update and read call you check whether the time argument is greater than the head of the alarm list. While it is greater, you do the alarm-related processing (drop off the oldest update), remove the head, and check again until all alarms prior to the given time are processed. Then do the update call.
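
    As a rough illustration of the simulation case, here is a minimal sketch of the alarm list. The names AlarmQueue, schedule, drain and pending are made up for this example, and it assumes the time arguments never decrease, so appending to a deque keeps the alarms sorted without any explicit sorting.

```cpp
#include <cassert>
#include <deque>

// Hypothetical helper: keeps pending "drop-off" alarms in increasing time order.
class AlarmQueue {
public:
    explicit AlarmQueue(int window) : window_(window) { assert(window > 0); }

    // Register the alarm for an update made at `time`; it is due at time + N.
    // Assumes `time` never decreases between calls, so push_back keeps the
    // deque sorted.
    void schedule(int time) {
        assert(alarms_.empty() || time + window_ >= alarms_.back());
        alarms_.push_back(time + window_);
    }

    // Process every alarm that `now` is strictly greater than, oldest first,
    // and return how many fired. The caller would drop one old update per
    // fired alarm before handling its own update or read.
    int drain(int now) {
        int fired = 0;
        while (!alarms_.empty() && now > alarms_.front()) {
            alarms_.pop_front();
            ++fired;
        }
        return fired;
    }

    int pending() const { return static_cast<int>(alarms_.size()); }

private:
    int window_;               // N, the length of the time window
    std::deque<int> alarms_;   // due times, smallest at the front
};
```

    Both update and read would call drain(time) first, so a pile of overdue alarms is cleared before the new value is used.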

    I have so far assumed it is obvious what you would do for the actual computation, but I will elaborate just in case. I assume you have a method float read(int time) that you use to read the values. The goal is to make this call as efficient as possible, so you do not compute the moving average every time the read method is called. Instead you precompute the value as of the last update or the last alarm, and "tweak" this value by a couple of floating point operations to account for the passage of time since the last update (i.e., a constant number of operations, except perhaps for processing a list of piled-up alarms).

    Hopefully this is clear: the algorithm should be quite simple and quite efficient.

    Further optimization: one remaining problem arises when a large number of updates happen within the time window, followed by a long period with neither reads nor updates, and then a read or update comes along. In this case the above algorithm is inefficient, because it incrementally updates the value for each of the updates that falls off. This is unnecessary: we only care about the most recent update beyond the time window, so it would help to have a way to efficiently drop all older updates at once.

    To do this, we can modify the algorithm to do a binary search of the updates to find the most recent update before the time window. If relatively few updates need to be "dropped", one can incrementally update the value for each dropped update. But if many updates need to be dropped, one can recompute the value from scratch after dropping the old updates.
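
    The binary search step could look something like the sketch below. The Update struct and the function names are hypothetical, and the code assumes the updates are stored in a vector sorted by increasing time. It treats an update exactly at the window start as "before the window".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Update {   // hypothetical record type for this sketch
    int time;
    float value;
};

static bool earlier(const Update& a, const Update& b) { return a.time < b.time; }

// Returns the index of the most recent update with time <= window_start,
// or updates.size() if every update lies inside the window.
// Assumes `updates` is sorted by increasing time.
std::size_t latest_before_window(const std::vector<Update>& updates, int window_start) {
    assert(std::is_sorted(updates.begin(), updates.end(), earlier));
    // upper_bound finds the first update strictly after window_start.
    auto it = std::upper_bound(
        updates.begin(), updates.end(), window_start,
        [](int t, const Update& u) { return t < u.time; });
    if (it == updates.begin()) return updates.size();  // nothing before the window
    return static_cast<std::size_t>(it - updates.begin()) - 1;
}

// Drops everything older than that update in a single erase call.
void drop_stale(std::vector<Update>& updates, int window_start) {
    const std::size_t keep = latest_before_window(updates, window_start);
    if (keep == updates.size()) return;
    updates.erase(updates.begin(), updates.begin() + static_cast<std::ptrdiff_t>(keep));
}
```

    After drop_stale, the caller can compare the number of removed updates against some threshold to decide between adjusting the sum incrementally and recomputing it from scratch.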

    Appendix on incremental computation: I should clarify what I mean by incremental computation in the sentence above about "tweaking" the value by a couple of floating point operations to account for the passage of time since the last update. Initial non-incremental computation:

    Start with

    sum = 0;
    updates_in_window = /* set of all updates within window */;
    prior_update' = /* most recent update prior to window, with timestamp tweaked to window beginning */;
    relevant_updates = /* union of prior_update' and updates_in_window */;
    

    then iterate over relevant_updates in order of increasing time:

    for each update EXCEPT last {
        sum += update.value * time_to_next_update;
    }
    

    and finally

    moving_average = (sum + last_update.value * time_since_last_update) / window_length;
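
    For reference, the full computation above might be written in C++ roughly as follows. This is only a sketch: Update and full_average are names invented here, and instead of building prior_update' explicitly it clamps each update's start time to the window beginning, which has the same effect. It assumes the updates are sorted by time and that at least one update lies at or before the window start.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Update {      // hypothetical record type for this sketch
    double time;
    double value;
};

// Recomputes the time-weighted average over [now - window, now] from scratch.
// Assumes `updates` is sorted by time, is non-empty, that the first update is
// at or before the window start, and that now >= updates.back().time.
double full_average(const std::vector<Update>& updates, double now, double window) {
    assert(window > 0 && !updates.empty());
    assert(now >= updates.back().time);
    const double start = now - window;
    assert(updates.front().time <= start);

    double sum = 0.0;
    for (std::size_t i = 0; i < updates.size(); ++i) {
        // Each value holds from its own timestamp until the next update,
        // or until `now` for the last update.
        const double end = (i + 1 < updates.size()) ? updates[i + 1].time : now;
        // Clamping to the window start plays the role of prior_update'.
        const double begin = std::max(updates[i].time, start);
        if (end > begin) sum += updates[i].value * (end - begin);
    }
    return sum / window;
}
```

    This costs O(n) per call, which is why the incremental adjustments are worth having; it is still useful as a check against the incrementally maintained value.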

    Now if exactly one update falls off the window but no new updates arrive, adjust sum as:

    sum -= (prior_update'.value * time_to_next_update + first_update_in_last_window.value * time_from_first_update_to_new_window_beginning);
    

    (note that it is prior_update' whose timestamp is modified to the beginning of the last window). And if exactly one update enters the window but no updates fall off, adjust sum as:

    sum += previously_most_recent_update.value * corresponding_time_to_next_update;
    

    As should be obvious, this is a rough sketch, but hopefully it shows how you can maintain the average at O(1) operations per update on an amortized basis. Note the further optimization described above, though. Also note the stability issues alluded to in an older answer: floating point errors may accumulate over a large number of such incremental operations, so that the result diverges from the full computation by an amount that matters to the application.
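
    To make the incremental idea concrete, here is one possible sketch, not a definitive implementation. The class name IncrementalTwma and its members are invented for this example. It is a slight variation on the description above: rather than rewriting the timestamp of prior_update', it keeps the real timestamps and subtracts the part of the oldest segment that lies before the window at read time. It also assumes that times never decrease, and that when the data does not yet cover the whole window the average is taken over the covered part only.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

class IncrementalTwma {
public:
    explicit IncrementalTwma(double window) : window_(window) { assert(window > 0); }

    // Record a new value; `time` must not decrease between calls.
    void update(double time, double value) {
        assert(q_.empty() || time >= q_.back().time);
        if (!q_.empty())
            sum_ += q_.back().value * (time - q_.back().time);  // close previous segment
        q_.push_back({time, value});
        expire(time);
    }

    // Average over [now - window, now]; `now` must be >= the last update time.
    double read(double now) {
        assert(!q_.empty() && now >= q_.back().time);
        expire(now);
        const double start = std::max(now - window_, q_.front().time);
        if (now == start) return q_.back().value;  // zero-length span
        // sum_ covers [front.time, back.time]: trim the head, add the open tail.
        const double head = q_.front().value * (start - q_.front().time);
        const double tail = q_.back().value * (now - q_.back().time);
        return (sum_ - head + tail) / (now - start);
    }

private:
    struct Update { double time; double value; };

    // Drop updates that are no longer the most recent one at or before the
    // window start. Each update is pushed once and popped once, so the cost
    // is O(1) amortized per update.
    void expire(double now) {
        const double start = now - window_;
        while (q_.size() >= 2 && q_[1].time <= start) {
            sum_ -= q_[0].value * (q_[1].time - q_[0].time);
            q_.pop_front();
        }
    }

    double window_;
    double sum_ = 0.0;      // integral of the signal from q_.front().time to q_.back().time
    std::deque<Update> q_;  // front is the "prior update" once the window has moved past it
};
```

    For example, with a window of 10, calling update(0, 2) and update(4, 6) and then read(8) averages 2 over four time units and 6 over the next four. Because sum_ is adjusted by repeated additions and subtractions, the rounding drift mentioned above applies here too; recomputing from scratch every so often is one way to bound it.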

  • 2021-02-05 16:06

    If an approximation is OK and there's a minimum time between samples, you could try super-sampling. Have an array that represents evenly spaced time intervals that are shorter than the minimum, and at each time period store the latest sample that was received. The shorter the interval, the closer the average will be to the true value. The period should be no greater than half the minimum or there is a chance of missing a sample.
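
    A minimal sketch of this super-sampling idea, under some assumptions of my own: the class SuperSampledAverage and its members are made-up names, times are non-negative integers that never decrease, the signal counts as 0 before the first sample, and a period with no sample inherits the latest value seen.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical sketch: a ring buffer with one slot per fixed-length period.
class SuperSampledAverage {
public:
    // `slots` periods of `period` time units each span the averaging window.
    SuperSampledAverage(std::size_t slots, int period)
        : ring_(slots, 0.0), period_(period) {
        assert(slots > 0 && period > 0);
    }

    // Store `value` as the latest sample of the period containing `time`.
    void update(int time, double value) {
        advance(time);
        ring_[slot_index(current_)] = value;
        latest_ = value;
    }

    // Approximate average over the window ending at `now`.
    double read(int now) {
        advance(now);
        const double total = std::accumulate(ring_.begin(), ring_.end(), 0.0);
        return total / static_cast<double>(ring_.size());
    }

private:
    std::size_t slot_index(long period_number) const {
        return static_cast<std::size_t>(period_number) % ring_.size();
    }

    // Move to the period containing `time`, filling skipped periods with the
    // latest value. At most ring_.size() slots are touched per call.
    void advance(int time) {
        assert(time >= 0);
        const long target = time / period_;
        assert(target >= current_);
        const long n = static_cast<long>(ring_.size());
        long p = (target - current_ > n) ? target - n : current_;
        while (p < target) {
            ++p;
            ring_[slot_index(p)] = latest_;
        }
        current_ = target;
    }

    std::vector<double> ring_;  // latest sample for each period in the window
    int period_;
    long current_ = 0;          // number of the period last written
    double latest_ = 0.0;       // most recent sample value
};
```

    Reading sums the whole array here for simplicity; keeping a running total that is adjusted whenever a slot is overwritten would make read O(1), at the cost of the same rounding drift as any incremental sum.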

  • 2021-02-05 16:12

    Note: Apparently this is not the way to approach this. Leaving it here for reference on what is wrong with this approach. Check the comments.

    UPDATED - based on Oli's comment... not sure about the instability that he is talking about though.

    Use a sorted map of "arrival times" against values. Upon arrival of a value, add the arrival time to the sorted map along with its value and update the moving average.

    Warning: this is pseudo-code:

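    #include <ctime>
    #include <map>

    // Arrival time (in seconds) -> value received at that time
    std::map<std::time_t, double> timeValueMap;
    double runningTotal = 0.0;
    double average = 0.0;

    // Left to the reader: apply your time-weighting to the remaining values
    double calculateRunningTotal(const std::map<std::time_t, double>& samples);

    void onArrival(double value)
    {
        // A later value arriving within the same second overwrites the earlier one
        timeValueMap[std::time(nullptr)] = value;
    }

    //for example this runs every 10 seconds and the moving window is 120 seconds long
    void recalcRunningAverage()
    {
        // you know that the oldest thing in the list is
        // going to be 129.9999 seconds old
        const std::time_t expireTime = std::time(nullptr) - 120;

        // Remove every pair older than expireTime with one range erase,
        // instead of erasing pairs one by one while iterating
        timeValueMap.erase(timeValueMap.begin(), timeValueMap.lower_bound(expireTime));

        if (timeValueMap.empty())
            return; // nothing left in the window; keep the previous average

        // Below needs to apply your time-weighting to the remaining values
        runningTotal = calculateRunningTotal(timeValueMap);
        average = runningTotal / timeValueMap.size();
    }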
    

    There... Not fully fleshed out but you get the idea.

    Things to note: as I said, the above is pseudo-code. You'll need to choose an appropriate map. Don't remove the pairs one by one as you iterate through, as you will invalidate the iterator and will have to start again.
    See also Oli's comment below.

  • 2021-02-05 16:13
    #include <map>
    #include <iostream>
    
    // Sample - the type of a single sample
    // Date - the type of a time notation
    // DateDiff - the type of difference of two Dates    
    template <class Sample, class Date, class DateDiff = Date>
    class TWMA {
    private:
      typedef std::map<Date, Sample> qType;
      const DateDiff windowSize; // The time width of the sampling window
      qType samples; // A set of sample/date pairs
      Sample average; // The answer
    
    public:
    
      // windowSize - The time width of the sampling window
      TWMA(const DateDiff& windowSize) : windowSize(windowSize), average(0) {}
    
      // Call this each time you receive a sample
      void
      Update(const Sample& sample, const Date& now) {
        // First throw away all old data
        Date then(now - windowSize);
        samples.erase(samples.begin(), samples.upper_bound(then));
    
        // Next add new data
        samples[now] = sample;
    
        // Compute average: note: this could move to Average(), depending upon
        // precise user requirements.
        Sample sum = Sample();
        for(typename qType::iterator it = samples.begin();
            it != samples.end();
            ++it) {
          DateDiff duration(it->first - then);
          sum += duration * it->second;
          then = it->first;
        }
        average = sum / windowSize;
      }
    
      // Call this when you need the answer.
      const Sample& Average() { return average; }
    
    };
    
    int main () {
      TWMA<double, int> samples(10);
    
      samples.Update(1, 1);
      std::cout << samples.Average() << "\n"; // 1
      samples.Update(1, 2);
      std::cout << samples.Average() << "\n"; // 1
      samples.Update(1, 3);
      std::cout << samples.Average() << "\n"; // 1
      samples.Update(10, 20);
      std::cout << samples.Average() << "\n"; // 10
      samples.Update(0, 25);
      std::cout << samples.Average() << "\n"; // 5
      samples.Update(0, 30);
      std::cout << samples.Average() << "\n"; // 0
    }
    