SQLITE moving average

。_饼干妹妹 提交于 2020-01-17 07:36:48

问题


Trying to get the moving average on a per X months basis using SQLite. Problem is i cannot seem to figure or find anything remotely useful on how to aggregate with a X month(s) lookback period and thus create a moving average.

Table
CREATE TABLE "nav" (
    `id`    TEXT,
    `nav`   NUMERIC,
    `date`  TEXT
);

Sample data

id      nav     date
1380    15.3    2005-01-09
1380    15.4    2005-01-16
1380    15.5    2005-01-23
1380    15.55   2005-01-30
1380    15.66   2005-02-06
1380    15.45   2005-02-13
1380    15.26   2005-02-20
1380    15.14   2005-02-27
1380    14.95   2005-03-06
1380    15.24   2005-03-13
1380    15.6    2005-03-20
1380    15.7    2005-03-27
1380    15.86   2005-04-03
1380    15.78   2005-04-10
1380    15.58   2005-04-17
1380    15.84   2005-04-24
1380    15.88   2005-05-01
1380    16.2    2005-05-08
1380    16.29   2005-05-15
1380    16.32   2005-05-22
1380    16.59   2005-05-29
1380    16.96   2005-06-05
1380    16.76   2005-06-12
1380    17.32   2005-06-19
1380    17.63   2005-06-26
1380    17.6    2005-07-03
1380    17.36   2005-07-10
1380    17.36   2005-07-17
1380    17.27   2005-07-24
1380    16.86   2005-07-31
1380    16.68   2005-08-07
1380    17.21   2005-08-14
1380    16.87   2005-08-21
1380    16.52   2005-08-28
1380    16.77   2005-09-04
1380    17.1    2005-09-11
1380    17.37   2005-09-18
1380    17.29   2005-09-25
1380    17.24   2005-10-02
1380    17.52   2005-10-09
1380    17.72   2005-10-16
1380    17.65   2005-10-23
1380    18.18   2005-10-30
1380    18.31   2005-11-06
1380    18.42   2005-11-13
1380    18.13   2005-11-20
1380    18.12   2005-11-27
1380    17.92   2005-12-04
1380    17.74   2005-12-11
1380    17.93   2005-12-18
1380    17.94   2005-12-25
1380    17.39   2006-01-01
1380    17.4    2006-01-08
1380    17.37   2006-01-15
1380    17.22   2006-01-22
1380    17.47   2006-01-29
1380    17.53   2006-02-05
1380    17.77   2006-02-12
1380    17.95   2006-02-19
1380    17.85   2006-02-26
1380    17.88   2006-03-05
1380    17.42   2006-03-12
1380    17.71   2006-03-19
1380    17.71   2006-03-26
1380    17.53   2006-04-02
1380    17.54   2006-04-09
1380    17.21   2006-04-16
1380    16.84   2006-04-23
1380    16.68   2006-04-30
1380    16.59   2006-05-07
1380    16.82   2006-05-14
1380    16.72   2006-05-21
1380    16.34   2006-05-28
1380    16.72   2006-06-04
1380    16.83   2006-06-11
1380    16.84   2006-06-18
1380    16.54   2006-06-25
1380    16.41   2006-07-02
1380    16.76   2006-07-09
1380    16.77   2006-07-16
1380    16.69   2006-07-23
1380    16.48   2006-07-30
1380    16.65   2006-08-06
1380    16.61   2006-08-13
1380    16.78   2006-08-20
1380    16.88   2006-08-27
1380    17.03   2006-09-03
1380    16.9    2006-09-10
1380    16.85   2006-09-17
1380    17.04   2006-09-24
1380    17.1    2006-10-01
1380    17.28   2006-10-08
1380    17.02   2006-10-15
1380    16.91   2006-10-22
1380    16.85   2006-10-29
1380    16.53   2006-11-05
1380    16.53   2006-11-12
1380    16.17   2006-11-19
1380    15.87   2006-11-26
1380    15.94   2006-12-03
1380    16.22   2006-12-10
1380    16.07   2006-12-17
1380    16.09   2006-12-24
1380    16.42   2006-12-31
1380    16.54   2007-01-07
1380    16.57   2007-01-14
1380    16.61   2007-01-21
1380    16.5    2007-01-28
1380    16.53   2007-02-04
1380    16.67   2007-02-11
1380    16.73   2007-02-18
1380    16.71   2007-02-25
1380    16.81   2007-03-04
1380    16.54   2007-03-11
1380    16.59   2007-03-18
1380    16.67   2007-03-25
1380    16.42   2007-04-01
1380    16.35   2007-04-08
1380    16.11   2007-04-15
1380    15.96   2007-04-22
1380    16.1    2007-04-29
1380    16.3    2007-05-06
1380    16.33   2007-05-13
1380    16.35   2007-05-20
1380    16.61   2007-05-27
1380    16.62   2007-06-03
1380    16.92   2007-06-10
1380    16.57   2007-06-17
1380    16.43   2007-06-24
1380    16.17   2007-07-01
1380    15.95   2007-07-08
1380    15.97   2007-07-15
1380    16.23   2007-07-22
1380    16.17   2007-07-29
1380    16.36   2007-08-05
1380    16.68   2007-08-12
1380    16.61   2007-08-19
1380    16.57   2007-08-26
1380    16.4    2007-09-02
1380    16.16   2007-09-09
1380    15.83   2007-09-16
1380    15.71   2007-09-23
1380    15.73   2007-09-30
1380    15.6    2007-10-07
1380    15.61   2007-10-14
1380    15.55   2007-10-21
1380    15.57   2007-10-28
1380    15.44   2007-11-04
1380    15.47   2007-11-11
1380    15.4    2007-11-18
1380    15.52   2007-11-25
1380    15.7    2007-12-02
1380    15.96   2007-12-09
1380    16.1    2007-12-16
1380    15.78   2007-12-23
1380    15.55   2007-12-30
1380    15.6    2008-01-06
1380    15.82   2008-01-13
1380    15.83   2008-01-20
1380    15.68   2008-01-27
1380    16.01   2008-02-03
1380    15.61   2008-02-10
1380    15.46   2008-02-17
1380    15.2    2008-02-24
1380    15.1    2008-03-02
1380    14.89   2008-03-09
1380    15.02   2008-03-16
1380    14.69   2008-03-23
1380    14.69   2008-03-30
1380    14.67   2008-04-06
1380    14.75   2008-04-13
1380    14.75   2008-04-20
1380    15      2008-04-27
1380    14.88   2008-05-04
1380    14.87   2008-05-11
1380    14.6    2008-05-18
1380    14.88   2008-05-25
1380    15      2008-06-01
1380    15.07   2008-06-08
1380    15      2008-06-15
1380    14.81   2008-06-22
1380    14.84   2008-06-29
1380    14.74   2008-07-06
1380    14.82   2008-07-13
1380    14.98   2008-07-20
1380    15.06   2008-07-27
1380    15.52   2008-08-03
1380    15.64   2008-08-10
1380    15.69   2008-08-17
1380    15.95   2008-08-24
1380    16.44   2008-08-31
1380    16.77   2008-09-07
1380    16.48   2008-09-14
1380    16.44   2008-09-21
1380    17.49   2008-09-28
1380    17.66   2008-10-05
1380    18.44   2008-10-12
1380    19.42   2008-10-19
1380    19.34   2008-10-26
1380    19.62   2008-11-02
1380    19.57   2008-11-09
1380    20.77   2008-11-16
1380    20.18   2008-11-23
1380    20.97   2008-11-30
1380    19.96   2008-12-07
1380    19.4    2008-12-14
1380    19.6    2008-12-21
1380    19.46   2008-12-28
1380    19.75   2009-01-04
1380    20.31   2009-01-11
1380    20.78   2009-01-18
1380    20.56   2009-01-25

Basically i would want... (fake data)

id      nav     date            average_6m
1380    15.6    2008-01         14.1
1380    15.61   2008-02         14.12
1380    14.69   2008-03         14.2
1380    15      2008-04         14.5
1380    14.88   2008-05         14.4
1380    14.84   2008-06         14.3
1380    15.06   2008-07         14.6
1380    16.44   2008-08         15.2
1380    17.49   2008-09         15.9
1380    19.34   2008-10         16.4
1380    20.97   2008-11         18.4
1380    19.46   2008-12         18.2
1380    20.56   2009-01         18.4

Challenges

  1. date isn't on a per-day structure, some intervalls even doesn't have all the work days in each month so have to use MAX() day on each month in order to get last value etc
  2. If there is no X month lookback the average must be/must get null

回答1:


First, filter out any rows that are not the last in the month. Then use a scalar subquery to compute the moving average; the condition with the m3 subquery ensures that the AVG() subquery does not find any rows if there are not exactly six.

WITH months(id, nav, date) AS (
  SELECT id, nav, MAX(date)
  FROM nav
  GROUP BY strftime('%Y-%m', date)
)
SELECT id,
       nav,
       strftime('%Y-%m', date),
       (SELECT AVG(nav)
        FROM months AS m2
        WHERE m2.date BETWEEN date(months.date, 'start of month', '-5 months')
                          AND months.date
          AND (SELECT COUNT(*)
               FROM months AS m3
               WHERE m3.date BETWEEN date(months.date, 'start of month', '-5 months')
                                 AND months.date
              ) = 6
       ) AS average_6m
FROM months;



回答2:


The "correct" answer to this seems to be "don't".

Since i'm using sqlite + php i figured it was simpler and more perfomant to do this with PHP instead, so i created a simple helper function for it:

/**
*   Moving average calculations
*   @param $arr: array with princing data
*   @param $ma: moving average; no. of rows
*   @param $nav: key for pricing data
*/

function sma( $arr, $ma=6, $nav='nav' ){


    foreach( $arr as $key => $val ){

        $avg = 0;
        $average = 0;

        // check if look-back period, else set 0
        if( $key-$ma > -2 ) {

            // create lookback period
            $range = range(0,$ma-1);

            foreach($range as $r){
                $tmp = $key - $r;
                $avg = $avg + $arr[$tmp][$nav];
            }

            // round with 2 decimal point
            $average = round( ($avg/$ma) * 100 ) / 100;         

        }

        // add to current key
        $newkey = 'sma'.$ma;
        $arr[$key][$newkey] = $average;

    }


    return $arr;

} // sma()

Since we're using arrays in PHP the performance hit is negliable and probably less then it would be to calculate it using SQLite.

Screen of result:



来源:https://stackoverflow.com/questions/40698559/sqlite-moving-average

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!