SQL Query for 7 Day Rolling Average in SQL Server

╄→гoц情女王★ 提交于 2019-12-17 09:36:18

问题


I have a table of hourly product usage (how many times the product is used) data –

ID (bigint)| ProductId (tinyint)| Date (int - YYYYMMDD) | Hour (tinyint)| UsageCount (int)
#|1 | 20140901 | 0 | 10
#|1 | 20140901 | 1 | 15
#|1 | 20140902 | 5 | 25
#|1 | 20140903 | 5 | 25
#|1 | 20140904 | 3 | 25
#|1 | 20140905 | 7 | 25
#|1 | 20140906 | 10 | 25
#|1 | 20140907 | 9 | 25
#|1 | 20140908 | 5 | 25
#|2 | 20140903 | 16 | 10
#|2 | 20140903 | 13 | 115

Likewise, I have the usage data for 4 different products (ProductId from 1 through 4) stored for every hour in the product_usage table. As you can imagine, it is constantly growing as the nightly ETL process dumps the data for the entire previous day. If a product is not used on any hour of a day, the record for that hour won’t appear in this table. Similarly, if a product is not used for the entire day, there won’t be any record for that day in the table. I need to generate a report that gives daily usage and last 7 days’ rolling average –

For example:

ProductId | Date | DailyUsage | RollingAverage
1 | 20140901 | sum of usages of that day | (Sum of usages from 20140901 through 20140826) / 7
1 | 20140901 | sum of usages of that day | (Sum of usages from 20140901 through 20140826) / 7
1 | 20140902 | sum of usages of that day | (Sum of usages from 20140902 through 20140827) / 7
2 | 20140902 | sum of usages of that day | (Sum of usages from 20140902 through 20140827) / 7

And so on.. I am planning to create an Indexed View in SQL server 2014. Can you think of an efficient SQL query to do this?


回答1:


Try:

select x.*,
       avg(dailyusage) over(partition by productid order by productid, date rows between 6 preceding and current row) as rolling_avg
  from (select productid, date, sum(usagecount) as dailyusage
          from tbl
         group by productid, date) x

Fiddle:

http://sqlfiddle.com/#!6/f674a7/4/0

Replace "avg(dailusage) over...." with sum (rather than avg) if what you really want is the sum for the past week. In your title you say you want the average but later you say you want the sum. The query should be the same other than that, so use whichever you actually want.

As was pointed out by Gordon this is basically the average of the past 6 dates in which the product was used, which might be more than just the past 6 days if there are days without any rows for that product on the table because it wasn't used at all. To get around that you could use a date table and your products table.




回答2:


You have to be careful if you can be missing data on some days. If I assume that there is data for some product on each day, then this approach will work:

select p.productid, d.date, sum(usagecount),
       sum(sum(usagecount)) over (partition by p.productid order by d.date
                                  rows between 6 preceding and current row) as Sum7day
from (select distinct productid from hourly) p cross join
     (select distinct date from hourly) d left join
     hourly h
     on h.productid = p.productid and h.date = p.date
group by p.productid, d.date;


来源:https://stackoverflow.com/questions/25922379/sql-query-for-7-day-rolling-average-in-sql-server

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!