Distinct count on a rolling time window

て烟熏妆下的殇ゞ submitted on 2020-06-17 02:19:47

Question


I want to count the number of distinct catalog numbers that have appeared within the last X minutes. This is usually called a rolling time window.

For instance, if I have:

row        startime            orderNumber    catalogNumb
1        2007-09-24-15.50       o1              21
2        2007-09-24-15.51       o2              22
3        2007-09-24-15.52       o2              23
4        2007-09-24-15.53       o3              24
5        2007-09-24-15.54       o4              21
6        2007-09-24-15.55       o4              21
7        2007-09-24-15.56       o4              21
8        2007-09-24-15.57       o4              21

With a window of the last 5 minutes (5 is just one of the possible values), the output should be:

row        startime            orderNumber    catalogNumb    countCatalog
1        2007-09-24-15.50       o1              21                 1
2        2007-09-24-15.51       o2              22                 2
3        2007-09-24-15.52       o2              23                 3
4        2007-09-24-15.53       o3              24                 4
5        2007-09-24-15.54       o4              21                 4
6        2007-09-24-15.55       o4              21                 4 
7        2007-09-24-15.56       o4              21                 4
8        2007-09-24-15.57       o4              21                 3
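To pin down what countCatalog means here, the following is a minimal reference sketch in plain Python, not a Big SQL solution. It uses the rows of the expected-output table above and assumes the window is inclusive on both ends, i.e. [startime - X minutes, startime], which is the reading that reproduces the counts shown. The function name rolling_distinct_count is made up for illustration.

```python
from datetime import datetime, timedelta

# (startime, catalogNumb) pairs copied from the expected-output table.
ROWS = [
    ("2007-09-24-15.50", 21),
    ("2007-09-24-15.51", 22),
    ("2007-09-24-15.52", 23),
    ("2007-09-24-15.53", 24),
    ("2007-09-24-15.54", 21),
    ("2007-09-24-15.55", 21),
    ("2007-09-24-15.56", 21),
    ("2007-09-24-15.57", 21),
]


def rolling_distinct_count(rows, minutes):
    """For each row, count the distinct catalog numbers whose startime lies
    in the inclusive window [startime - minutes, startime] of that row."""
    parsed = [(datetime.strptime(ts, "%Y-%m-%d-%H.%M"), cat) for ts, cat in rows]
    window = timedelta(minutes=minutes)
    counts = []
    for ts, _ in parsed:
        # Collect catalog numbers in the window into a set to deduplicate them.
        in_window = {cat for other_ts, cat in parsed if ts - window <= other_ts <= ts}
        counts.append(len(in_window))
    return counts


print(rolling_distinct_count(ROWS, 5))  # [1, 2, 3, 4, 4, 4, 4, 3]
```

This is quadratic in the number of rows, so it is only meant as a way to check a SQL result on small samples.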

I am using Big SQL for InfoSphere BigInsights v3.0. The resulting query can use any Db2 OLAP window function except count(distinct catalogNumb) OVER (...), which is not supported by my Db2 version.

In addition to count, I may also need to use other aggregate functions (avg, sum...) over the catalogNumb and other attributes.

Any feedback would be appreciated.


Answer 1:


True, Db2 does not support count distinct as an OLAP function, but there is an easy workaround: you can use dense_rank instead. The highest value (max) of the dense rank is your distinct count.
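A small sketch of the dense_rank idea, run through Python's sqlite3 module as a stand-in for Db2 (so the exact syntax on Big SQL may differ, and SQLite 3.25 or later is needed for window functions). The table name orders and the column row_id are assumptions made for the example, and the data is taken from the expected-output table in the question. Note that this counts distinct values per partition (here, per orderNumber); as far as I can tell it does not by itself give a count over a sliding time frame, so treat it as a building block rather than a full answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (row_id INTEGER, orderNumber TEXT, catalogNumb INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "o1", 21), (2, "o2", 22), (3, "o2", 23), (4, "o3", 24),
        (5, "o4", 21), (6, "o4", 21), (7, "o4", 21), (8, "o4", 21),
    ],
)

# Inner query: rank catalog numbers within each order; equal values share a rank.
# Outer query: the highest rank in a partition equals its number of distinct values.
DENSE_RANK_SQL = """
SELECT row_id, orderNumber,
       MAX(rnk) OVER (PARTITION BY orderNumber) AS distinctCatalogs
FROM (
    SELECT row_id, orderNumber,
           DENSE_RANK() OVER (PARTITION BY orderNumber ORDER BY catalogNumb) AS rnk
    FROM orders
) AS ranked
ORDER BY row_id
"""

per_row = conn.execute(DENSE_RANK_SQL).fetchall()
for row in per_row:
    print(row)
```

Order o2 has catalog numbers 22 and 23, so its rows get 2; order o4 has 21 four times, so its rows get 1.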




Answer 2:


You can try something like this:

-- keep only rows whose starttime falls within the last 5 minutes
select ...
  from mytable
  where starttime between current timestamp - 5 minutes and current timestamp

That returns all rows from the last 5 minutes; the 5 can be replaced by a variable. Then apply count(), sum() or avg() to those rows.
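The filter above is relative to the current time, so it yields one window. One way to apply the same filter per row is to join the table to itself so that every row sees the rows in its own window, and then aggregate. The sketch below does this with Python's sqlite3 module as a stand-in for Db2; the table name orders, the column row_id and the helper rolling_counts are assumptions for the example, and timestamps are stored as ISO strings because SQLite has no timestamp type. In Db2 the join condition would presumably be written with a labeled duration, e.g. w.starttime between t.starttime - 5 minutes and t.starttime, which I have not tested on Big SQL 3.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (row_id INTEGER, starttime TEXT, catalogNumb INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2007-09-24 15:50:00", 21), (2, "2007-09-24 15:51:00", 22),
        (3, "2007-09-24 15:52:00", 23), (4, "2007-09-24 15:53:00", 24),
        (5, "2007-09-24 15:54:00", 21), (6, "2007-09-24 15:55:00", 21),
        (7, "2007-09-24 15:56:00", 21), (8, "2007-09-24 15:57:00", 21),
    ],
)


def rolling_counts(conn, minutes):
    """Return (row_id, countCatalog) using an inclusive window per row."""
    sql = """
    SELECT t.row_id, COUNT(DISTINCT w.catalogNumb) AS countCatalog
    FROM orders AS t
    JOIN orders AS w
      ON w.starttime BETWEEN datetime(t.starttime, ?) AND t.starttime
    GROUP BY t.row_id
    ORDER BY t.row_id
    """
    # SQLite date modifier such as '-5 minutes' shifts the window start back.
    return conn.execute(sql, (f"-{minutes} minutes",)).fetchall()


for row_id, count in rolling_counts(conn, 5):
    print(row_id, count)
```

Because the window rows are joined in as ordinary rows, COUNT(DISTINCT ...) can be swapped for AVG or SUM over any column of w in the same query.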



Source: https://stackoverflow.com/questions/48483033/distinct-count-on-a-rolling-time-window
