Right so I have a table such as this in PostgreSQL:
timestamp duration
2013-04-03 15:44:58 4
2013-04-03 15:56:12 2
2013-04-03 16:13:17
Quick and dirty way: http://sqlfiddle.com/#!1/bd2f6/21
Note that I named the column tstamp instead of your timestamp, and the table is called tmp.
with t as (
    select
        generate_series(mitstamp, matstamp, '15 minutes') as int,
        duration
    from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmp) a,
        (select duration from tmp group by duration) b
)
select
    int as timestampwindowstart,
    t.duration,
    count(tmp.duration)
from
    t
    left join tmp on
        (tmp.tstamp >= t.int and
         tmp.tstamp < (t.int + interval '15 minutes') and
         t.duration = tmp.duration)
group by
    int,
    t.duration
order by
    int,
    t.duration
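Brief explanation:
The CTE t takes the minimum and maximum tstamp, generates the 15-minute window starts between them, and cross joins those with the distinct duration values. The left join keeps every window/duration combination in the output. The tmp columns are null where a duration does not exist for a given interval, and count(null) = 0, so those combinations show up with a count of 0.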
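To see the left join and count(null) = 0 behaviour in isolation, here is a minimal sketch. It needs no real table: the two sample rows come from VALUES, and the window range (15:30 to 16:00) and the names windows, window_start and matches are made up for illustration.

```sql
-- Sample rows inlined with VALUES; the names below are hypothetical.
with tmp(tstamp, duration) as (
    values
        (timestamp '2013-04-03 15:44:58', 4),
        (timestamp '2013-04-03 15:56:12', 2)
),
windows as (
    -- three 15-minute window starts: 15:30, 15:45, 16:00
    select generate_series(timestamp '2013-04-03 15:30:00',
                           timestamp '2013-04-03 16:00:00',
                           interval '15 minutes') as window_start
)
select
    w.window_start,
    count(tmp.duration) as matches   -- count() skips nulls
from
    windows w
    left join tmp on
        tmp.tstamp >= w.window_start and
        tmp.tstamp <  w.window_start + interval '15 minutes'
group by
    w.window_start
order by
    w.window_start;
-- 15:30 -> 1, 15:45 -> 1, 16:00 -> 0 (no rows matched, so only nulls were counted)
```

With an inner join instead of the left join, the 16:00 row would disappear from the result rather than appear with 0.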
If you have more tables and the algorithm should be applied to their union, the same approach works. Suppose we have three tables tmp1, tmp2, tmp3, all with columns tstamp and duration. Then we can extend the previous solution:
with
tmpout as (
    select * from tmp1 union all
    select * from tmp2 union all
    select * from tmp3
)
, t as (
    select
        generate_series(mitstamp, matstamp, '15 minutes') as int,
        duration
    from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmpout) a,
        (select duration from tmpout group by duration) b
)
select
    int as timestampwindowstart,
    t.duration,
    count(tmp.duration)
from
    t
    -- alias tmpout as tmp so the tmp.* references below resolve
    left join tmpout tmp on
        (tmp.tstamp >= t.int and
         tmp.tstamp < (t.int + interval '15 minutes') and
         t.duration = tmp.duration)
group by
    int,
    t.duration
order by
    int,
    t.duration
You should really get to know the with clause in PostgreSQL. It is an invaluable concept for any data analysis in PostgreSQL.
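As a small illustration of why the with clause is handy, here is a sketch that chains two CTEs so that each step gets a name the next one can reuse. The names durations and distinct_durations and the sample values are made up for this example.

```sql
with durations(duration) as (
    -- hypothetical sample data
    values (4), (2), (4)
),
distinct_durations as (
    -- a later CTE can refer to an earlier one by name
    select duration from durations group by duration
)
select count(*) as distinct_count
from distinct_durations;
-- distinct_count = 2
```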