Select data for 15 minute windows - PostgreSQL

一向 2021-01-06 16:17

Right so I have a table such as this in PostgreSQL:

timestamp              duration

2013-04-03 15:44:58    4
2013-04-03 15:56:12    2
2013-04-03 16:13:17            


        
1 answer
  • 2021-01-06 16:35

    Quick and dirty way: http://sqlfiddle.com/#!1/bd2f6/21 (I named my column tstamp instead of your timestamp.)

    with t as (
      select
        generate_series(mitstamp,matstamp,'15 minutes') as int,
        duration
      from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmp) a,
        (select duration from tmp group by duration) b
    )
    
    select
      int as timestampwindowstart,
      t.duration,
      count(tmp.duration)
    from
       t
       left join tmp on 
             (tmp.tstamp >= t.int and 
              tmp.tstamp < (t.int + interval '15 minutes') and 
              t.duration = tmp.duration)
    group by
      int,
      t.duration
    order by
      int,
      t.duration
    

    Brief explanation:

    1. Calculate the minimum and maximum timestamp.
    2. Generate 15-minute intervals between the minimum and the maximum.
    3. Cross join the result with the unique values of duration.
    4. Left join the original data. The left join is important: it keeps every possible combination in the output, with null wherever a duration does not exist for a given interval.
    5. Aggregate the data. count(tmp.duration) ignores nulls, so those combinations are counted as 0.
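
    To see what steps 1 to 3 produce before the left join, the intermediate result can be inspected on its own. This is only a sketch: the temporary table below is a hypothetical stand-in holding the two complete rows from the question, and window_start is an illustrative alias.

    ```sql
    -- Hypothetical sample data: the two complete rows shown in the question.
    create temporary table tmp (tstamp timestamp, duration int);
    insert into tmp (tstamp, duration) values
      ('2013-04-03 15:44:58', 4),
      ('2013-04-03 15:56:12', 2);

    -- Steps 1-3: every window start paired with every distinct duration.
    select
      generate_series(a.mitstamp, a.matstamp, interval '15 minutes') as window_start,
      b.duration
    from
      (select min(tstamp) as mitstamp, max(tstamp) as matstamp from tmp) a,
      (select distinct duration from tmp) b
    order by 1, 2;

    -- Step 5 relies on count(expression) skipping nulls.
    select count(x) from (values (null::int)) v(x);  -- returns 0
    ```

    Note that the series starts at the earliest timestamp in the data, so the windows are anchored there and not on clock boundaries such as 15:45 or 16:00.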

    If you have more tables and the algorithm should be applied to their union, the same approach works. Suppose we have three tables tmp1, tmp2 and tmp3, all with columns tstamp and duration. Then we can extend the previous solution:

    with 
    
    tmpout as (
      select * from tmp1 union all
      select * from tmp2 union all
      select * from tmp3
    )
    
    ,t as (
      select
        generate_series(mitstamp,matstamp,'15 minutes') as int,
        duration
      from
        (select min(tstamp) mitstamp, max(tstamp) as matstamp from tmpout) a,
        (select duration from tmpout group by duration) b
    )
    
    select
      int as timestampwindowstart,
      t.duration,
      count(tmp.duration)
    from
       t
       left join tmpout tmp on 
             (tmp.tstamp >= t.int and 
              tmp.tstamp < (t.int + interval '15 minutes') and 
              t.duration = tmp.duration)
    group by
      int,
      t.duration
    order by
      int,
      t.duration
    

    You should really get to know the WITH clause in PostgreSQL. It is an invaluable concept for any data analysis in PostgreSQL.
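
    As a minimal illustration of the WITH clause on its own, the sketch below names an intermediate result and then queries it like a table. The literal bounds and the names windows, window_start and window_end are assumptions chosen for the example, not part of the original data.

    ```sql
    -- A common table expression (CTE): "windows" exists only for this statement.
    with windows as (
      select generate_series(
               timestamp '2013-04-03 15:45:00',
               timestamp '2013-04-03 16:15:00',
               interval '15 minutes') as window_start
    )
    select
      window_start,
      window_start + interval '15 minutes' as window_end
    from windows
    order by window_start;
    ```

    Several CTEs can be chained with commas, which is how tmpout and t are combined in the query above.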
