compute sum of values associated with overlapping date ranges

旧时模样 submitted on 2020-01-11 12:03:38

Question


I have a simple table of date ranges each with an associated number of hours per week:

CREATE TABLE tmp_ranges (
  id SERIAL PRIMARY KEY,
  rng daterange,
  hrs_per_week INT
 );

And some values from which I would like to compute (i.e. aggregate) the sum of hours per week for the overlapping/intersecting date ranges:

INSERT INTO tmp_ranges (rng, hrs_per_week) VALUES
   ('[2014-03-15, 2014-06-28]', 9),
   ('[2014-04-18, 2014-07-15]', 2),
   ('[2014-06-03, 2014-09-12]', 9),
   ('[2014-10-03, 2014-11-14]', 6);

Graphically (and hopefully this reveals more than it obscures), the solution looks as follows:

hrs/wk      T                                                 T`
  9         |  }-----|--------|-------->                      |
            |                                                 |
  2         |        }--------|--------|----->                |
            |                                                 |
  9         |                 }--------|------|---->          |
            |                                                 |
  6         |                                          }--->  |
            |                                                 |
 agg.hrs/wk     --9-- ---11--- ---20--- --11-- --9--    -6- 

The final date range is deliberately non-contiguous with the other records but would still be included in the final recordset...
Clearly the solution entails generating 6 records from the original 4 and I'm pretty sure that the answer involves using window functions but I'm completely at a loss...

Is there a way to accomplish this?

Many thanks in advance!


Answer 1:


Here is my attempt to solve this problem:

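select s.seg,
     sum( t.hrs_per_week ) As hrs_per_week
from tmp_ranges t
join(
  -- pair each boundary date with the next one: segment [x, next x)
  select daterange( x,
         lead(x) over (order by x) ) As seg
  from (
    -- every distinct lower and upper bound of the source ranges
    select lower( rng ) As x
    from tmp_ranges
    union 
    select upper( rng )
    from tmp_ranges
    order by x
  ) b
) s
on t.rng && s.seg
group by s.seg
order by s.seg;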

Demo: http://sqlfiddle.com/#!15/ef6cb/13

The innermost subquery collects all boundary dates (the lower and upper bound of every range) into one set using union, which also removes duplicates, then sorts them.
The outer subquery then builds new ranges from adjacent dates using the lead function.
Finally, the main query joins these new ranges to the source table with the overlap operator (&&), groups by the new range, and sums hrs_per_week for each group.
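
To see the intermediate result the join works on, the two inner steps can be run on their own. This is only a sketch: it assumes the tmp_ranges table and the four sample rows from the question, and the names bounds, segment_start and segment_end are mine, not part of the original query.

```sql
-- Assumes tmp_ranges and the sample rows from the question.
-- Lists each boundary date next to the boundary that follows it.
select x                         as segment_start,
       lead(x) over (order by x) as segment_end   -- NULL on the last row
from (
  select lower(rng) as x from tmp_ranges
  union                          -- union also removes duplicate dates
  select upper(rng) from tmp_ranges
) bounds
order by x;
```

Two details are worth noticing. PostgreSQL stores a daterange in the canonical half-open form, so '[2014-03-15, 2014-06-28]' becomes [2014-03-15,2014-06-29) and upper(rng) returns 2014-06-29; the generated segments therefore line up without gaps or double counting. Also, the segment covering the gap between the third and fourth range, and the open-ended segment after the last boundary (where lead returns NULL), overlap no source row, so the inner join in the full query simply drops them.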


EDIT
The order by clause in the innermost query is redundant and can be dropped: the lead(x) over (order by x) clause already orders the rows by date, so the result set of the innermost subquery doesn't have to be sorted.
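
With that order by removed, the query could look roughly like the following. This is a sketch rather than a tested replacement: it assumes the same tmp_ranges table and sample data, and the CTE names bounds and segments are hypothetical names chosen for readability.

```sql
-- Assumes tmp_ranges and the sample rows from the question.
with bounds as (
  -- all distinct boundary dates; no order by needed here
  select lower(rng) as x from tmp_ranges
  union
  select upper(rng) from tmp_ranges
),
segments as (
  -- the window's own order by decides which date counts as "next"
  select daterange(x, lead(x) over (order by x)) as seg
  from bounds
)
select s.seg,
       sum(t.hrs_per_week) as hrs_per_week
from tmp_ranges t
join segments s on t.rng && s.seg
group by s.seg
order by s.seg;
```

For the sample data this should produce six rows with sums 9, 11, 20, 11, 9 and 6, matching the diagram in the question.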



Source: https://stackoverflow.com/questions/22232796/compute-sum-of-values-associated-with-overlapping-date-ranges
