Add missing data from previous month or year cumulatively

后端 未结 2 1362
小蘑菇
小蘑菇 2020-12-22 09:55

Say I have the following data:

select 1 id, \'A\' name, \'2007\' year, \'04\' month,  5 sales  from dual union all
select 2 id, \'A\' name, \'2007\' year, \'         


        
相关标签:
2条回答
  • 2020-12-22 10:37

    A variation on @boneists approach, starting with your sample data in a CTE:

    with t as (
      select 1 id, 'A' name, '2007' year, '04' month,  5 sales  from dual union all
      select 2 id, 'A' name, '2007' year, '05' month,  2 sales  from dual union all
      select 3 id, 'B' name, '2008' year, '12' month,  3 sales  from dual union all
      select 4 id, 'B' name, '2009' year, '12' month, 56 sales  from dual union all
      select 5 id, 'C' name, '2009' year, '08' month, 89 sales  from dual union all
      select 13 id,'B' name, '2016' year, '01' month, 10 sales  from dual union all
      select 14 id,'A' name, '2016' year, '02' month,  8 sales  from dual union all
      select 15 id,'D' name, '2016' year, '03' month, 12 sales  from dual union all
      select 16 id,'E' name, '2016' year, '04' month, 34 sales  from dual
    ),
    y (year, rnk) as (
      select year, dense_rank() over (order by year)
      from (select distinct year from t)
    ),
    r (name, year, month, sales, rnk) as (
      select t.name, t.year, t.month, t.sales, y.rnk
      from t
      join y on y.year = t.year
      union all
      select r.name, y.year, r.month, 0, y.rnk
      from y
      join r on r.rnk = y.rnk - 1
      where not exists (
        select 1 from t where t.year = y.year and t.month = r.month and t.name = r.name
      )
    )
    select name, year, month, sales,
      nvl(sum(sales) over (partition by name order by year, month
        rows between unbounded preceding and 1 preceding), 0) as opening_bal,
      nvl(sum(sales) over (partition by name order by year, month
        rows between unbounded preceding and current row), 0) as closing_bal
    from r
    order by year, month, name;
    

    Which gets the same result too, though it also doesn't match the expected results in the question:

    NAME YEAR MONTH      SALES OPENING_BAL CLOSING_BAL
    ---- ---- ----- ---------- ----------- -----------
    A    2007 04             5           0           5
    A    2007 05             2           5           7
    A    2008 04             0           7           7
    A    2008 05             0           7           7
    B    2008 12             3           0           3
    A    2009 04             0           7           7
    A    2009 05             0           7           7
    C    2009 08            89           0          89
    B    2009 12            56           3          59
    B    2016 01            10          59          69
    A    2016 02             8           7          15
    D    2016 03            12           0          12
    A    2016 04             0          15          15
    E    2016 04            34           0          34
    A    2016 05             0          15          15
    C    2016 08             0          89          89
    B    2016 12             0          69          69
    

    The y CTE (feel free to use more meaningful names!) generates all the distinct years from your original data, and also adds a ranking, so 2007 is 1, 2008 is 2, 2009 is 3, and 2016 is 4.

    The r recursive CTE combines your actual data with dummy rows with zero sales, based on the name/month data from previous years.

    From what that recursive CTE produces you can do your analytic cumulative sum to add the opening/closing balances. This is using windowing clauses to decide which sales values to include - essentially the opening and closing balances are the sum of all values up to this point, but opening doesn't include the current row.

    0 讨论(0)
  • 2020-12-22 10:41

    This is the closest I can get to your result, although I realise it's not an exact match. For example, your opening balances don't look correct (where did the opening balance of 12 come from for the output row for id = 3?). Anyway, hopefully the following will enable you to amend as appropriate:

    with sample_data as (select 1 id, 'A' name, '2007' year, '04' month,  5 sales  from dual union all
                         select 2 id, 'A' name, '2007' year, '05' month,  2 sales  from dual union all
                         select 3 id, 'B' name, '2008' year, '12' month,  3 sales  from dual union all
                         select 4 id, 'B' name, '2009' year, '12' month, 56 sales  from dual union all
                         select 5 id, 'C' name, '2009' year, '08' month, 89 sales  from dual union all
                         select 13 id, 'B' name, '2016' year, '01' month, 10 sales  from dual union all
                         select 14 id, 'A' name, '2016' year, '02' month,  8 sales  from dual union all
                         select 15 id, 'D' name, '2016' year, '03' month, 12 sales  from dual union all
                         select 16 id, 'E' name, '2016' year, '04' month, 34 sales  from dual),
                 dts as (select distinct year
                         from   sample_data),
                 res as (select sd.name,
                                dts.year,
                                sd.month,
                                nvl(sd.sales, 0) sales,
                                min(sd.year) over (partition by sd.name, sd.month) min_year_per_name_month,
                                sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) - nvl(sd.sales, 0) as opening,
                                sum(nvl(sd.sales, 0)) over (partition by name order by to_date(dts.year||'-'||sd.month, 'yyyy-mm')) as closing
                         from   dts
                                left outer join sample_data sd partition by (sd.name, sd.month) on (sd.year = dts.year))
    select name,
           year,
           month,
           sales,
           opening,
           closing
    from   res
    where  (opening != 0 or closing != 0)
    and    year >= min_year_per_name_month
    order by to_date(year||'-'||month, 'yyyy-mm'),
             name;
    
    NAME YEAR MONTH      SALES    OPENING    CLOSING
    ---- ---- ----- ---------- ---------- ----------
    A    2007 04             5          0          5
    A    2007 05             2          5          7
    A    2008 04             0          7          7
    A    2008 05             0          7          7
    B    2008 12             3          0          3
    A    2009 04             0          7          7
    A    2009 05             0          7          7
    C    2009 08            89          0         89
    B    2009 12            56          3         59
    B    2016 01            10         59         69
    A    2016 02             8          7         15
    D    2016 03            12          0         12
    A    2016 04             0         15         15
    E    2016 04            34          0         34
    A    2016 05             0         15         15
    C    2016 08             0         89         89
    B    2016 12             0         69         69
    

    I've used Partition Outer Join to link any month and name combination in the table (in my query, the sample_data subquery - you wouldn't need that subquery, you'd just use your table instead!) to any year in the same table, and then working out the opening / closing balances. I then discard any rows that have an opening and closing balance of 0.

    0 讨论(0)
提交回复
热议问题