Selecting sum and running balance for last 18 months with generate_series

会有一股神秘感。 提交于 2019-12-11 00:59:59

问题


I have this working query, but I need to add all months to my result, no matter if the items sold during that month:

select * from (
select
to_char(max(change_date), 'YYYY-MON')::varchar(8) as yyyymmm,
max(change_date) as yearmonth,
sum(vic.sold_qty / item_size.qty)::numeric(18,2) as sold_qty,  -- sold monthly
sum(sum(on_hand)) OVER (PARTITION BY vic.item_id order by year,month) as on_hand --running balance
from (((view_item_change vic
left join item on vic.item_id = item.item_id)
left join item_size on item_size.item_id = vic.item_id and item_size.name = item.sell_size)
left join item_plu on vic.item_id = item_plu.item_id and item_plu.seq_num = 0)
where 1 = 1   -- cannot limit date here as its used to show running balance.
and vic.change_date < current_date - date_part('day',current_date)::integer --show only till end of last month
and item.item_id = (select item_id from item_plu where number = '51515')
group by vic.item_id, year, month
) as t 
where yearmonth > current_date - date_part('day',current_date)::integer - 540 -- 18 months

which gives me something like this:

"2013-JUN";"2013-06-29";0.00;7.0000
"2013-JUL";"2013-07-22";0.00;6.0000
"2013-AUG";"2013-08-28";2.00;4.0000
"2013-SEP";"2013-09-02";0.00;4.0000
"2013-OCT";"2013-10-28";0.00;4.0000
"2013-NOV";"2013-11-15";0.00;4.0000
"2013-DEC";"2013-12-16";0.00;6.0000
"2014-FEB";"2014-02-10";1.00;5.0000
"2014-APR";"2014-04-09";0.00;5.0000

But I also want to show the months 2014-JAN and 2014-MAR so that my chart will be better time scaled.

I know how to do the generate_series(start_date, end_date, '1 month') intervals, but I can't quite see how I can join this series to the result set above.

I accepted first answer, but after 2 weeks of testing, found a problem. The left join to the series and then adding a coalesce to show 0's and not nulls for the empty months causes a problem wit the running balance.

Resulting query:

SELECT yyyymmm, yyyymmm, 
coalesce(sold_qty,0) sold_qty, coalesce(on_hand,0) on_hand
FROM  (
   SELECT date_trunc('month', month_series)::date as yyyymmm
   FROM   generate_series(current_date - date_part('day',current_date)::integer - 540  
   ,current_date- date_part('day',current_date)::integer
   , interval '1 month') month_series
   ) month_series
LEFT  JOIN (
    select * from (
        select
        date_trunc('month', max(change_date))::date as yyyymmm,
        max(change_date) as yearmonth,
        sum(vic.sold_qty / item_size.qty)::numeric(18,2) as sold_qty,
        sum(sum(on_hand)) OVER (PARTITION BY vic.item_id order by year,month) as on_hand
        from (((view_item_change vic
        left join item on vic.item_id = item.item_id)
        left join item_size on item_size.item_id = vic.item_id and item_size.name = item.sell_size)
        left join item_plu on vic.item_id = item_plu.item_id and item_plu.seq_num = 0)
        where 1 = 1   -- cannot limit date here as its used to show running balance.
        --vic.change_date >= current_date - date_part('day',current_date)::integer - 730  -- only get results for last 
        --show only till end of last month
        and vic.change_date <= current_date - date_part('day',current_date)::integer
        and item.item_id = (select item_id from item_plu where number = '19M7077')
        group by vic.item_id, year, month
    ) as a 
    where yyyymmm > current_date - date_part('day',current_date)::integer - 540 -- 18 months
) q USING (yyyymmm)
order by 1

Results I get:

"2013-07-01";"2013-07-01";0;0
"2013-08-01";"2013-08-01";0;0
"2013-09-01";"2013-09-01";1.00;53.0000
"2013-10-01";"2013-10-01";0;0
"2013-11-01";"2013-11-01";0;0
"2013-12-01";"2013-12-01";0.00;53.0000
"2014-01-01";"2014-01-01";0.00;52.0000
"2014-02-01";"2014-02-01";0;0
"2014-03-01";"2014-03-01";0;0
"2014-04-01";"2014-04-01";0;0

But I want:

"2013-07-01";"2013-07-01";0;53.0000
"2013-08-01";"2013-08-01";0;53.0000
"2013-09-01";"2013-09-01";1.00;53.0000
"2013-10-01";"2013-10-01";0;53.0000
"2013-11-01";"2013-11-01";0;0;53.0000
"2013-12-01";"2013-12-01";0.00;53.0000
"2014-01-01";"2014-01-01";0.00;52.0000
"2014-02-01";"2014-02-01";0;0;52.0000
"2014-03-01";"2014-03-01";0;0;52.0000
"2014-04-01";"2014-04-01";0;0;52.0000

Table definitions

CREATE TABLE item
(
  item_id character(22) NOT NULL,
  version integer NOT NULL,
  created_by character varying(16) NOT NULL,
  updated_by character varying(16),
  inactive_by character varying(16),
  created_on date NOT NULL,
  updated_on date,
  inactive_on date,
  external_id numeric(14,0),
  description character varying(40) NOT NULL,
  dept_id character(22),
  subdept_id character(22),
  sell_size character varying(8) NOT NULL,
  purch_size character varying(8) NOT NULL
);
CREATE TABLE item_change
(
  item_id character(22) NOT NULL,
  size_name character varying(8) NOT NULL,
  store_id character(22) NOT NULL,
  change_date date NOT NULL,
  on_hand numeric(18,4) NOT NULL,   -- sum column / item_id = total on_hand
  total_cost numeric(18,4) NOT NULL,
  on_order numeric(18,4) NOT NULL,
  sold_qty numeric(18,4) NOT NULL,
  sold_cost numeric(18,4) NOT NULL,
  sold_price numeric(18,4) NOT NULL,
  recv_qty numeric(18,4) NOT NULL,
  recv_cost numeric(18,4) NOT NULL,
  adj_qty numeric(18,4) NOT NULL,
  adj_cost numeric(18,4) NOT NULL
);

CREATE TABLE item_size
(
  item_id character(22) NOT NULL,
  seq_num integer NOT NULL,
  name character varying(8) NOT NULL,
  qty numeric(18,4) NOT NULL,
  weight numeric(18,4) NOT NULL,
  CONSTRAINT item_size_pkey PRIMARY KEY (item_id, seq_num),
  CONSTRAINT item_size_c0 FOREIGN KEY (item_id)
      REFERENCES item (item_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT item_size_c1 UNIQUE (item_id, name)
);

CREATE TABLE item_plu
(
  item_id character(22) NOT NULL,
  seq_num integer NOT NULL,
  "number" character varying(18) NOT NULL,
  size_name character varying(8)
);


CREATE OR REPLACE VIEW view_item_change AS 
 SELECT date_part('year'::text, item_change.change_date) AS year,
date_part('month'::text, item_change.change_date) AS month,
date_part('week'::text, item_change.change_date) AS week,
date_part('quarter'::text, item_change.change_date) AS quarter,
date_part('dow'::text, item_change.change_date) AS dow,
item_change.item_id,
item_change.size_name,
item_change.store_id,
item_change.change_date,
item_change.on_hand,
item_change.total_cost,
item_change.on_order,
item_change.sold_qty,
item_change.sold_cost,
item_change.sold_price,
item_change.recv_qty,
item_change.recv_cost,
item_change.adj_qty,
item_change.adj_cost
FROM item_change;

There is only a single row per month because I also do a group by year, month. As you see, the view has the year, quarter, month, week, dow columns for easier reporting.

To get working data for this, if required, I can do it, but it would require manually creating it. I have simplified the tables and left out all constraints and some columns.


回答1:


Basic solution

Generate a complete list of months and LEFT JOIN the rest to it:

SELECT *
FROM  (
   SELECT to_char(m, 'YYYY-MON') AS yyyymmm
   FROM   generate_series(<start_date>, <end_date>, interval '1 month') m
   ) m
LEFT  JOIN ( <your query here> ) q USING (yyyymmm);

Related answers with more explanation:

  • Join a count query on a generate_series in postgres and also retrieve Null-values as "0"
  • Best way to count records by arbitrary time intervals in Rails+Postgres

Advanced solution for your case

Your query is more complicated than I first understood. You need the running sum over all rows of the selected item, then you want to trim rows older than a minimum date, and fill in missing months with the pre-calculated sum of the previous month.

I achieve this now with LEFT JOIN LATERAL.

SELECT COALESCE(m.yearmonth, c.yearmonth)::date, sold_qty, on_hand
FROM  (
   SELECT yearmonth
        , COALESCE(sold_qty, 0) AS sold_qty
        , sum(on_hand_mon) OVER (ORDER BY yearmonth) AS on_hand
        , lead(yearmonth)  OVER (ORDER BY yearmonth)
                                - interval '1 month' AS nextmonth
   FROM (
      SELECT date_trunc('month', c.change_date) AS yearmonth
           , sum(c.sold_qty / s.qty)::numeric(18,2) AS sold_qty
           , sum(c.on_hand) AS on_hand_mon
      FROM   item_change      c         
      LEFT   JOIN item        i USING (item_id)
      LEFT   JOIN item_size   s ON s.item_id = i.item_id AND s.name = i.sell_size
      LEFT   JOIN item_plu    p ON p.item_id = i.item_id AND p.seq_num = 0
      WHERE  c.change_date < date_trunc('month', now()) - interval '1 day'
      AND    c.item_id = (SELECT item_id FROM item_plu WHERE number = '51515')
      GROUP  BY 1
      ) sub
   ) c
LEFT   JOIN LATERAL generate_series(c.yearmonth
                                  , c.nextmonth
                                  , interval '1 month') m(yearmonth) ON TRUE
WHERE  c.yearmonth > date_trunc('year', now()) - interval '540 days'
ORDER  BY COALESCE(m.yearmonth, c.yearmonth);

SQL Fiddle with a minimum test case.

Major points:

  • I removed your VIEW from the query completely. Much cost for no gain.

  • Since you select a single item_id, you don't need to GROUP BY item_id or PARTITION BY item_id.

  • Use short table aliases and make all reference unambiguous - especially when posting in a public forum.

  • Parentheses in your joins were just noise. Joins are executed left-to-right anyway by default.

  • Simplified date bounds (since I operate with timestamps):

    date_trunc('year', current_date)  - interval '540 days'
    date_trunc('month', current_date) - interval '1 day'
    

    equivalent, but simpler & faster than:

    current_date - date_part('day',current_date)::integer - 540
    current_date - date_part('day',current_date)::integer
  • I now fill in missing months after all calculations with generate_series() calls per row.

  • It must be LEFT JOIN LATERAL ... ON TRUE, not the short form of a JOIN LATERAL to catch the corner case of the last row. Detailed explanation:

    • Find most common elements in array with a group by

Important side notes:

character(22) is a terrible data type for a primary key (or any column). Details:

  • Any downsides of using data type "text" for storing strings?

Ideally this would be an int or bigint column, or possibly a UUID.

Also, storing money amounts as money type or integer(representing Cents) performs much better overall.

In the long run, performance is bound to deteriorate, since you have to include all rows from the very beginning in your calculation. You should cut off old rows and materialize the balance of on_hold on a yearly basis or something.



来源:https://stackoverflow.com/questions/27670068/selecting-sum-and-running-balance-for-last-18-months-with-generate-series

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!