Postgres - how to return rows with 0 count for missing data?

Posted by 随声附和 on 2019-11-27 19:30:44

You can create a list of the first day of each month over the last year (say) with:

select distinct date_trunc('month', (current_date - offs)) as date
from generate_series(0,365,28) as offs
order by 1;

          date
------------------------
 2007-12-01 00:00:00+01
 2008-01-01 00:00:00+01
 2008-02-01 00:00:00+01
 2008-03-01 00:00:00+01
 2008-04-01 00:00:00+02
 2008-05-01 00:00:00+02
 2008-06-01 00:00:00+02
 2008-07-01 00:00:00+02
 2008-08-01 00:00:00+02
 2008-09-01 00:00:00+02
 2008-10-01 00:00:00+02
 2008-11-01 00:00:00+01
 2008-12-01 00:00:00+01

Then you can join with that series.
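The join itself is not spelled out above, so here is a minimal sketch of one way to do it. It assumes a table tbl with a date or timestamp column date_col; both names are placeholders, not something this answer specifies. Counting t.date_col rather than * gives 0 for months that have no matching rows.

```sql
-- Sketch only: tbl and date_col are assumed names.
SELECT m.month
     , count(t.date_col) AS some_count   -- counts matched rows only, so 0 for empty months
FROM  (
   SELECT DISTINCT date_trunc('month', current_date - offs) AS month
   FROM   generate_series(0, 365, 28) AS offs
   ) m
LEFT   JOIN tbl t ON date_trunc('month', t.date_col) = m.month
GROUP  BY m.month
ORDER  BY m.month;
```

The step of 28 days is shorter than any month, so every month in the range is hit at least once and DISTINCT removes the repeats.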

Erwin Brandstetter

This question is old. But since fellow users picked it as master for a new duplicate I am adding a proper answer.

Proper solution

SELECT *
FROM  (
   SELECT day::date
   FROM   generate_series(timestamp '2007-12-01'
                        , timestamp '2008-12-01'
                        , interval  '1 month') day
   ) d
LEFT   JOIN (
   SELECT date_trunc('month', date_col)::date AS day
        , count(*) AS some_count
   FROM   tbl
   WHERE  date_col >= date '2007-12-01'
   AND    date_col <= date '2008-12-06'
-- AND    ... more conditions
   GROUP  BY 1
   ) t USING (day)
ORDER  BY day;

  • Use LEFT JOIN, of course, so that months without matching rows are still returned.

  • generate_series() can produce a table of timestamps on the fly, and very fast.

  • It's generally faster to aggregate before you join. I recently provided a test case on sqlfiddle.com in a related answer.

  • Cast the timestamp to date (::date) for a basic format. For more formatting options, use to_char().

  • GROUP BY 1 is syntax shorthand to reference the first output column. Could be GROUP BY day as well, but that might conflict with an existing column of the same name. Or GROUP BY date_trunc('month', date_col)::date but that's too long for my taste.

  • The same pattern works with the other units date_trunc() accepts (for example 'day' or 'year'), provided the step passed to generate_series() matches.

  • count() never produces NULL (it returns 0 for no rows), but the LEFT JOIN does: months without a match get NULL in some_count.
    To return 0 instead of NULL in the outer SELECT, use COALESCE(some_count, 0) AS some_count. See the manual on COALESCE.

  • For a more generic solution, or for arbitrary time intervals, see the closely related answer on that subject.
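Putting the notes on COALESCE and to_char() together, the query could look like the sketch below. It keeps the table name tbl, the column date_col and the date bounds from the query above; the output column name month and the 'YYYY-MM' format are illustrative choices, not part of the original answer.

```sql
-- Sketch only: same assumed table (tbl) and column (date_col) as above.
SELECT to_char(day, 'YYYY-MM')   AS month        -- formatted label for display
     , COALESCE(some_count, 0)   AS some_count   -- 0 instead of NULL for empty months
FROM  (
   SELECT day::date
   FROM   generate_series(timestamp '2007-12-01'
                        , timestamp '2008-12-01'
                        , interval  '1 month') day
   ) d
LEFT   JOIN (
   SELECT date_trunc('month', date_col)::date AS day
        , count(*) AS some_count
   FROM   tbl
   WHERE  date_col >= date '2007-12-01'
   AND    date_col <= date '2008-12-06'
   GROUP  BY 1
   ) t USING (day)
ORDER  BY day;
```

ORDER BY day sorts on the underlying date rather than on the formatted string, so the order stays chronological whatever format is passed to to_char().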

You could create a temporary table at runtime and left join on that. That seems to make the most sense.
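A minimal sketch of the temporary-table approach, under the same assumptions as the other answers (a table tbl with a column date_col). The temp table name months and its column month are hypothetical.

```sql
-- Sketch only: months is a hypothetical temp table; tbl and date_col are assumed names.
CREATE TEMP TABLE months AS
SELECT day::date AS month
FROM   generate_series(timestamp '2007-12-01'
                     , timestamp '2008-12-01'
                     , interval  '1 month') day;

SELECT m.month
     , count(t.date_col) AS some_count   -- 0 for months without rows
FROM   months m
LEFT   JOIN tbl t ON date_trunc('month', t.date_col)::date = m.month
GROUP  BY m.month
ORDER  BY m.month;
```

The temporary table is dropped automatically at the end of the session, so it needs no cleanup, but for a one-off query the generate_series() subquery shown earlier avoids the extra statement.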
