SQL query to return a grouped result as a single row

北恋 2021-01-25 02:59

If I have a jobs table like:

|id|created_at  |status    |
----------------------------
|1 |01-01-2015  |error     |
|2 |01-01-2015  |complete  |
|3 |01-01-2015           


        
2 answers
  •  一个人的身影
    2021-01-25 03:36

    An actual crosstab query would look like this:

    SELECT * FROM crosstab(
       $$SELECT created_at, status, count(*) AS ct
         FROM   jobs 
         GROUP  BY 1, 2
         ORDER  BY 1, 2$$
    
      ,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
    AS ct (date date, errors int, completed int, on_hold int);
    

    Should perform very well.
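
    Note that crosstab() is not part of core Postgres; it ships with the additional module tablefunc. A minimal setup sketch, assuming you have the privileges to create extensions in the current database:

    ```sql
    -- Install the tablefunc module once per database; it provides crosstab().
    CREATE EXTENSION IF NOT EXISTS tablefunc;
    ```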

    Basics:

    • PostgreSQL Crosstab Query

    The above does not yet include the total per date.
    Postgres 9.5 introduced the ROLLUP clause, which is perfect for this case:

    SELECT * FROM crosstab(
     $$SELECT created_at, COALESCE(status, 'total'), ct
       FROM  (
          SELECT created_at, status, count(*) AS ct
          FROM   jobs 
          GROUP  BY created_at, ROLLUP(status)
          ) sub
       ORDER  BY 1, 2$$
    
      ,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
    AS ct (date date, total int, errors int, completed int, on_hold int);
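
    To see why the COALESCE(status, 'total') is needed, it may help to run the inner query on its own. The sketch below uses a made-up inline row set in place of the real jobs table (the three rows are an assumption for illustration only):

    ```sql
    -- Hypothetical sample rows standing in for the jobs table.
    SELECT created_at, status, count(*) AS ct
    FROM  (
       VALUES (date '2015-01-01', 'error')
            , (date '2015-01-01', 'complete')
            , (date '2015-01-01', 'complete')
       ) AS jobs (created_at, status)
    GROUP  BY created_at, ROLLUP(status)
    ORDER  BY 1, 2;
    -- Returns one row per (created_at, status) with counts 2 (complete) and 1 (error),
    -- plus one subtotal row per date where status IS NULL and ct = 3.
    ```

    ROLLUP marks the subtotal row with a NULL status, and COALESCE turns that NULL into the 'total' category. One caveat under this approach: a genuine NULL in status would be folded into the same 'total' label.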
    

    Up to Postgres 9.4, use this query as the first crosstab() argument instead:

    WITH cte AS (
        SELECT created_at, status, count(*) AS ct
        FROM   jobs 
        GROUP  BY 1, 2
        )
    TABLE  cte
    UNION  ALL
    SELECT created_at, 'total', sum(ct)
    FROM   cte 
    GROUP  BY 1
    ORDER  BY 1
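
    Plugged into crosstab(), the pre-9.5 variant might look like the sketch below. It assumes the same category list and output column definition as the ROLLUP version, and that the tablefunc module is installed:

    ```sql
    SELECT * FROM crosstab(
     $$WITH cte AS (
          SELECT created_at, status, count(*) AS ct
          FROM   jobs
          GROUP  BY 1, 2
          )
       TABLE  cte
       UNION  ALL                       -- append one 'total' row per date
       SELECT created_at, 'total', sum(ct)
       FROM   cte
       GROUP  BY 1
       ORDER  BY 1$$                    -- keep all rows of a date together

      ,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
    AS ct (date date, total int, errors int, completed int, on_hold int);
    ```

    ORDER BY 1 is enough here: the two-parameter form of crosstab() matches values by category name, so the order of rows within one date does not matter.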
    

    Related:

    • Grouping() equivalent in PostgreSQL?

    If you want to stick to a simple query, this is a bit shorter:

    SELECT created_at
         , count(*) AS total
         , count(status = 'error' OR NULL)    AS errors
         , count(status = 'complete' OR NULL) AS completed
         , count(status = 'on hold' OR NULL)  AS on_hold
    FROM   jobs 
    GROUP  BY 1;
    

    count(status) for the total per date is error-prone, because it would not count rows with NULL values in status. Use count(*) instead, which is also shorter and a bit faster.
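
    A quick way to check the difference is a throwaway query over inline values (the three sample rows are made up for illustration):

    ```sql
    -- One row has a NULL status: count(*) sees it, count(status) does not.
    SELECT count(*)                        AS count_star    -- 3
         , count(status)                   AS count_status  -- 2
         , count(status = 'error' OR NULL) AS errors        -- 1
    FROM  (VALUES ('error'), ('complete'), (NULL)) AS t (status);
    ```

    The OR NULL trick works because count() only counts non-null values: status = 'error' OR NULL is true for matching rows and NULL for everything else.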

    Here is a list of techniques:

    • For absolute performance, is SUM faster or COUNT?

    In Postgres 9.4+ use the new aggregate FILTER clause, like @a_horse mentioned:

    SELECT created_at
         , count(*) AS total
         , count(*) FILTER (WHERE status = 'error')    AS errors
         , count(*) FILTER (WHERE status = 'complete') AS completed
         , count(*) FILTER (WHERE status = 'on hold')  AS on_hold
    FROM   jobs 
    GROUP  BY 1;
    

    Details:

    • How can I simplify this game statistics query?
