SQL query to return a grouped result as a single row


If I have a jobs table like:

|id|created_at  |status    |
----------------------------
|1 |01-01-2015  |error     |
|2 |01-01-2015  |complete  |
|3 |01-01-2015


                      
2 Answers

一个人的身影, 2021-01-25 03:36

An actual crosstab query would look like this:

SELECT * FROM crosstab(
   $$SELECT created_at, status, count(*) AS ct
     FROM   jobs 
     GROUP  BY 1, 2
     ORDER  BY 1, 2$$

  ,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
AS ct (date date, errors int, completed int, on_hold int);


Should perform very well.
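
Note that crosstab() is not built in: it is provided by the additional module tablefunc, which has to be installed once per database. Below is a minimal setup sketch for trying the query above. The sample rows are made up for illustration (the table in the question is cut off), and the column types are an assumption.

```sql
-- crosstab() ships with the additional module "tablefunc";
-- install it once per database (requires sufficient privileges).
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Hypothetical sample table; column types are assumed, not taken from the question.
CREATE TEMP TABLE jobs (
   id         int PRIMARY KEY
 , created_at date
 , status     text
);

-- Made-up rows, for illustration only.
INSERT INTO jobs (id, created_at, status) VALUES
   (1, '2015-01-01', 'error')
 , (2, '2015-01-01', 'complete')
 , (3, '2015-01-01', 'on hold')
 , (4, '2015-01-02', 'complete');
```

With data like this, the crosstab query returns one row per date. A status that does not occur on a given date comes back as NULL, not 0.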

Basics:


PostgreSQL Crosstab Query


The above does not yet include the total per date.

Postgres 9.5 introduces the ROLLUP clause, which is perfect for this case:

SELECT * FROM crosstab(
 $$SELECT created_at, COALESCE(status, 'total'), ct
   FROM  (
      SELECT created_at, status, count(*) AS ct
      FROM   jobs 
      GROUP  BY created_at, ROLLUP(status)
      ) sub
   ORDER  BY 1, 2$$

  ,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);


Up to Postgres 9.4, use this query as the source query (the first argument of crosstab()) instead:

WITH cte AS (
    SELECT created_at, status, count(*) AS ct
    FROM   jobs 
    GROUP  BY 1, 2
    )
TABLE  cte
UNION  ALL
SELECT created_at, 'total', sum(ct)
FROM   cte 
GROUP  BY 1
ORDER  BY 1
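
For Postgres 9.4 or older, the CTE query would take the place of the ROLLUP query inside the crosstab() call. Here is a sketch of how the complete statement might look, assuming the same jobs table and the same category list as in the 9.5 variant:

```sql
SELECT * FROM crosstab(
 $$WITH cte AS (
      -- one row per (date, status) with its count
      SELECT created_at, status, count(*) AS ct
      FROM   jobs
      GROUP  BY 1, 2
      )
   TABLE  cte
   UNION  ALL
   -- one extra row per date carrying the total
   SELECT created_at, 'total', sum(ct)
   FROM   cte
   GROUP  BY 1
   ORDER  BY 1$$

  ,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
```

The second argument lists the categories in the order of the output columns, so the 'total' rows land in the total column regardless of where they sort within each date.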


Related:


Grouping() equivalent in PostgreSQL?




If you want to stick to a simple query, this is a bit shorter:

SELECT created_at
     , count(*) AS total
     , count(status = 'error' OR NULL)    AS errors
     , count(status = 'complete' OR NULL) AS completed
     , count(status = 'on hold' OR NULL)  AS on_hold
FROM   jobs 
GROUP  BY 1;


count(status) for the total per date is error-prone, because it would not count rows with NULL values in status. Use count(*) instead, which is also shorter and a bit faster.
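
A small self-contained illustration of the difference, and of why the `OR NULL` trick works. It uses inline sample values (made up for this example) instead of the jobs table: count() only counts non-null arguments, and `FALSE OR NULL` evaluates to NULL.

```sql
-- Inline sample data; the NULL status is included on purpose.
SELECT count(*)                        AS total_rows       -- 3: counts every row
     , count(status)                   AS non_null_status  -- 2: skips the NULL status
     , count(status = 'error' OR NULL) AS errors           -- 1: only TRUE stays non-null
FROM  (VALUES ('error'), ('complete'), (NULL)) AS t(status);
```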

Here is a list of techniques:


For absolute performance, is SUM faster or COUNT?


In Postgres 9.4+ use the new aggregate FILTER clause, like @a_horse mentioned:

SELECT created_at
     , count(*) AS total
     , count(*) FILTER (WHERE status = 'error')    AS errors
     , count(*) FILTER (WHERE status = 'complete') AS completed
     , count(*) FILTER (WHERE status = 'on hold')  AS on_hold
FROM   jobs 
GROUP  BY 1;


Details:


How can I simplify this game statistics query?

                                                                        
                                                        
            
            
              
                
滥情空心, 2021-01-25 03:56

The following should work in any RDBMS:

SELECT created_at, count(status) AS total,
       sum(case when status = 'error' then 1 end) as errors,
       sum(case when status = 'complete' then 1 end) as completed,
       sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs 
GROUP BY created_at;


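The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known beforehand. If you have additional status values, just add the corresponding sum(case ...) expression. Note that sum(case ...) without an ELSE branch yields NULL, not 0, for a date that has no rows with the given status.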
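
If a 0 is preferred over NULL for a status that does not occur on a date, adding an ELSE branch is one option. A minimal sketch with inline sample values (made up for this example, written in Postgres syntax) that puts both variants side by side:

```sql
-- Neither sample row has status 'on hold'.
SELECT created_at
     , sum(case when status = 'on hold' then 1 end)        AS on_hold_null  -- NULL: no matching row
     , sum(case when status = 'on hold' then 1 else 0 end) AS on_hold_zero  -- 0
FROM  (VALUES (DATE '2015-01-01', 'error')
            , (DATE '2015-01-01', 'complete')) AS jobs(created_at, status)
GROUP  BY created_at;
```

Wrapping the original expression in COALESCE(..., 0) would give the same result.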

                                                                        
                                                        
            
            
              
                