Is there a [straightforward] way to order results *first*, *then* group by another column, with SQL?

前端 未结 5 911
遥遥无期
遥遥无期 2021-01-12 19:07

I see that in SQL, the GROUP BY has to precede ORDER BY expression. Does this imply that ordering is done after grouping discards identical rows/columns?

Because I s

相关标签:
5条回答
  • 2021-01-12 19:42
    create table user_payments
    (
        phone_nr int NOT NULL,
        payed_until_ts datetime NOT NULL
    )
    
    insert into user_payments
    (phone_nr, payed_until_ts)
    values
    (1, '2016-01-28'), -- today
    (1, '2016-01-27'), -- yesterday  
    (2, '2016-01-27'), -- yesterday 
    (2, '2016-01-29')  -- tomorrow
    
    select phone_nr, MAX(payed_until_ts) as latest_payment
    from user_payments
    group by phone_nr
    
    -- OUTPUT:
    -- phone_nr latest_payment
    -- 1        2016-01-28 00:00:00.000
    -- 2        2016-01-29 00:00:00.000
    

    In the above example, I have used datetime column but similar query should work for timestamp column.

    The MAX function will basically do the "ORDER BY" payed_until_ts column and pick the latest value for each phone_nr. Also, you will get only one value for each phone_nr due to "GROUP BY" clause.

    0 讨论(0)
  • 2021-01-12 19:46
    Select a,b from (select a,b from table order by b) as c group by a;
    
    0 讨论(0)
  • 2021-01-12 19:46
    SELECT DISTINCT a,b
    FROM tbl t
    WHERE b = (SELECT MAX(b) FROM tbl WHERE tbl.a = t.a);
    
    0 讨论(0)
  • 2021-01-12 19:47

    According to your new rules (tested with PostgreSQL)


    Query You'd Want:

    SELECT    pr.phone_nr, pr.payed_ts, pr.payed_until_ts 
    FROM      payment_receipts pr
    JOIN      users
              ON (pr.phone_nr = users.phone_nr)
       JOIN      (select phone_nr, max(payed_until_ts) as payed_until_ts 
                  from payment_receipts 
                  group by phone_nr
                 ) sub
                 ON (    pr.phone_nr       = sub.phone_nr 
                     AND pr.payed_until_ts = sub.payed_until_ts)
    ORDER BY  pr.phone_nr, pr.payed_ts, pr.payed_until_ts;
    


    Original Answer (with updates):

    CREATE TABLE foo (a NUMERIC, b TEXT, DATE);
    
    INSERT INTO foo VALUES 
       (1,'a','2010-07-30'),
       (1,'b','2010-07-30'),
       (1,'c','2010-07-31'),
       (1,'d','2010-07-31'),
       (1,'a','2010-07-29'),
       (1,'c','2010-07-29'),
       (2,'a','2010-07-29'),
       (2,'a','2010-08-01');
    
    -- table contents
    SELECT * FROM foo ORDER BY c,a,b;
     a | b |     c      
    ---+---+------------
     1 | a | 2010-07-29
     1 | c | 2010-07-29
     2 | a | 2010-07-29
     1 | a | 2010-07-30
     1 | b | 2010-07-30
     1 | c | 2010-07-31
     1 | d | 2010-07-31
     2 | a | 2010-08-01
    
    -- The following solutions both retrieve records based on the latest date
    --    they both return the same result set, solution 1 is faster, solution 2
    --    is easier to read
    
    -- Solution 1: 
    SELECT    foo.a, foo.b, foo.c 
    FROM      foo
    JOIN      (select a, max(c) as c from foo group by a) bar
      ON      (foo.a=bar.a and foo.c=bar.c)
    ORDER BY  foo.a, foo.b, foo.c;
    
    -- Solution 2: 
    SELECT    a, b, MAX(c) AS c 
    FROM      foo main
    GROUP BY  a, b
    HAVING    MAX(c) = (select max(c) from foo sub where main.a=sub.a group by a)
    ORDER BY  a, b;
    
     a | b |     c      
    ---+---+------------
     1 | c | 2010-07-31
     1 | d | 2010-07-31
     2 | a | 2010-08-01
    (3 rows)  
    


    Comment:
    1 is returned twice because their are multiple b values. This is acceptable (and advised). Your data should never have this problem, because c is based on b's value.

    0 讨论(0)
  • 2021-01-12 19:50

    Yes, grouping is done first, and it affects a single select whereas ordering affects all the results from all select statements in a union, such as:

    select a, 'max', max(b) from tbl group by a
    union all select a, 'min', min(b) from tbl group by a
    order by 1, 2
    

    (using field numbers in order by since I couldn't be bothered to name my columns). Each group by affects only its select, the order by affects the combined result set.

    It seems that what you're after can be achieved with:

    select A, max(B) from tbl group by A
    

    This uses the max aggregation function to basically do your pre-group ordering (it doesn't actually sort it in any decent DBMS, rather it will simply choose the maximum from an suitable index if available).

    0 讨论(0)
提交回复
热议问题