I see that in SQL, the GROUP BY has to precede ORDER BY expression. Does this imply that ordering is done after grouping discards identical rows/columns?
Because I s
create table user_payments
(
phone_nr int NOT NULL,
payed_until_ts datetime NOT NULL
)
insert into user_payments
(phone_nr, payed_until_ts)
values
(1, '2016-01-28'), -- today
(1, '2016-01-27'), -- yesterday
(2, '2016-01-27'), -- yesterday
(2, '2016-01-29') -- tomorrow
select phone_nr, MAX(payed_until_ts) as latest_payment
from user_payments
group by phone_nr
-- OUTPUT:
-- phone_nr latest_payment
-- 1 2016-01-28 00:00:00.000
-- 2 2016-01-29 00:00:00.000
In the above example, I have used datetime column but similar query should work for timestamp column.
The MAX function will basically do the "ORDER BY" payed_until_ts column and pick the latest value for each phone_nr. Also, you will get only one value for each phone_nr due to "GROUP BY" clause.
Select a,b from (select a,b from table order by b) as c group by a;
SELECT DISTINCT a,b
FROM tbl t
WHERE b = (SELECT MAX(b) FROM tbl WHERE tbl.a = t.a);
Query You'd Want:
SELECT pr.phone_nr, pr.payed_ts, pr.payed_until_ts
FROM payment_receipts pr
JOIN users
ON (pr.phone_nr = users.phone_nr)
JOIN (select phone_nr, max(payed_until_ts) as payed_until_ts
from payment_receipts
group by phone_nr
) sub
ON ( pr.phone_nr = sub.phone_nr
AND pr.payed_until_ts = sub.payed_until_ts)
ORDER BY pr.phone_nr, pr.payed_ts, pr.payed_until_ts;
Original Answer (with updates):
CREATE TABLE foo (a NUMERIC, b TEXT, DATE);
INSERT INTO foo VALUES
(1,'a','2010-07-30'),
(1,'b','2010-07-30'),
(1,'c','2010-07-31'),
(1,'d','2010-07-31'),
(1,'a','2010-07-29'),
(1,'c','2010-07-29'),
(2,'a','2010-07-29'),
(2,'a','2010-08-01');
-- table contents
SELECT * FROM foo ORDER BY c,a,b;
a | b | c
---+---+------------
1 | a | 2010-07-29
1 | c | 2010-07-29
2 | a | 2010-07-29
1 | a | 2010-07-30
1 | b | 2010-07-30
1 | c | 2010-07-31
1 | d | 2010-07-31
2 | a | 2010-08-01
-- The following solutions both retrieve records based on the latest date
-- they both return the same result set, solution 1 is faster, solution 2
-- is easier to read
-- Solution 1:
SELECT foo.a, foo.b, foo.c
FROM foo
JOIN (select a, max(c) as c from foo group by a) bar
ON (foo.a=bar.a and foo.c=bar.c)
ORDER BY foo.a, foo.b, foo.c;
-- Solution 2:
SELECT a, b, MAX(c) AS c
FROM foo main
GROUP BY a, b
HAVING MAX(c) = (select max(c) from foo sub where main.a=sub.a group by a)
ORDER BY a, b;
a | b | c
---+---+------------
1 | c | 2010-07-31
1 | d | 2010-07-31
2 | a | 2010-08-01
(3 rows)
Comment:
1
is returned twice because their are multiple b
values. This is acceptable (and advised). Your data should never have this problem, because c
is based on b
's value.
Yes, grouping is done first, and it affects a single select
whereas ordering affects all the results from all select
statements in a union
, such as:
select a, 'max', max(b) from tbl group by a
union all select a, 'min', min(b) from tbl group by a
order by 1, 2
(using field numbers in order by
since I couldn't be bothered to name my columns). Each group by
affects only its select
, the order by
affects the combined result set.
It seems that what you're after can be achieved with:
select A, max(B) from tbl group by A
This uses the max
aggregation function to basically do your pre-group ordering (it doesn't actually sort it in any decent DBMS, rather it will simply choose the maximum from an suitable index if available).