I have the following schema:
CREATE TABLE author (
id integer
, name varchar(255)
);
CREATE TABLE book (
id integer
, author_id integer
,
This may look archaic and overly simple, but it does not depend on window functions, CTEs, or aggregating subqueries. In most cases it is also the fastest.
SELECT bk.id, au.id, au.name, bk.title as last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
WHERE NOT EXISTS (
SELECT *
FROM book nx
WHERE nx.author_id = bk.author_id
AND nx.id > bk.id
)
ORDER BY bk.id ASC
;
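To check the anti-join on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the title column type are assumptions made for illustration; the query compares book.id, the column declared in the schema.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id integer, name varchar(255));
CREATE TABLE book (id integer, author_id integer, title varchar(255));
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO book VALUES (1, 1, 'A1'), (2, 2, 'B1'), (3, 1, 'A2');
""")

# Keep a book only if no other book by the same author has a higher id.
not_exists_rows = conn.execute("""
SELECT bk.id, au.id, au.name, bk.title AS last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
WHERE NOT EXISTS (
    SELECT 1
    FROM book nx
    WHERE nx.author_id = bk.author_id
      AND nx.id > bk.id
)
ORDER BY bk.id ASC
""").fetchall()

print(not_exists_rows)
# [(2, 2, 'Bob', 'B1'), (3, 1, 'Ann', 'A2')]
```

Book 1 is filtered out because book 3 has the same author and a higher id.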
As a slight variation on @wildplasser's suggestion, which still works across implementations, you can use max rather than not exists. This reads better if you prefer short joins to long where clauses:
select *
from author au
join (
select max(id) as max_id, author_id
from book bk
group by author_id) as lb
on lb.author_id = au.id
join book bk
on bk.id = lb.max_id;
Or, to give the subquery a name, which clarifies things, use WITH:
with last_book as
(select max(id) as max_id, author_id
from book bk
group by author_id)
select *
from author au
join last_book lb
on au.id = lb.author_id
join book bk
on bk.id = lb.max_id;
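The max-based variant can be tried the same way. This is a sketch with Python's sqlite3 and made-up rows, including an assumed author without any books to show that the inner joins drop such authors.

```python
import sqlite3

# In-memory database with hypothetical sample data; author 3 has no books.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id integer, name varchar(255));
CREATE TABLE book (id integer, author_id integer, title varchar(255));
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO book VALUES (1, 1, 'A1'), (2, 2, 'B1'), (3, 1, 'A2');
""")

# The CTE finds the highest book id per author; book is joined back for the title.
cte_rows = conn.execute("""
WITH last_book AS (
    SELECT max(id) AS max_id, author_id
    FROM book
    GROUP BY author_id
)
SELECT au.name, bk.title
FROM author au
JOIN last_book lb ON lb.author_id = au.id
JOIN book bk ON bk.id = lb.max_id
ORDER BY au.id
""").fetchall()

print(cte_rows)
# [('Ann', 'A2'), ('Bob', 'B1')]
```

If authors without books should still appear, the two joins would need to become left joins.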
I've done something similar for a chat system, where room holds the metadata and list contains the messages. I ended up using the PostgreSQL LATERAL JOIN, which worked like a charm.
SELECT MR.id AS room_id, MR.created_at AS room_created,
lastmess.content as lastmessage_content, lastmess.datetime as lastmessage_when
FROM message.room MR
LEFT JOIN LATERAL (
SELECT content, datetime
FROM message.list
WHERE room_id = MR.id
ORDER BY datetime DESC
LIMIT 1) lastmess ON true
ORDER BY lastmessage_when DESC NULLS LAST, MR.created_at DESC
For more info see https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral
select distinct on (author.id)
book.id, author.id, author.name, book.title as last_book
from
author
inner join
book on book.author_id = author.id
order by author.id, book.id desc
Check distinct on
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
With distinct on, the "distinct" columns must come first in the order by. If that is not the order you want, you need to wrap the query and reorder:
select
*
from (
select distinct on (author.id)
book.id, author.id, author.name, book.title as last_book
from
author
inner join
book on book.author_id = author.id
order by author.id, book.id desc
) authors_with_last_book
order by authors_with_last_book.name
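To make the "first row of each set" rule concrete, here is a small emulation in plain Python. It is not how PostgreSQL implements distinct on, only a sketch of the semantics over hypothetical joined rows: sort by the distinct key and the tie-breaker, keep the first row per key, then reorder.

```python
from itertools import groupby

# Hypothetical joined rows: (book_id, author_id, author_name, title)
joined = [
    (1, 1, "Zoe", "A1"),
    (2, 2, "Bob", "B1"),
    (3, 1, "Zoe", "A2"),
]

# ORDER BY author.id, book.id DESC
ordered = sorted(joined, key=lambda r: (r[1], -r[0]))

# DISTINCT ON (author.id): keep only the first row of each author_id group.
distinct_on = [next(group) for _, group in groupby(ordered, key=lambda r: r[1])]

# Outer ORDER BY name, as in the wrapped query.
by_name = sorted(distinct_on, key=lambda r: r[2])

print(distinct_on)  # [(3, 1, 'Zoe', 'A2'), (2, 2, 'Bob', 'B1')]
print(by_name)      # [(2, 2, 'Bob', 'B1'), (3, 1, 'Zoe', 'A2')]
```

The inner sort decides which row survives per author; the outer sort only changes the order in which the surviving rows are returned.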
Another solution is to use a window function, as in Lennart's answer. Another, very generic, one is this:
select
book.id, author.id, author.name, book.title as last_book
from
book
inner join
(
select author.id as author_id, max(book.id) as book_id
from
author
inner join
book on author.id = book.author_id
group by author.id
) s
on s.book_id = book.id
inner join
author on book.author_id = author.id
You could add a condition to the join so that it matches only one row. It worked for me.
Like this:
SELECT
book.id,
auth1.id,
auth1.name,
book.title as last_book
FROM author auth1
JOIN book book ON (book.author_id = auth1.id AND book.id = (select max(b.id) from book b where b.author_id = auth1.id))
ORDER BY book.id ASC
This way you get the data from the book with the highest ID. You could add a "date" column and do the same with max(date).
create temp table book_1 as (
SELECT
id
,title
,author_id
,row_number() OVER (PARTITION BY author_id ORDER BY id DESC) as rownum
FROM
book) distributed by ( id );
select b.id, author.id, author.name, b.title as last_book
from
author
left join
(select * from book_1 where rownum = 1 ) b on b.author_id = author.id
order by author.id, b.id desc
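For row_number() to pick the last book, the window has to be partitioned by author_id and ordered by id descending; partitioning by the book's own id would give every row rownum 1. Below is a minimal sketch of that idea with Python's sqlite3 (window functions need SQLite 3.25 or later). It uses a subquery instead of a temp table, leaves out the distributed by clause, and the sample rows are assumptions.

```python
import sqlite3

# In-memory database with hypothetical sample data; author 3 has no books.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id integer, name varchar(255));
CREATE TABLE book (id integer, author_id integer, title varchar(255));
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO book VALUES (1, 1, 'A1'), (2, 2, 'B1'), (3, 1, 'A2');
""")

# Number each author's books from newest to oldest and keep number 1.
rownum_rows = conn.execute("""
SELECT au.id, au.name, b.id, b.title AS last_book
FROM author au
LEFT JOIN (
    SELECT id, title, author_id,
           row_number() OVER (PARTITION BY author_id ORDER BY id DESC) AS rownum
    FROM book
) b ON b.author_id = au.id AND b.rownum = 1
ORDER BY au.id
""").fetchall()

print(rownum_rows)
# [(1, 'Ann', 3, 'A2'), (2, 'Bob', 2, 'B1'), (3, 'Cy', None, None)]
```

Because of the left join, an author without books is still returned, with NULLs in the book columns.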