I have a database of items. Each item is categorized with a category ID from a category table. I am trying to create a page that lists every category, and underneath each
This is the greatest-n-per-group problem, and it's a very common SQL question.
Here's how I solve it with outer joins:
SELECT i1.*
FROM item i1
LEFT OUTER JOIN item i2
ON (i1.category_id = i2.category_id AND i1.item_id < i2.item_id)
GROUP BY i1.item_id
HAVING COUNT(*) < 4
ORDER BY category_id, date_listed;
I'm assuming the primary key of the item table is item_id, and that it's a monotonically increasing pseudokey. That is, a greater value in item_id corresponds to a newer row in item.
Here's how it works: for each item, there are some number of other items that are newer. For example, there are three items newer than the fourth newest item. There are zero items newer than the very newest item. So we want to compare each item (i1) to the set of items (i2) that are newer and have the same category as i1. If the number of those newer items is less than four, i1 is one of those we include. Otherwise, don't include it.
The beauty of this solution is that it works no matter how many categories you have, and continues working if you change the categories. It also works even if the number of items in some categories is fewer than four.
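To see the counting in action, here's a minimal sketch that runs the same self-join against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL so the example is self-contained. The sample rows and the newer_count column are made up for illustration. I count i2.item_id instead of * so the output shows the literal number of newer items; with a threshold of four, the set of rows kept is the same.

```python
import sqlite3

# Hypothetical sample data: (item_id, category_id, date_listed).
# Category 1 has six items, category 2 has only two.
ROWS = [
    (1, 1, "2024-01-01"), (2, 1, "2024-01-02"), (3, 1, "2024-01-03"),
    (4, 1, "2024-01-04"), (5, 1, "2024-01-05"), (6, 1, "2024-01-06"),
    (7, 2, "2024-01-07"), (8, 2, "2024-01-08"),
]

# Same shape as the outer-join query, but COUNT(i2.item_id) ignores the
# NULL-padded row, so newer_count is exactly "how many items are newer".
QUERY = """
SELECT i1.item_id, i1.category_id, COUNT(i2.item_id) AS newer_count
FROM item i1
LEFT OUTER JOIN item i2
  ON (i1.category_id = i2.category_id AND i1.item_id < i2.item_id)
GROUP BY i1.item_id, i1.category_id
HAVING COUNT(i2.item_id) < 4
ORDER BY i1.category_id, i1.item_id
"""


def newest_four_per_category(rows):
    """Load rows into a throwaway SQLite table and run the self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        "item_id INTEGER PRIMARY KEY, category_id INTEGER, date_listed TEXT)"
    )
    conn.executemany("INSERT INTO item VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = newest_four_per_category(ROWS)
for item_id, category_id, newer_count in result:
    print(category_id, item_id, newer_count)
```

Items 1 and 2 drop out because four or more items in category 1 are newer, while category 2 keeps both of its items even though it has fewer than four.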
Another solution that works but relies on the MySQL user-variables feature:
SELECT *
FROM (
SELECT i.*, @r := IF(@g = category_id, @r+1, 1) AS rownum, @g := category_id
FROM (SELECT @g:=null, @r:=0) AS _init
CROSS JOIN item i
ORDER BY i.category_id, i.date_listed DESC
) AS t
WHERE t.rownum <= 4;
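If the user-variable trick looks opaque, here's a rough sketch of the same bookkeeping in plain Python. It isn't MySQL code, just an illustration: g and r play the roles of @g and @r, the function name and the (category_id, item_id) tuple layout are hypothetical, and the sort order (newest first within a category) and the limit are assumptions of the sketch.

```python
def rank_within_groups(rows, limit):
    """Number rows within each category and keep the first `limit` of each.

    rows: iterable of (category_id, item_id) tuples (hypothetical layout).
    Returns a list of (category_id, item_id, rownum) tuples.
    """
    # Sort by category, then newest first (a larger item_id is assumed newer).
    ordered = sorted(rows, key=lambda row: (row[0], -row[1]))

    g = None  # like @g: the category of the previous row
    r = 0     # like @r: running row number within the current category
    kept = []
    for category_id, item_id in ordered:
        # Same test as IF(@g = category_id, @r+1, 1).
        r = r + 1 if g == category_id else 1
        g = category_id
        # Same filter as the outer WHERE on rownum.
        if r <= limit:
            kept.append((category_id, item_id, r))
    return kept


sample = [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 6)]
print(rank_within_groups(sample, limit=4))
```

The counter restarts at 1 every time the category changes, which is why the rows have to arrive sorted by category before the numbering means anything.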
MySQL 8.0 introduced support for SQL-standard window functions, so we can now solve this sort of problem the way other RDBMSs do:
WITH numbered_item AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY item_id DESC) AS rownum
FROM item
)
SELECT * FROM numbered_item WHERE rownum <= 4;
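Here's a small runnable sketch of the window-function approach, again using SQLite through Python's sqlite3 module as a stand-in for MySQL (it needs SQLite 3.25 or later for window functions). The sample rows are made up, and I order by item_id DESC so that row number 1 is the newest item in each category, which is an assumption of this sketch.

```python
import sqlite3

# Hypothetical sample data: (item_id, category_id).
ROWS = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 2), (8, 2)]

QUERY = """
WITH numbered_item AS (
  SELECT item_id, category_id,
         ROW_NUMBER() OVER (
           PARTITION BY category_id ORDER BY item_id DESC
         ) AS rownum
  FROM item
)
SELECT item_id, category_id, rownum
FROM numbered_item
WHERE rownum <= 4
ORDER BY category_id, rownum
"""


def top_four_with_window(rows):
    """Number items per category, newest first, and keep rows 1 through 4."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item (item_id INTEGER PRIMARY KEY, category_id INTEGER)"
    )
    conn.executemany("INSERT INTO item VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


numbered = top_four_with_window(ROWS)
for item_id, category_id, rownum in numbered:
    print(category_id, rownum, item_id)
```

The numbering restarts for each category_id partition, so the filter on rownum applies per category without any self-join.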