MYSQL: SELECT sum of field values while also SELECTing unique values?

百般思念 posted on 2021-02-19 08:43:19


I'd like to count the number of purchases of each item and also show, depending on who is viewing the content, whether that user has purchased it. Because the number of items and purchases could become large, I'm reluctant to add more JOINs to accomplish this, since that seems likely to hurt performance.

Basically, I'd like to have a did_i_buy field somewhere in the following query without adding another JOIN. Is this possible? Let's say for user_name=tom:

SELECT Items.item_id, item_name, COUNT(purchase_status='bought') as number_bought 
FROM Purchases
JOIN Items ON Purchases.item_id=Items.item_id
GROUP BY Items.item_id

Here's my DB structure:

Table Items
item_id  item_name
1        item_1
2        item_2
3        item_3

Table Purchases
item_id  purchase_status  user_name
1        bought           joe
2        bought           joe
1        bought           tom
1        bought           bill

Desired result for tom

item_id  item_name  number_bought  did_i_buy
1        item_1     3              yes
2        item_2     1              no


If I understand correctly, the did_i_buy column means "did Tom buy". You can do that like this:

SELECT
  Items.item_id,
  item_name,
  COUNT(CASE WHEN purchase_status='bought' THEN 1 END) AS number_bought,
  MAX(CASE WHEN purchase_status='bought' AND user_name='tom' THEN 'yes' ELSE 'no' END) AS did_i_buy
FROM Purchases
JOIN Items ON Purchases.item_id=Items.item_id
GROUP BY Items.item_id
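
To sanity-check the query above against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only so the example runs without a server; the helper names build_db and purchases_for are made up for illustration. The viewer's name is passed as a bound parameter and written in lowercase to match the stored value, because SQLite compares strings case-sensitively by default, whereas MySQL's default collations are case-insensitive.

```python
import sqlite3

def build_db():
    """Create an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Items (item_id INTEGER PRIMARY KEY, item_name TEXT);
        CREATE TABLE Purchases (item_id INTEGER, purchase_status TEXT, user_name TEXT);
        INSERT INTO Items VALUES (1, 'item_1'), (2, 'item_2'), (3, 'item_3');
        INSERT INTO Purchases VALUES
            (1, 'bought', 'joe'), (2, 'bought', 'joe'),
            (1, 'bought', 'tom'), (1, 'bought', 'bill');
    """)
    return conn

# Same shape as the answer's query; ORDER BY is added only to make the
# row order deterministic for this demonstration.
QUERY = """
SELECT
  Items.item_id,
  item_name,
  COUNT(CASE WHEN purchase_status='bought' THEN 1 END) AS number_bought,
  MAX(CASE WHEN purchase_status='bought' AND user_name = ? THEN 'yes' ELSE 'no' END) AS did_i_buy
FROM Purchases
JOIN Items ON Purchases.item_id = Items.item_id
GROUP BY Items.item_id
ORDER BY Items.item_id
"""

def purchases_for(conn, viewer):
    """Return (item_id, item_name, number_bought, did_i_buy) rows for one viewer."""
    return conn.execute(QUERY, (viewer,)).fetchall()

rows = purchases_for(build_db(), "tom")
for row in rows:
    print(row)
# (1, 'item_1', 3, 'yes')
# (2, 'item_2', 1, 'no')
```

item_3 does not appear because the inner JOIN only keeps items that have at least one row in Purchases. The MAX trick works because 'yes' sorts after 'no', so a single matching row is enough to turn the group's value into 'yes'.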

Alternatively, with a single CASE expression:

SELECT
  Items.item_id,
  item_name,
  COUNT(purchase_status='bought') AS number_bought,
  MAX(CASE WHEN user_name='tom' THEN 'yes' ELSE 'no' END) AS did_i_buy
FROM Purchases
JOIN Items ON Purchases.item_id=Items.item_id
WHERE purchase_status='bought'
GROUP BY Items.item_id

And one more tweak: because of the WHERE clause, COUNT only sees rows where purchase_status='bought', so the expression checking the status can be dropped (the only change from the previous query is the COUNT expression):

SELECT
  Items.item_id,
  item_name,
  COUNT(*) AS number_bought,
  MAX(CASE WHEN user_name='tom' THEN 'yes' ELSE 'no' END) AS did_i_buy
FROM Purchases
JOIN Items ON Purchases.item_id=Items.item_id
WHERE purchase_status='bought'
GROUP BY Items.item_id
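
Why the status check can be dropped only when the WHERE clause is present can be seen with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, for runnability only), and it adds two hypothetical 'refunded' rows that are not in the original data. COUNT(expr) counts every row where expr is not NULL, and a comparison such as purchase_status='bought' evaluates to 0 or 1 for a non-NULL status, so without the filter it counts rows of every status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE Purchases (item_id INTEGER, purchase_status TEXT, user_name TEXT);
    INSERT INTO Items VALUES (1, 'item_1'), (2, 'item_2'), (3, 'item_3');
    INSERT INTO Purchases VALUES
        (1, 'bought', 'joe'), (2, 'bought', 'joe'),
        (1, 'bought', 'tom'), (1, 'bought', 'bill'),
        -- hypothetical extra rows, not part of the question's data
        (2, 'refunded', 'tom'), (3, 'refunded', 'tom');
""")

def counts(count_expr, where=""):
    """Run the grouped query with the given COUNT expression and optional WHERE."""
    sql = f"""
        SELECT Items.item_id, {count_expr} AS number_bought
        FROM Purchases
        JOIN Items ON Purchases.item_id = Items.item_id
        {where}
        GROUP BY Items.item_id
        ORDER BY Items.item_id
    """
    return conn.execute(sql).fetchall()

# Boolean inside COUNT, no filter: counts rows of every status.
unfiltered_bool = counts("COUNT(purchase_status='bought')")
# CASE inside COUNT: non-matching rows become NULL and are skipped.
case_count = counts("COUNT(CASE WHEN purchase_status='bought' THEN 1 END)")
# Filter first, then count whatever is left.
filtered = counts("COUNT(*)", "WHERE purchase_status='bought'")

print(unfiltered_bool)  # [(1, 3), (2, 2), (3, 1)]
print(case_count)       # [(1, 3), (2, 1), (3, 0)]
print(filtered)         # [(1, 3), (2, 1)]
```

On the question's original data, where every row is 'bought', all three forms return the same counts. The two correct forms still differ in one respect: the WHERE variant drops an item that has only non-'bought' rows, while the CASE variant reports it with a count of 0.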


You must (I think) use subqueries. Each count is a separate query, so there is no way to optimize this beyond folding everything into a single query with subqueries. There is no special relationship between the horizontal data in Items and the vertical data in Purchases.

Here is an example query to count transactions for users:

SELECT
  user_id,
  (SELECT COUNT(*) FROM transactions WHERE buyer_id = u.user_id) AS count
FROM users AS u

I compared this query with a similar query of the JOIN type. The result: 0.0005 for this one vs. 0.0018 for Ed Gibbs's query. However, if sorting by number_bought (ORDER BY count DESC) is required, the latter query is significantly faster.
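
Applied to the tables from the question, the subquery approach might look like the sketch below. It is not the query that was timed above, and it makes no claim about performance. The table aliases i and p, the EXISTS test used for did_i_buy, the helper name items_for, and the use of Python's built-in sqlite3 module in place of MySQL are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE Purchases (item_id INTEGER, purchase_status TEXT, user_name TEXT);
    INSERT INTO Items VALUES (1, 'item_1'), (2, 'item_2'), (3, 'item_3');
    INSERT INTO Purchases VALUES
        (1, 'bought', 'joe'), (2, 'bought', 'joe'),
        (1, 'bought', 'tom'), (1, 'bought', 'bill');
""")

# One correlated subquery per derived column, with no JOIN in the outer query.
QUERY = """
SELECT
  i.item_id,
  i.item_name,
  (SELECT COUNT(*) FROM Purchases p
    WHERE p.item_id = i.item_id
      AND p.purchase_status = 'bought') AS number_bought,
  CASE WHEN EXISTS (SELECT 1 FROM Purchases p
                     WHERE p.item_id = i.item_id
                       AND p.purchase_status = 'bought'
                       AND p.user_name = ?)
       THEN 'yes' ELSE 'no' END AS did_i_buy
FROM Items i
ORDER BY number_bought DESC, i.item_id
"""

def items_for(viewer):
    """Return one row per item, including items nobody has bought."""
    return conn.execute(QUERY, (viewer,)).fetchall()

result = items_for("tom")
for row in result:
    print(row)
# (1, 'item_1', 3, 'yes')
# (2, 'item_2', 1, 'no')
# (3, 'item_3', 0, 'no')
```

Because Items is the outer table here, item_3 is listed with a count of 0, which differs from the JOIN-based queries, where items without purchases are left out.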

