Two SQL LEFT JOINS produce incorrect result

青春惊慌失措 2020-11-22 08:56

I have 3 tables:

users(id, account_balance)
grocery(user_id, date, amount_paid)
fishmarket(user_id, date, amount_paid)

Both fishmarke

3 answers
  • 2020-11-22 09:17

    For your original query, if you take away the GROUP BY and look at the pre-grouped result, you'll see where the counts you were receiving came from.

    Perhaps the following query utilizing subqueries would achieve your intended result:

    SELECT
     t1."id" AS "User ID",
     t1.account_balance AS "Account Balance",
     (SELECT count(*) FROM grocery     t2 WHERE t2.user_id = t1."id") AS "# of grocery visits",
     (SELECT count(*) FROM fishmarket  t3 WHERE t3.user_id = t1."id") AS "# of fishmarket visits"
    FROM users t1
    ORDER BY t1.id
    
  • 2020-11-22 09:25

    It's because when the user table joins to the grocery table, there are 3 records matched. Then each of those three records matches with the 4 records in fishmarket, producing 12 records. You need subqueries to get what you are looking for.
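
    The multiplication described above can be reproduced with a small sketch. It assumes SQLite through Python's standard sqlite3 module (the thread does not say which database is in use), and the sample rows are made up: one user with 3 grocery visits and 4 fishmarket visits.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (id INTEGER PRIMARY KEY, account_balance NUMERIC);
    CREATE TABLE grocery    (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    CREATE TABLE fishmarket (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    INSERT INTO users VALUES (1, 100);
    INSERT INTO grocery VALUES
        (1, '2020-01-01', 10), (1, '2020-01-02', 10), (1, '2020-01-03', 10);
    INSERT INTO fishmarket VALUES
        (1, '2020-02-01', 5), (1, '2020-02-02', 5),
        (1, '2020-02-03', 5), (1, '2020-02-04', 5);
""")

# Both joins, no GROUP BY: each grocery row pairs with each fishmarket row.
joined_rows = conn.execute("""
    SELECT u.id, g.date, f.date
    FROM   users u
    LEFT   JOIN grocery    g ON g.user_id = u.id
    LEFT   JOIN fishmarket f ON f.user_id = u.id
""").fetchall()
print(len(joined_rows))  # 12 rows (3 x 4)

# Grouping the same join reports the product for both counts.
inflated = conn.execute("""
    SELECT u.id, count(g.user_id), count(f.user_id)
    FROM   users u
    LEFT   JOIN grocery    g ON g.user_id = u.id
    LEFT   JOIN fishmarket f ON f.user_id = u.id
    GROUP  BY u.id
""").fetchone()
print(inflated)  # (1, 12, 12) instead of the intended (1, 3, 4)
```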

  • 2020-11-22 09:35

    Joins are processed left to right (unless parentheses dictate otherwise). If you LEFT JOIN (or just JOIN, similar effect) three groceries to one user, you get 3 rows (1 x 3). If you then join 4 fishmarkets for the same user, you get 12 (3 x 4) rows: the second join multiplies the previous row count instead of adding to it, as you may have hoped. That inflates the visit counts for groceries and fishmarkets alike.

    You can make it work like this:

    SELECT u.id
         , u.account_balance
         , g.grocery_visits
         , f.fishmarket_visits
    FROM   users u
    LEFT   JOIN (
       SELECT user_id, count(*) AS grocery_visits
       FROM   grocery
       GROUP  BY user_id
       ) g ON g.user_id = u.id
    LEFT   JOIN (
       SELECT user_id, count(*) AS fishmarket_visits
       FROM   fishmarket
       GROUP  BY user_id
       ) f ON f.user_id = u.id
    ORDER  BY u.id;
    

    To get aggregated values for one or few users, correlated subqueries like @Vince provided are just fine. For a whole table or major parts of it, it is (much) more efficient to aggregate the n-tables and join to the result once. This way, we also do not need another GROUP BY in the outer query.
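
    As a rough check that the two approaches agree, here is a sketch that runs the correlated-subquery form and the aggregate-then-join form against the same data. It assumes SQLite via Python's sqlite3 module and invented sample rows in which every user has at least one visit in each table. It only compares results and says nothing about performance.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (id INTEGER PRIMARY KEY, account_balance NUMERIC);
    CREATE TABLE grocery    (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    CREATE TABLE fishmarket (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    INSERT INTO users VALUES (1, 100), (2, 50);
    INSERT INTO grocery VALUES
        (1, '2020-01-01', 10), (1, '2020-01-02', 10), (1, '2020-01-03', 10),
        (2, '2020-01-05', 7),  (2, '2020-01-06', 7);
    INSERT INTO fishmarket VALUES
        (1, '2020-02-01', 5), (1, '2020-02-02', 5),
        (1, '2020-02-03', 5), (1, '2020-02-04', 5),
        (2, '2020-02-07', 9);
""")

# Variant 1: one correlated subquery per counted table.
correlated = conn.execute("""
    SELECT u.id,
           (SELECT count(*) FROM grocery    g WHERE g.user_id = u.id),
           (SELECT count(*) FROM fishmarket f WHERE f.user_id = u.id)
    FROM   users u
    ORDER  BY u.id
""").fetchall()

# Variant 2: aggregate each table first, then join each result once.
pre_aggregated = conn.execute("""
    SELECT u.id, g.grocery_visits, f.fishmarket_visits
    FROM   users u
    LEFT   JOIN (SELECT user_id, count(*) AS grocery_visits
                 FROM grocery GROUP BY user_id) g ON g.user_id = u.id
    LEFT   JOIN (SELECT user_id, count(*) AS fishmarket_visits
                 FROM fishmarket GROUP BY user_id) f ON f.user_id = u.id
    ORDER  BY u.id
""").fetchall()

print(correlated)      # [(1, 3, 4), (2, 2, 1)]
print(pre_aggregated)  # [(1, 3, 4), (2, 2, 1)]
```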

    grocery_visits and fishmarket_visits are NULL for users without any related entries in the respective tables. If you need 0 instead (or any arbitrary number), use COALESCE:

    SELECT u.id
         , u.account_balance
         , COALESCE(g.grocery_visits   , 0) AS grocery_visits
         , COALESCE(f.fishmarket_visits, 0) AS fishmarket_visits
    FROM   ...
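
    To illustrate the NULL versus 0 difference, here is a sketch with a hypothetical second user who has no rows in either table. It again assumes SQLite via Python's sqlite3 module, where SQL NULL comes back as None. The FROM clause is written out in full, following the aggregate-then-join query in this answer.

```python
import sqlite3

# In-memory database with hypothetical sample data; user 2 has no visits.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (id INTEGER PRIMARY KEY, account_balance NUMERIC);
    CREATE TABLE grocery    (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    CREATE TABLE fishmarket (user_id INTEGER, date TEXT, amount_paid NUMERIC);
    INSERT INTO users VALUES (1, 100), (2, 50);
    INSERT INTO grocery VALUES
        (1, '2020-01-01', 10), (1, '2020-01-02', 10), (1, '2020-01-03', 10);
    INSERT INTO fishmarket VALUES
        (1, '2020-02-01', 5), (1, '2020-02-02', 5),
        (1, '2020-02-03', 5), (1, '2020-02-04', 5);
""")

# Shared FROM clause: pre-aggregated counts, LEFT JOINed to users.
from_clause = """
    FROM   users u
    LEFT   JOIN (SELECT user_id, count(*) AS grocery_visits
                 FROM grocery GROUP BY user_id) g ON g.user_id = u.id
    LEFT   JOIN (SELECT user_id, count(*) AS fishmarket_visits
                 FROM fishmarket GROUP BY user_id) f ON f.user_id = u.id
    ORDER  BY u.id
"""

# Without COALESCE, users with no matching rows get NULL (None in Python).
raw = conn.execute(
    "SELECT u.id, g.grocery_visits, f.fishmarket_visits" + from_clause
).fetchall()

# With COALESCE, the missing counts become 0.
coalesced = conn.execute(
    "SELECT u.id, COALESCE(g.grocery_visits, 0), COALESCE(f.fishmarket_visits, 0)"
    + from_clause
).fetchall()

print(raw)        # [(1, 3, 4), (2, None, None)]
print(coalesced)  # [(1, 3, 4), (2, 0, 0)]
```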
    