Find rows that have same value in one column and other values in another column?

ぃ、小莉子 提交于 2019-12-04 07:10:41
Erwin Brandstetter

This is a case of relational division. We have assembled an arsenal of techniques under this related question:

The special difficulty is to exclude additional users. There are basically 4 techniques.

I suggest LEFT JOIN / IS NULL:

SELECT cu1.conversation_id
FROM        conversation_user cu1
JOIN        conversation_user cu2 USING (conversation_id)
LEFT   JOIN conversation_user cu3 ON cu3.conversation_id = cu1.conversation_id
                                 AND cu3.user_id NOT IN (3,32)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    cu3.conversation_id IS NULL;

Or NOT EXISTS:

SELECT cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 USING (conversation_id)
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );

Both queries do not depend on a UNIQUE constraint for (conversation_id, user_id), which may or may not be in place. Meaning, the query even works if user_id 32 (or 3) is listed more than once for the same conversation. You would get duplicate rows in the result, though, and need to apply DISTINCT or GROUP BY.
The only condition is the one you formulated:

... a query that would show that there is a conversation between just user 32 and user 3?

Audited query

The query you linked in the comment wouldn't work. You forgot to exclude other participants. Should be something like:

SELECT *  -- or whatever you want to return
FROM   conversation_user cu1
WHERE  cu1.user_id = 32
AND    EXISTS (
   SELECT 1
   FROM   conversation_user cu2
   WHERE  cu2.conversation_id = cu1.conversation_id 
   AND    cu2.user_id = 3
   )
AND NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3,32)
   );

Which is similar to the other two queries, except that it will not return multiple rows if user_id = 3 is linked multiple times.

You can use conditional aggregation to select all cids that only have 2 specific particpants

select cid from conversation_user
group by cid
having count(*) = 2
and count(case when user_id not in (32,3) then 1 end) = 0

If (cid,user_id) is not unique then replace having count(*) = 2 with having count(distinct user_id) = 2

Because you want conversations with just 2 users, you can use a self outer join on other users and filter out hits:

To find all 2-user conversations and they're between:

SELECT
    a.conversation_id cid,
    a.user_id user_id_1,
    b.user_id user_id_2
FROM conversation_user a
JOIN conversation_user b ON b.cid = a.cid
  AND b.user_id > a.user_id
LEFT JOIN conversation_user c ON c.cid = a.cid
  AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE c.cid IS NULL -- only return misses on join to others

To find all 2-user conversations for a particular user, just add:

AND a.user_id = 32

if you just want confirmation.

 select conversation_id 
   from  conversation_users 
   group by conversation_id
   having bool_and ( user_id in (3,32))
      and count(*) = 2;

if you want full details, you can use a window function and a CTE like this:

 with a as (
   select *
      ,not bool_and( user_id in (3,32) )
         over  ( partition by conversation_id) 
       and 2 = count(user_id)
         over  ( partition by conversation_id)
           as conv_candidates 
   from conversation_users 
   ) 
 select * from a where conv_candidates;
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!