search for cross-field duplicates in postgresql

梦想与她 submitted on 2021-02-10 21:21:53

Question


I have a table of contacts. The table contains a mobile_phone column as well as a home_phone column. I'd like to fetch all duplicate contacts where a duplicate is two contacts sharing a phone number. If contact A's mobile_phone matches contact B's home_phone, this is also a duplicate. Here is an example of three contacts that should match.

contact_id|mobile_phone|home_phone|other columns such as email.......|...
-------------------------------------------------------------------------
111       |9748777777  |1112312312|..................................|...
112       |1112312312  |null      |..................................|...
113       |9748777777  |0001112222|..................................|...
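For experimenting with the queries on this page, the sample rows can be loaded into a scratch table. This is only a sketch: the column types are assumptions (the question does not state them), with text chosen so that leading zeros such as 0001112222 survive, and contact 114 is a hypothetical extra row added as a non-duplicate control.

```sql
-- Assumed schema: types are not given in the question.
CREATE TABLE contacts (
   contact_id   integer PRIMARY KEY,
   mobile_phone text,
   home_phone   text
);

INSERT INTO contacts (contact_id, mobile_phone, home_phone) VALUES
   (111, '9748777777', '1112312312'),
   (112, '1112312312', NULL),
   (113, '9748777777', '0001112222'),
   (114, '5550000000', NULL);  -- hypothetical row, not in the question; shares no number
```

A correct cross-field duplicate search should return contacts 111, 112 and 113 from this data and leave 114 out.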

Here is a query that finds duplicates, just not across fields.

select mobile_phone from contacts group by mobile_phone HAVING count(*) > 1 order by mobile_phone

Answer 1:


To find all rows with (cross-column) duplicate phone numbers:

SELECT *
FROM   contacts c
WHERE  EXISTS (
   SELECT FROM contacts x
   WHERE (x.mobile_phone IN (c.mobile_phone, c.home_phone)
       OR x.home_phone   IN (c.mobile_phone, c.home_phone))  -- parentheses needed: AND binds tighter than OR
   AND    x.contact_id <> c.contact_id  -- except self
   );

To find all duplicative phone numbers across the two columns:

SELECT DISTINCT phone
FROM  (
   SELECT mobile_phone AS phone
   FROM   contacts c
   WHERE  EXISTS (
      SELECT FROM contacts x
      WHERE  c.mobile_phone IN (x.mobile_phone, x.home_phone)
      AND    c.contact_id <> x.contact_id  -- except self
      )
   UNION ALL
   SELECT home_phone
   FROM   contacts c
   WHERE  EXISTS (
      SELECT FROM contacts x
      WHERE  c.home_phone = x.home_phone   -- cross-over covered by 1st SELECT
      AND    c.contact_id <> x.contact_id  -- except self
      )
   ) sub;
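As a possible alternative for listing the shared numbers, both phone columns can be unpivoted into a single column and then grouped. The following is a minimal sketch, assuming contact_id is unique per row; the aliases p and phone are illustrative names.

```sql
-- Unpivot mobile_phone and home_phone into one "phone" column,
-- then keep numbers that belong to more than one contact.
SELECT p.phone
FROM   contacts c
CROSS  JOIN LATERAL (VALUES (c.mobile_phone), (c.home_phone)) AS p(phone)
WHERE  p.phone IS NOT NULL              -- ignore missing numbers
GROUP  BY p.phone
HAVING count(DISTINCT c.contact_id) > 1  -- same number twice in one row does not count
ORDER  BY p.phone;
```

For the three contacts in the question this should yield 1112312312 and 9748777777. Counting distinct contact_id values rather than rows keeps a number that is repeated within a single row from being reported.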

Repeating the same number in both columns of the same row does not qualify. I don't think you'd want to include those. (Would still be noise that might be worth disallowing with a CHECK constraint.)
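A CHECK constraint along those lines could look like the sketch below. The constraint name is hypothetical, and adding it will fail if existing rows already repeat a number in both columns.

```sql
-- Hypothetical constraint name; disallows the same number in both columns of one row.
-- Rows where either column is NULL still pass, because a CHECK only rejects
-- rows for which the expression evaluates to false.
ALTER TABLE contacts
   ADD CONSTRAINT contacts_phones_differ
   CHECK (mobile_phone <> home_phone);
```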




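Answer 2:


Use window functions. Note that this query only counts duplicates within mobile_phone; it does not compare mobile_phone against home_phone: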

select c.*
from (select c.*, count(*) over (partition by mobile_phone) as cnt
      from contacts c
     ) c
where cnt > 1;
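Because the query above partitions by mobile_phone alone, it does not catch a mobile_phone that matches another contact's home_phone. One way to extend the window-function idea across both columns is to unpivot first. This is a sketch only, assuming contact_id is unique; the CTE names phones and counted are illustrative.

```sql
WITH phones AS (
   SELECT contact_id, mobile_phone AS phone
   FROM   contacts
   WHERE  mobile_phone IS NOT NULL
   UNION  -- not UNION ALL: collapses a number repeated within one row
   SELECT contact_id, home_phone
   FROM   contacts
   WHERE  home_phone IS NOT NULL
), counted AS (
   -- how many contacts carry each number, in either column
   SELECT contact_id, count(*) OVER (PARTITION BY phone) AS cnt
   FROM   phones
)
SELECT c.*
FROM   contacts c
WHERE  c.contact_id IN (SELECT contact_id FROM counted WHERE cnt > 1);
```

PostgreSQL does not support count(DISTINCT ...) as a window function, which is why the (contact_id, phone) pairs are deduplicated with UNION before counting.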


Source: https://stackoverflow.com/questions/63275516/search-for-cross-field-duplicates-in-postgresql