SQL Query: How to get items from one col paired with another but not vice versa

死守一世寂寞 2021-01-16 07:50

I need a query that returns every row of (colA, colB), but treats a row holding the same two values in the opposite order as a duplicate and removes it.

The best

3 Answers
  • 2021-01-16 08:06

    If you prefer a "clean" SQL solution (without least() or greatest()), this also does the job:

    select colA, colB from your_table
    -- >= rather than >, so that a row with colA = colB is not dropped
    where colA >= colB
      or (colB, colA) not in (select colA, colB from your_table)
    

    SQL fiddle
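
A minimal sketch for trying this query on sample data. It uses Python's built-in sqlite3 module rather than the database from the question, which is an assumption (SQLite 3.15 or later is needed for the row-value NOT IN), and the table contents are hypothetical. The comparison is written as >= so that a row whose two columns are equal, such as (e, e), is kept; with a strict > that row would be filtered out.

```python
import sqlite3

# Hypothetical sample data: (a, b)/(b, a) are mirrored, the rest have no mirror.
rows = [("a", "b"), ("b", "a"), ("z", "a"), ("c", "d"), ("e", "e")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (colA TEXT, colB TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)

query = """
    SELECT colA, colB FROM your_table
    WHERE colA >= colB
       OR (colB, colA) NOT IN (SELECT colA, colB FROM your_table)
"""
# Sort in Python so the printed order does not depend on the engine.
result = sorted(conn.execute(query).fetchall())
print(result)
```

With this data the mirrored pair collapses to the single row (b, a), and the unmirrored rows come back unchanged. Exact duplicate rows are not removed by this query; adding DISTINCT would handle them.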

  • 2021-01-16 08:12

    Try this:

    select t3.colA, t3.colB
    from table_name t3
    where (t3.colA, t3.colB)
    not in
    -- mirrored pairs in (greatest, least) order; rows with colA = colB are skipped so they are kept
    (select greatest(t1.colA, t1.colB), least(t1.colA, t1.colB)
    from table_name t1, table_name t2
    where t1.colB = t2.colA and t1.colA = t2.colB
      and t1.colA <> t1.colB
    group by greatest(t1.colA, t1.colB), least(t1.colA, t1.colB))
    

    SQL FIDDLE

  • 2021-01-16 08:18

    MySQL has the functions least() and greatest(). A query that returns the unique pairs:

    select least(colA, colB), greatest(colA, colB)
    from t
    group by least(colA, colB), greatest(colA, colB)
    

    However, this could rearrange the values of non-duplicated rows. For instance, if a row were (z, a), the result here would be (a, z).
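
The rearrangement can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module instead of MySQL, so SQLite's two-argument min() and max() stand in for least() and greatest(); the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (colA TEXT, colB TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", "b"), ("b", "a"), ("z", "a")],  # hypothetical sample rows
)

# Assumption: SQLite's scalar min(x, y) / max(x, y) replace MySQL's
# least(x, y) / greatest(x, y), which SQLite does not provide.
query = """
    SELECT min(colA, colB) AS l, max(colA, colB) AS g
    FROM t
    GROUP BY min(colA, colB), max(colA, colB)
"""
unique_pairs = sorted(conn.execute(query).fetchall())
print(unique_pairs)
```

The row stored as (z, a) comes back as (a, z), even though its mirror (a, z) was never in the table.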

    To fix this, we need to recover the original values. The idea is to count how many distinct orientations of each pair appear in the data. If both orientations appear, either one can be returned, so the choice is arbitrary. If only one appears, the original row has to be returned unchanged.

    Here is a version that does this:

    select (case when cnt = 1 then colA else l end) as ColA,
           (case when cnt = 1 then colB else g end) as ColB
    -- cnt is the number of distinct orientations of the pair (1 or 2)
    from (select least(colA, colB) as l, greatest(colA, colB) as g,
                 count(distinct colA) as cnt, min(colA) as colA, min(colB) as colB
          from t
          group by least(colA, colB), greatest(colA, colB)
         ) t
    

    What is this doing? The subquery is the original query that finds the unique pairs, extended in two ways: count(distinct colA) counts how many orientations of each pair appear in the data (1 or 2), and min(colA) and min(colB) carry the original column values along.

    The outer query then chooses what to show for each identified pair. If the count is 1 -- only one version of the pair -- then min(colA) is ColA, and min(colB) is colB. So, use those. Otherwise, it arbitrarily chooses the pair where ColA < ColB.
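
A runnable sketch of this query, again assuming Python's built-in sqlite3 module rather than MySQL. SQLite's two-argument min() and max() replace least() and greatest(), and the inner aliases are renamed to only_a and only_b (hypothetical names) to avoid any clash with the table's column names. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (colA TEXT, colB TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", "b"), ("b", "a"), ("z", "a"), ("e", "e")],  # hypothetical rows
)

# min(x, y) / max(x, y) with two arguments are SQLite's scalar stand-ins
# for least() / greatest(); min(x) with one argument is the usual aggregate.
query = """
    SELECT (CASE WHEN cnt = 1 THEN only_a ELSE l END) AS ColA,
           (CASE WHEN cnt = 1 THEN only_b ELSE g END) AS ColB
    FROM (SELECT min(colA, colB) AS l, max(colA, colB) AS g,
                 count(DISTINCT colA) AS cnt,
                 min(colA) AS only_a, min(colB) AS only_b
          FROM t
          GROUP BY min(colA, colB), max(colA, colB)) AS pairs
"""
deduplicated = sorted(conn.execute(query).fetchall())
print(deduplicated)
```

Here (z, a) keeps its original orientation because only one version of that pair exists, while the mirrored rows (a, b) and (b, a) collapse to (a, b).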
