I have a table with 2 fields (name, interest) and I want to find all pairs that have the same interest, with all duplicates and mirrored pairs removed.
I am able to find
Assuming you do not care which pair ends up sticking around (ben,will) vs (will, ben), then my preferred solution is to do the following:
DELETE p2
FROM Pairs p1
INNER JOIN Pairs p2
on p1.Name1 = p2.Name2
and p1.Name2 = p2.Name1
and p1.Interest = p2.Interest
-- match only one of the two pairs
and p1.Name1 > p1.Name2
Because Name1 and Name2 are never equal within a row, exactly one row of each mirrored pair has Name1 greater than Name2. The last join condition uses that ordering to single out the other row as the duplicate to delete.
This is especially trivial if you have a surrogate key for the relationship, as then the requirement for Name1 and Name2 to be unequal goes away.
Edit: if you don't want to remove them from the table, but just from the results of a specific query, use the same pattern with SELECT rather than DELETE.
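To make the SELECT variant concrete, here is a minimal sketch run through Python's sqlite3 module. The sample rows and the NOT EXISTS phrasing are my own assumptions for illustration, not part of the answer above; the idea is the same: a row is dropped from the results only when its mirrored row exists and that mirrored row has Name1 > Name2.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pairs (Name1 TEXT, Name2 TEXT, Interest TEXT)")
conn.executemany(
    "INSERT INTO Pairs VALUES (?, ?, ?)",
    [
        ("ben", "will", "chess"),
        ("will", "ben", "chess"),  # mirrored duplicate of the row above
        ("amy", "zoe", "golf"),    # has no mirrored row
    ],
)

# Keep p2 unless a mirrored row p1 exists with p1.Name1 > p1.Name2,
# i.e. the same rows the DELETE would have removed are filtered out.
rows = conn.execute(
    """
    SELECT p2.Name1, p2.Name2, p2.Interest
    FROM Pairs p2
    WHERE NOT EXISTS (
        SELECT 1
        FROM Pairs p1
        WHERE p1.Name1 = p2.Name2
          AND p1.Name2 = p2.Name1
          AND p1.Interest = p2.Interest
          AND p1.Name1 > p1.Name2
    )
    ORDER BY p2.Name1
    """
).fetchall()

print(rows)  # [('amy', 'zoe', 'golf'), ('will', 'ben', 'chess')]
```

Rows without a mirrored partner are left alone, and the table itself is not modified.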
I had a similar problem and, after studying the first answer, figured out that the query below does the trick:
SELECT P1.name AS name1,P2.name AS name2,P1.interest
FROM Table AS P1,Table AS P2
WHERE P1.interest=P2.interest AND P1.name>P2.name
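Here is a small sketch of that self-join, run through Python's sqlite3 module so it can be tried directly. The table name People and the sample rows are hypothetical (I avoided Table because it is a reserved word), and I wrote the join with explicit JOIN ... ON, which should be equivalent to the comma form above.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (name TEXT, interest TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [
        ("ben", "chess"),
        ("will", "chess"),
        ("amy", "chess"),
        ("zoe", "golf"),  # nobody else likes golf, so zoe forms no pair
    ],
)

# P1.name > P2.name rules out pairing a person with themselves and
# keeps only one ordering of each pair.
pairs = conn.execute(
    """
    SELECT P1.name AS name1, P2.name AS name2, P1.interest
    FROM People AS P1
    JOIN People AS P2
      ON P1.interest = P2.interest
     AND P1.name > P2.name
    ORDER BY name1, name2
    """
).fetchall()

print(pairs)
# [('ben', 'amy', 'chess'), ('will', 'amy', 'chess'), ('will', 'ben', 'chess')]
```

Three people sharing an interest give three pairs, with no mirrored copies.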
Suppose we have a table Name with tuples:
F1 F2
Jon Smith
Smith Jon
Then, to keep only one row of this mirrored pair, we can write a query like this:
SELECT n1.F1, n1.F2
FROM Name n1
-- compare each row with its mirrored row (F1 and F2 swapped)
WHERE n1.F1 > (SELECT n2.F1
               FROM Name n2
               WHERE n1.F1 = n2.F2
                 AND n1.F2 = n2.F1)
So instead of using <> in
(SELECT * FROM Matches
WHERE name2 <> (select name1 from Matches);)
use the > or < operator and it should work fine.
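To see the difference between <> and > on the two-row example, here is a rough sketch using Python's sqlite3 module. The helper mirrored_rows is hypothetical, and I used a join on the mirrored row instead of the scalar subquery, so treat it as an illustration of the idea only.

```python
import sqlite3

# The two mirrored tuples from the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Name (F1 TEXT, F2 TEXT)")
conn.executemany(
    "INSERT INTO Name VALUES (?, ?)",
    [("Jon", "Smith"), ("Smith", "Jon")],
)


def mirrored_rows(op):
    """Return rows that have a mirrored row, filtered by n1.F1 <op> n2.F1.

    op is only ever one of the fixed strings passed below, never user input.
    """
    sql = f"""
        SELECT n1.F1, n1.F2
        FROM Name n1
        JOIN Name n2
          ON n1.F1 = n2.F2
         AND n1.F2 = n2.F1
        WHERE n1.F1 {op} n2.F1
        ORDER BY n1.F1
    """
    return conn.execute(sql).fetchall()


with_not_equal = mirrored_rows("<>")
with_greater = mirrored_rows(">")

print(with_not_equal)  # [('Jon', 'Smith'), ('Smith', 'Jon')]
print(with_greater)    # [('Smith', 'Jon')]
```

With <> both rows pass the test, because each differs from its mirror; with > only one ordering survives.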