Finding duplicate values in MySQL

执笔经年 2020-11-22 04:04

I have a table with a varchar column, and I would like to find all the records that have duplicate values in this column. What is the best query I can use to find the duplic

25 answers
  •  北恋 (original poster)
    2020-11-22 04:48

    One very late contribution... in case it helps anyone waaaaaay down the line... I had a task to find matching pairs of transactions (actually both sides of account-to-account transfers) in a banking app, to identify which ones were the 'from' and 'to' for each inter-account-transfer transaction, so we ended up with this:

    -- Outer query: put each pair in a fixed order (smaller id first)
    -- so that a pair matched in both directions collapses to one row.
    SELECT 
        LEAST(primaryid, secondaryid) AS transactionid1,
        GREATEST(primaryid, secondaryid) AS transactionid2
    FROM (
        -- Inner query: self-join to find the two sides of each transfer.
        SELECT table1.transactionid AS primaryid, 
            table2.transactionid AS secondaryid
        FROM financial_transactions table1
        INNER JOIN financial_transactions table2 
        ON table1.accountid = table2.accountid
        AND table1.transactionid <> table2.transactionid  -- never pair a row with itself
        AND table1.transactiondate = table2.transactiondate
        AND table1.sourceref = table2.destinationref
        AND table1.amount = (0 - table2.amount)  -- equal and opposite amounts
    ) AS DuplicateResultsTable
    GROUP BY transactionid1
    ORDER BY transactionid1;
    

    The inner query (DuplicateResultsTable) returns one row per matching (i.e. duplicate) pair of transactions. The catch is that it can return the same pair twice: once as (A, B) and again, with the IDs reversed, as (B, A). The outer SELECT deals with that. LEAST and GREATEST put the two transaction IDs in the same order every time, so it is safe to GROUP BY the first one, which eliminates the repeated matches. The query ran through nearly a million records and identified 12,000+ matches in just under 2 seconds. Of course, transactionid is the primary index, which really helped.
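For anyone who wants to try the idea without a MySQL server, here is a minimal sketch of the same pair-normalising trick, run from Python against an in-memory SQLite database. It is not the original query: SQLite has no LEAST/GREATEST, so the sketch uses SQLite's two-argument scalar min() and max() in their place, and it uses SELECT DISTINCT on both columns instead of GROUP BY on the first one. The column types and the sample rows are made up for illustration.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory table with a few hypothetical transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE financial_transactions (
            transactionid   INTEGER PRIMARY KEY,
            accountid       INTEGER,
            transactiondate TEXT,
            sourceref       TEXT,
            destinationref  TEXT,
            amount          INTEGER  -- assumed to be whole cents
        )
        """
    )
    rows = [
        # Rows 1 and 2 match each other in BOTH join directions.
        (1, 10, "2020-11-01", "X", "Y", 100),
        (2, 10, "2020-11-01", "Y", "X", -100),
        # Row 3 has no counterpart.
        (3, 10, "2020-11-01", "Z", "Z", 50),
        # Rows 4 and 5 match in one direction only (5 -> 4).
        (4, 20, "2020-11-02", "R", "P", 30),
        (5, 20, "2020-11-02", "P", "Q", -30),
    ]
    conn.executemany(
        "INSERT INTO financial_transactions VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def find_transfer_pairs(conn):
    """Return each matching pair once, with the smaller id first."""
    query = """
        SELECT DISTINCT
            min(t1.transactionid, t2.transactionid) AS transactionid1,
            max(t1.transactionid, t2.transactionid) AS transactionid2
        FROM financial_transactions t1
        INNER JOIN financial_transactions t2
            ON t1.accountid = t2.accountid
            AND t1.transactionid <> t2.transactionid
            AND t1.transactiondate = t2.transactiondate
            AND t1.sourceref = t2.destinationref
            AND t1.amount = -t2.amount
        ORDER BY transactionid1, transactionid2
    """
    return conn.execute(query).fetchall()


pairs = find_transfer_pairs(build_demo_connection())
print(pairs)
```

With these sample rows, the raw join yields (1, 2), (2, 1) and (5, 4); after normalising and de-duplicating, each pair appears once. In MySQL itself, the same SELECT DISTINCT form with LEAST and GREATEST should also avoid the error that GROUP BY on only the first column raises when ONLY_FULL_GROUP_BY is enabled, since the second column is neither grouped nor aggregated.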
