Delete duplicate rows in Oracle SQL, leaving the latest entries

Question

I'm using the following SQL to identify duplicates in the table 'transaction_list'. This works perfectly. Now I want to delete all duplicates from that table based on these criteria and leave only the latest entries. These can be identified by the column 'last_update'. I tried different DELETE statements but it didn't work. Any suggestions are highly appreciated.

SELECT par_num
,tran_num
,COUNT(*) AS num_duplicates
FROM transaction_list
WHERE last_update >= to_date('01-mar-2020 00:00:00', 'dd-mon-yyyy hh24:mi:ss')
GROUP BY par_num
,tran_num
HAVING COUNT(*) > 1
ORDER BY par_num

Answer 1:

Here is an approach using the row ids:

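delete from transaction_list
where 
    -- same cutoff as the SELECT in the question
    last_update >= date '2020-03-01'
    and rowid in (
        select rid
        from (
            select 
                rowid rid, 
                -- rn = 1 marks the latest row of each (par_num, tran_num) group
                row_number() over(partition by par_num, tran_num order by last_update desc) rn
            from transaction_list
        ) t
        where rn > 1
    )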

The subquery generates the list of row ids for rows that are not the latest in their group (i.e. all records having the same par_num, tran_num). The most recent record per group is identified using row_number(), which assigns it rn = 1 because the rows are ordered by last_update descending. The outer query then deletes the remaining rows.
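Before running a delete like this, it may help to preview which rows are ranked as "not the latest". The query below is only a sketch: it assumes the table and columns from the question (transaction_list, par_num, tran_num, last_update) and reuses the question's 1 March 2020 cutoff. The rowid tiebreaker in the ordering is an addition not present in the answer; without it, rows sharing the same last_update would be ranked in an arbitrary order.

```sql
-- Preview only: lists the rows ranked below the latest one in each
-- (par_num, tran_num) group. Nothing is deleted.
select par_num,
       tran_num,
       last_update,
       rn
from (
    select t.par_num,
           t.tran_num,
           t.last_update,
           -- rowid is an assumed tiebreaker for identical last_update values
           row_number() over (
               partition by t.par_num, t.tran_num
               order by t.last_update desc, t.rowid desc
           ) rn
    from transaction_list t
)
where rn > 1
  and last_update >= date '2020-03-01'
order by par_num, tran_num, last_update desc;
```

If the result matches the duplicates expected from the question's GROUP BY query, the same ranking can be used in the delete.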

Answer 2:

If the idea is to delete all par_num / tran_num duplicates except the last in each set ordered by last_update, then this should do it:

delete transaction_list
where  rowid in
       ( select lag(rowid)
                over (partition by par_num, tran_num order by last_update)
         from   transaction_list );


Explanation: lag returns a value from the previous row (or another earlier row - you can specify all kinds of offset logic if you want, but here we just want the previous row). The over() clause specifies the ordering and windowing. In this case, we want to order each set of par_num / tran_num combinations by last_update and delete the previous row. The partition by section means the ordering resets for each par_num / tran_num combination, so each group has a 'last' row that won't be deleted.
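To see what lag(rowid) produces for each row, the subquery can be run on its own with a few extra columns. This is a sketch assuming the question's transaction_list table; the aliases rid and prev_rid are illustrative names only.

```sql
-- For each row, show the rowid of the previous row in the same
-- (par_num, tran_num) group when ordered by last_update.
select rowid as rid,
       par_num,
       tran_num,
       last_update,
       lag(rowid) over (
           partition by par_num, tran_num
           order by last_update
       ) as prev_rid
from transaction_list
order by par_num, tran_num, last_update;
```

The first row of each group has a NULL prev_rid because nothing precedes it. Taken together, the non-NULL prev_rid values cover every row that has a later row in its group, which is every row except the last one. The NULLs are harmless in an IN list, since no rowid compares equal to NULL.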

Source: https://stackoverflow.com/questions/60836201/delete-duplicate-rows-in-oracle-sql-leaving-the-latest-entries

Tags

sql

Oracle

duplicates

sql-delete