Question
I'm using the following SQL to identify duplicates in the table 'transaction_list'. This works perfectly. Now I want to delete all duplicates from that table based on these criteria and leave only the latest entries. These can be identified by the column 'last_update'. I tried different DELETE statements but it didn't work. Any suggestions are highly appreciated.
SELECT par_num
,tran_num
,COUNT(*) AS num_duplicates
FROM transaction_list
WHERE last_update >= to_date('01-mar-2020 00:00:00', 'dd-mon-yyyy hh24:mi:ss')
GROUP BY par_num
,tran_num
HAVING COUNT(*) > 1
ORDER BY par_num
Answer 1:
Here is an approach using the row ids:
delete from transaction_list
where
    last_update >= date '2020-03-01'
    and rowid in (
        select rid
        from (
            select
                rowid rid,
                row_number() over(partition by par_num, tran_num order by last_update desc) rn
            from transaction_list
        ) t
        where rn > 1
    )
The subquery generates the list of row ids for rows that are not the latest in their group (i.e. all records having the same par_num, tran_num). The most recent record per group is identified using row_number(), which assigns it rn = 1. The outer query just deletes the remaining rows.
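Before running the delete, it may be worth previewing which rows it would remove. The query below is only a sketch of such a check: it reuses the same row_number() ranking as a plain select against transaction_list, and it assumes the cutoff is meant to match the >= filter from the question.

```sql
-- Preview only: lists the rows the delete would remove, without changing any data.
select rid, par_num, tran_num, last_update
from (
    select
        rowid rid,
        par_num,
        tran_num,
        last_update,
        -- rn = 1 marks the most recent row in each par_num / tran_num group
        row_number() over(partition by par_num, tran_num order by last_update desc) rn
    from transaction_list
) t
where rn > 1
  and last_update >= date '2020-03-01'
order by par_num, tran_num, last_update;
```

If the result contains only the older copies of each par_num / tran_num pair, the delete should be safe to run.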
Answer 2:
If the idea is to delete all par_num / tran_num duplicates except the last in each set ordered by last_update, then this should do it:
delete transaction_list
where rowid in
( select lag(rowid)
over (partition by par_num, tran_num order by last_update)
from transaction_list );
Explanation: lag returns a value from the previous row (or another earlier row - you can specify all kinds of offset logic if you want, but here we just want the previous row). The over() clause specifies the ordering and windowing. In this case, we want to order each set of par_num / tran_num combinations by last_update and delete the previous row. The partition by section means the ordering resets for each par_num / tran_num combination, so each group has a 'last' row that won't be deleted.
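To see why the last row in each group survives, it can help to look at what lag(rowid) returns for every row. The following is a small sketch; the table transaction_list_demo and its sample rows are hypothetical and serve only as illustration.

```sql
-- Hypothetical demo table and data, for illustration only.
create table transaction_list_demo (
    par_num     number,
    tran_num    number,
    last_update date
);

insert into transaction_list_demo values (1, 10, date '2020-03-01');
insert into transaction_list_demo values (1, 10, date '2020-03-05');
insert into transaction_list_demo values (1, 10, date '2020-03-09');
insert into transaction_list_demo values (2, 20, date '2020-03-02');

-- prev_rowid is null for the first row of each group; every non-null value
-- is the rowid of a row that has a later row after it in the same group.
select par_num,
       tran_num,
       last_update,
       rowid as this_rowid,
       lag(rowid) over (partition by par_num, tran_num order by last_update) as prev_rowid
from transaction_list_demo
order by par_num, tran_num, last_update;
```

In the (1, 10) group, the rowids of the 2020-03-01 and 2020-03-05 rows appear as prev_rowid, while the rowid of the 2020-03-09 row never does, so the delete leaves it alone. The single (2, 20) row only produces a null, and a null in the list never matches the rowid in (...) condition.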
Source: https://stackoverflow.com/questions/60836201/delete-duplicate-rows-in-oracle-sql-leaving-the-latest-entries