I\'m testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can\'t create a primary ke
1. solution
delete from emp
where rowid not in
(select max(rowid) from emp group by empno);
2. sloution
delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);
3.solution
delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );
4. solution
delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid
) rn
from emp
)
where rn > 1
);
The Fastest way for really big tables
Create exception table with structure below: exceptions_table
ROW_ID ROWID
OWNER VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
CONSTRAINT VARCHAR2(30)
Try create a unique constraint or primary key which will be violated by the duplicates. You will get an error message because you have duplicates. The exceptions table will contain the rowids for the duplicate rows.
alter table add constraint
unique --or primary key
(dupfield1,dupfield2) exceptions into exceptions_table;
Join your table with exceptions_table by rowid and delete dups
delete original_dups where rowid in (select ROW_ID from exceptions_table);
If the amount of rows to delete is big, then create a new table (with all grants and indexes) anti-joining with exceptions_table by rowid and rename the original table into original_dups table and rename new_table_with_no_dups into original table
create table new_table_with_no_dups AS (
select field1, field2 ........
from original_dups t1
where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
)
create table t2 as select distinct * from t1;
5. solution
delete from emp where rowid in
(
select rid from
(
select rowid rid,rank() over (partition by emp_id order by rowid)rn from emp
)
where rn > 1
);
Check below scripts -
1.
Create table test(id int,sal int);
2.
insert into test values(1,100);
insert into test values(1,100);
insert into test values(2,200);
insert into test values(2,200);
insert into test values(3,300);
insert into test values(3,300);
commit;
3.
select * from test;
You will see here 6-records.
4.run below query -
delete from
test
where rowid in
(select rowid from
(select
rowid,
row_number()
over
(partition by id order by sal) dup
from test)
where dup > 1)
select * from test;
You will see that duplicate records have been deleted.
Hope this solves your query.
Thanks :)
For best performance, here is what I wrote :
(see execution plan)
DELETE FROM your_table
WHERE rowid IN
(select t1.rowid from your_table t1
LEFT OUTER JOIN (
SELECT MIN(rowid) as rowid, column1,column2, column3
FROM your_table
GROUP BY column1, column2, column3
) co1 ON (t1.rowid = co1.rowid)
WHERE co1.rowid IS NULL
);