Removing duplicate rows from table in Oracle

醉话见心 2020-11-22 12:57

I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key...

22 answers
  • 2020-11-22 13:14

    1. solution

    delete from emp
        where rowid not in
        (select max(rowid) from emp group by empno);
    

    2. solution

    delete from emp where rowid in
                   (
                     select rid from
                      (
                        select rowid rid,
                          row_number() over(partition by empno order by empno) rn
                          from emp
                      )
                    where rn > 1
                   );
    

    3. solution

    delete from emp e1
             where rowid not in
              (select max(rowid) from emp e2
               where e1.empno = e2.empno ); 
    

    4. solution

    delete from emp where rowid in
        (
          select rid from
            (
              select rowid rid,
                dense_rank() over(partition by empno order by rowid) rn
              from emp
            )
          where rn > 1
        );
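
    Before running any of the deletes above, it can help to see which keys are actually duplicated. A minimal sketch, assuming (as the statements above do) that empno alone defines a duplicate; the constraint name emp_pk is a hypothetical choice for illustration:

    ```sql
    -- List each empno that occurs more than once, with its number of copies.
    -- Run it again after the delete: it should then return no rows.
    select empno, count(*) as copies
    from emp
    group by empno
    having count(*) > 1
    order by empno;

    -- Once no duplicates remain (and empno contains no NULLs), the key can be added.
    -- emp_pk is a made-up constraint name.
    alter table emp add constraint emp_pk primary key (empno);
    ```

    The deletes are not permanent until committed, so the check can be run before deciding between commit and rollback.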
    
  • 2020-11-22 13:15

    The fastest way for really big tables

    1. Create an exceptions table named exceptions_table with the structure below:

      ROW_ID ROWID
      OWNER VARCHAR2(30)
      TABLE_NAME VARCHAR2(30)
      CONSTRAINT VARCHAR2(30)
      
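    2. Try to create a unique constraint or primary key that the duplicates will violate. The statement fails with an error because of the duplicates, and the exceptions table then contains the rowids of every row involved in a violation, that is, all copies of each duplicated key, not only the extra ones.

      -- original_dups is the table being cleaned; the constraint name is a placeholder
      alter table original_dups add constraint original_dups_uk
      unique --or primary key
      (dupfield1,dupfield2) exceptions into exceptions_table;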
      
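    3. Join your table with exceptions_table by rowid and delete the dups. Because exceptions_table holds every copy of a duplicated key, exclude one rowid per key so that a single row survives:

      delete from original_dups where rowid in (select row_id from exceptions_table)
        and rowid not in (select min(rowid) from original_dups group by dupfield1, dupfield2);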
      
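    4. If the number of rows to delete is large, create a new table instead (with all grants and indexes) by anti-joining with exceptions_table on rowid, then rename the original table to original_dups and rename new_table_with_no_dups to the original name. Note that this anti-join also leaves out the copy you want to keep of each duplicated key, so one row per key still has to be inserted back from original_dups.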

      create table new_table_with_no_dups AS (
          select field1, field2 ........ 
          from original_dups t1
          where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
      )
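
    Step 1 above gives the exceptions table only as a column list. A minimal sketch of the DDL, using the column names and types from that list (the layout mirrors the EXCEPTIONS table created by Oracle's utlexcpt.sql script; check the script shipped with your release for the exact column widths), followed by a query showing how many rows were flagged per key:

    ```sql
    -- Exceptions table with the structure listed in step 1.
    create table exceptions_table (
        row_id      rowid,
        owner       varchar2(30),
        table_name  varchar2(30),
        constraint  varchar2(30)
    );

    -- After the failed ALTER TABLE ... EXCEPTIONS INTO, count flagged rows per key.
    -- original_dups, dupfield1 and dupfield2 are the placeholder names used above.
    select t.dupfield1, t.dupfield2, count(*) as flagged_rows
    from original_dups t
    join exceptions_table e on e.row_id = t.rowid
    group by t.dupfield1, t.dupfield2
    order by flagged_rows desc;
    ```

    Each duplicated key is expected to show two or more flagged rows, since every row that takes part in a violation is recorded, not only the surplus copies.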
      
  • 2020-11-22 13:16

    create table t2 as select distinct * from t1;
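
    Note that select distinct * only removes rows that are identical in every column, and that create table ... as select does not carry over indexes, grants, triggers, or most constraints. A possible way to swap the tables afterwards, assuming the names t1 and t2 from the statement above and a hypothetical backup name t1_old:

    ```sql
    -- Compare row counts before swapping anything.
    select (select count(*) from t1) as rows_before,
           (select count(*) from t2) as rows_after
    from dual;

    -- Keep the original under a backup name (t1_old is a made-up name),
    -- then put the de-duplicated copy in its place.
    rename t1 to t1_old;
    rename t2 to t1;

    -- Indexes, grants, and constraints must be recreated on the new t1 by hand.
    ```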

  • 2020-11-22 13:16

    5. solution

    delete from emp where rowid in
        (
          select rid from
            (
              select rowid rid,
                rank() over (partition by emp_id order by rowid) rn
              from emp
            )
          where rn > 1
        );
    
  • 2020-11-22 13:17

    Check the scripts below.

    1.

    Create table test(id int,sal int); 
    

    2.

        insert into test values(1,100);    
        insert into test values(1,100);    
        insert into test values(2,200);    
        insert into test values(2,200);    
        insert into test values(3,300);    
        insert into test values(3,300);    
        commit;
    

    3.

     select * from test;    
    

    You will see 6 records here.

    4. Run the query below:

    delete from 
       test
    where rowid in
     (select rowid from 
       (select 
         rowid,
         row_number()
        over 
         (partition by id order by sal) dup
        from test)
      where dup > 1);
    
    5. select * from test;

    You will see that duplicate records have been deleted.
    Hope this solves your query. Thanks :)
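
    To confirm the result of the last step, a quick check on the test table from the script above is to look for any id that still occurs more than once. With the six sample rows inserted earlier, the delete should leave one row per id:

    ```sql
    -- Any id that still has more than one row; no rows are expected after the delete.
    select id, count(*) as copies
    from test
    group by id
    having count(*) > 1;

    -- Remaining rows; expected to be (1,100), (2,200), (3,300).
    select id, sal from test order by id;

    -- The delete is not permanent until committed; rollback would undo it instead.
    commit;
    ```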

  • 2020-11-22 13:18

    For best performance, here is what I wrote
    (see execution plan):

    DELETE FROM your_table
    WHERE rowid IN
      (select t1.rowid from your_table  t1
          LEFT OUTER JOIN (
          -- ROWID is a reserved word, so the aggregate is aliased as rid
          SELECT MIN(rowid) as rid, column1,column2, column3
          FROM your_table
          GROUP BY column1, column2, column3
      )  co1 ON (t1.rowid = co1.rid)
      WHERE co1.rid IS NULL
    );
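
    The execution plan mentioned above is not reproduced here. One way to obtain a plan yourself is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. A minimal sketch, assuming the placeholder names your_table and column1 to column3 from the statement above, and aliasing the aggregate as rid (a name chosen here) because ROWID is a reserved word in Oracle:

    ```sql
    -- Store the optimizer's plan for the delete without executing it.
    explain plan for
    delete from your_table
    where rowid in (
        select t1.rowid
        from your_table t1
        left outer join (
            select min(rowid) as rid, column1, column2, column3
            from your_table
            group by column1, column2, column3
        ) co1 on (t1.rowid = co1.rid)
        where co1.rid is null
    );

    -- Show the most recently explained plan from PLAN_TABLE.
    select * from table(dbms_xplan.display);
    ```

    Explaining the other delete variants in this thread the same way gives a like-for-like comparison on your own data, which is more reliable than a general performance claim.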
    