How to delete duplicates in SQL table based on multiple fields

星月不相逢 2020-12-04 14:58

I have a table of games, which is described as follows:

+---------------+-------------+------+-----+---------+----------------+
| Field         | Type                


        
9 answers
  • 2020-12-04 15:20
    DELETE FROM tbl
     USING tbl, tbl t2
     WHERE tbl.id > t2.id
      AND t2.field = tbl.field;
    

    In your case (once a table is aliased in USING, MySQL expects the alias, not the table name, after DELETE FROM):

    DELETE FROM tbl
     USING games tbl, games t2
     WHERE tbl.id > t2.id
      AND t2.date = tbl.date
      AND t2.time = tbl.time
      AND t2.hometeam_id = tbl.hometeam_id
      AND t2.awayteam_id = tbl.awayteam_id
      AND t2.locationcity = tbl.locationcity
      AND t2.locationstate = tbl.locationstate;
    

    Reference: https://dev.mysql.com/doc/refman/5.7/en/delete.html

  • 2020-12-04 15:21

    You can try a query like this:

    DELETE FROM table_name AS t1
    WHERE EXISTS (
     SELECT 1 FROM table_name AS t2
     WHERE t2.date = t1.date
     AND t2.time = t1.time
     AND t2.hometeam_id = t1.hometeam_id
     AND t2.awayteam_id = t1.awayteam_id
     AND t2.locationcity = t1.locationcity
     AND t2.locationstate = t1.locationstate
     AND t2.id < t1.id )
    

    This deletes every row for which an identical game with a smaller id exists, so only the row with the smallest id is left for each game.
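The EXISTS approach can be tried on a throwaway database first. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real server; the column types and the sample rows are assumptions, since only the column names appear in the answers. The target table is left unaliased here and the inner copy is aliased as t2, which avoids relying on alias support in DELETE.

```python
import sqlite3

# Assumed schema: column names come from the answers, the types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE games (
        id INTEGER PRIMARY KEY,
        date TEXT, time TEXT,
        hometeam_id INTEGER, awayteam_id INTEGER,
        locationcity TEXT, locationstate TEXT
    )
""")
# Made-up sample data: rows 2 and 4 repeat row 1.
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (2, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (3, "2011-01-02", "19:00", 10, 30, "Boston", "MA"),
        (4, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
    ],
)

# Delete every row for which an identical game with a smaller id exists.
conn.execute("""
    DELETE FROM games
    WHERE EXISTS (
        SELECT 1 FROM games AS t2
        WHERE t2.date = games.date
          AND t2.time = games.time
          AND t2.hometeam_id = games.hometeam_id
          AND t2.awayteam_id = games.awayteam_id
          AND t2.locationcity = games.locationcity
          AND t2.locationstate = games.locationstate
          AND t2.id < games.id
    )
""")

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
print(remaining_ids)  # [1, 3]
```

Note that MySQL rejects a subquery that reads from the table being deleted from, so on MySQL the join-based forms in the other answers are the safer choice.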

  • 2020-12-04 15:21

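    As long as your SELECT query does not include the table's id (primary key) and the remaining columns are exactly the same, you can use SELECT DISTINCT to avoid duplicate rows in the result. This only hides the duplicates in the result set; it does not delete them from the table.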

  • 2020-12-04 15:24

    To get a list of duplicate entries matching on two fields:

    select t.ID, t.field1, t.field2
    from (
      select field1, field2
      from table_name
      group by field1, field2
      having count(*) > 1) x, table_name t
    where x.field1 = t.field1 and x.field2 = t.field2
    order by t.field1, t.field2
    

    And to delete only the duplicates, keeping the row with the smallest id in each group:

    DELETE x 
    FROM table_name x
    JOIN table_name y
    ON y.field1= x.field1
    AND y.field2 = x.field2
    AND y.id < x.id;
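
The two steps above (list the duplicates, then delete them) can be tried end to end in a scratch database. This is a rough sketch with Python's sqlite3 module; table_name, field1 and field2 are the placeholders from the answer and the sample rows are made up. Because SQLite does not accept the MySQL-style DELETE x FROM ... JOIN form, the delete step swaps in a different technique, NOT IN over MIN(id) per group, which should leave the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Placeholder table and columns from the answer; the rows are invented.
conn.execute(
    "CREATE TABLE table_name (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)"
)
conn.executemany(
    "INSERT INTO table_name (id, field1, field2) VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "a", "x"), (5, "b", "z")],
)

# Step 1: list every row that belongs to a duplicated (field1, field2) group.
duplicate_rows = conn.execute("""
    SELECT t.id, t.field1, t.field2
    FROM table_name AS t
    JOIN (
        SELECT field1, field2
        FROM table_name
        GROUP BY field1, field2
        HAVING COUNT(*) > 1
    ) AS x ON x.field1 = t.field1 AND x.field2 = t.field2
    ORDER BY t.field1, t.field2, t.id
""").fetchall()

# Step 2 (swapped-in technique): keep the lowest id per group, delete the rest.
conn.execute("""
    DELETE FROM table_name
    WHERE id NOT IN (
        SELECT MIN(id) FROM table_name GROUP BY field1, field2
    )
""")

kept_ids = [row[0] for row in conn.execute("SELECT id FROM table_name ORDER BY id")]
print(duplicate_rows)  # [(1, 'a', 'x'), (2, 'a', 'x'), (4, 'a', 'x')]
print(kept_ids)        # [1, 3, 5]
```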
    
  • 2020-12-04 15:26
    select orig.id,
           dupl.id
    from   games   orig, 
           games   dupl
    where  orig.date   =    dupl.date
    and    orig.time   =    dupl.time
    and    orig.hometeam_id = dupl.hometeam_id
    and    orig.awayteam_id = dupl.awayteam_id
    and    orig.locationcity = dupl.locationcity
    and    orig.locationstate = dupl.locationstate
    and    orig.id     <    dupl.id
    

    This should give you the duplicates as (original id, duplicate id) pairs; you can use it as a subquery to specify the IDs to delete.
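
One way to use the self-join as a subquery is to select only the duplicate side's id and feed it to DELETE ... WHERE id IN (...). The sketch below runs that against SQLite through Python's sqlite3 module; the column types and sample rows are assumptions. On MySQL the subquery would likely need to be wrapped in a derived table, because MySQL rejects a subquery that reads directly from the table being deleted from.

```python
import sqlite3

# Assumed schema: column names come from the answers, the types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE games (
        id INTEGER PRIMARY KEY,
        date TEXT, time TEXT,
        hometeam_id INTEGER, awayteam_id INTEGER,
        locationcity TEXT, locationstate TEXT
    )
""")
# Made-up sample data: rows 2 and 4 repeat row 1.
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (2, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (3, "2011-01-02", "19:00", 10, 30, "Boston", "MA"),
        (4, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
    ],
)

# The self-join pairs each row with every later identical row;
# dupl.id is always the later copy, so those are the ids to drop.
conn.execute("""
    DELETE FROM games
    WHERE id IN (
        SELECT dupl.id
        FROM games AS orig
        JOIN games AS dupl
          ON orig.date = dupl.date
         AND orig.time = dupl.time
         AND orig.hometeam_id = dupl.hometeam_id
         AND orig.awayteam_id = dupl.awayteam_id
         AND orig.locationcity = dupl.locationcity
         AND orig.locationstate = dupl.locationstate
         AND orig.id < dupl.id
    )
""")

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
print(remaining_ids)  # [1, 3]
```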

  • 2020-12-04 15:28

    You should be able to do a correlated subquery to delete the data. Find all rows that are duplicates and delete all but the one with the smallest id. For MySQL, an inner join (the functional equivalent of EXISTS) needs to be used, like so:

    delete games from games inner join 
        (select  min(id) minid, date, time,
                 hometeam_id, awayteam_id, locationcity, locationstate
         from games 
         group by date, time, hometeam_id, 
                  awayteam_id, locationcity, locationstate
         having count(1) > 1) as duplicates
       on (duplicates.date = games.date
       and duplicates.time = games.time
       and duplicates.hometeam_id = games.hometeam_id
       and duplicates.awayteam_id = games.awayteam_id
       and duplicates.locationcity = games.locationcity
       and duplicates.locationstate = games.locationstate
       and duplicates.minid <> games.id)
    

    To test, replace "delete games from games" with "select * from games". Don't just run a delete on your DB :-)
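
The dry-run advice can be sketched as follows: run the join as a SELECT, inspect the ids it returns, and only then delete. This uses Python's sqlite3 module with an assumed schema and invented rows; the final delete goes by the previewed ids because SQLite does not accept the MySQL "delete games from games inner join" form.

```python
import sqlite3

# Assumed schema: column names come from the answers, the types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE games (
        id INTEGER PRIMARY KEY,
        date TEXT, time TEXT,
        hometeam_id INTEGER, awayteam_id INTEGER,
        locationcity TEXT, locationstate TEXT
    )
""")
# Made-up sample data: rows 2 and 4 repeat row 1.
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (2, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
        (3, "2011-01-02", "19:00", 10, 30, "Boston", "MA"),
        (4, "2011-01-01", "19:00", 10, 20, "Boston", "MA"),
    ],
)

# Same join as the delete, written as a SELECT so nothing is modified.
preview_sql = """
    SELECT games.id
    FROM games
    INNER JOIN (
        SELECT MIN(id) AS minid, date, time,
               hometeam_id, awayteam_id, locationcity, locationstate
        FROM games
        GROUP BY date, time, hometeam_id,
                 awayteam_id, locationcity, locationstate
        HAVING COUNT(*) > 1
    ) AS duplicates
      ON duplicates.date = games.date
     AND duplicates.time = games.time
     AND duplicates.hometeam_id = games.hometeam_id
     AND duplicates.awayteam_id = games.awayteam_id
     AND duplicates.locationcity = games.locationcity
     AND duplicates.locationstate = games.locationstate
     AND duplicates.minid <> games.id
    ORDER BY games.id
"""
ids_to_delete = [row[0] for row in conn.execute(preview_sql)]
rows_after_preview = conn.execute("SELECT COUNT(*) FROM games").fetchone()[0]

# After checking the preview, delete exactly the previewed rows.
conn.executemany("DELETE FROM games WHERE id = ?", [(i,) for i in ids_to_delete])
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
print(ids_to_delete, remaining_ids)  # [2, 4] [1, 3]
```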
