How to use the physical location of rows (ROWID) in a DELETE statement

Posted by 血红的双手。 on 2019-11-30 23:51:01

Simplify this by one query level:

DELETE FROM table_name
WHERE  ctid NOT IN (
   SELECT min(ctid)
   FROM   table_name
   GROUP  BY $other_columns);

Here, duplicates are defined by equality in $other_columns.
The columns of the GROUP BY clause do not have to appear in the SELECT list, so no additional subquery is needed.
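
As a minimal sketch of the query above on concrete data: the dup_demo table, its columns, and its rows are hypothetical and exist only for illustration. The sketch also assumes a PostgreSQL version in which min() accepts the tid type of ctid.

```sql
-- Hypothetical demo table; names and data are illustrative only.
CREATE TEMP TABLE dup_demo (user_id int, department_id int);

INSERT INTO dup_demo VALUES
   (1, 10), (1, 10),           -- duplicate pair
   (2, 20), (2, 20), (2, 20),  -- triplicate
   (3, 30);                    -- already unique

-- Keep the row with the smallest ctid in each group, delete the rest.
DELETE FROM dup_demo
WHERE  ctid NOT IN (
   SELECT min(ctid)
   FROM   dup_demo
   GROUP  BY user_id, department_id);

-- One row per (user_id, department_id) pair should remain.
SELECT user_id, department_id FROM dup_demo ORDER BY user_id;
```

Which physical copy survives does not matter here, because the rows in each group are identical in the grouped columns.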

See ctid in the current PostgreSQL manual.

In PostgreSQL, the physical location of a row is exposed through the system column ctid.

To view it, use a query like this:

SELECT CTID FROM table_name
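
Because ctid describes where a row version is stored rather than identifying the row, it changes when the row is updated or when the table is rewritten (for example by VACUUM FULL). The following sketch, on a hypothetical temp table named ctid_demo, is one way to observe this.

```sql
-- Hypothetical demo table; names and data are illustrative only.
CREATE TEMP TABLE ctid_demo (id int, note text);
INSERT INTO ctid_demo VALUES (1, 'first'), (2, 'second');

-- ctid is a system column: * does not include it, so list it explicitly.
SELECT ctid, * FROM ctid_demo;

-- An UPDATE writes a new row version, so the ctid of row 1 changes.
UPDATE ctid_demo SET note = 'changed' WHERE id = 1;
SELECT ctid, * FROM ctid_demo;
```

For that reason, a ctid is best used within a single statement, as in the DELETE queries here, and not stored as a long-term row identifier.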

To remove duplicate records, use it in a DELETE statement like this:

DELETE FROM table_name WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, other_columns 
      FROM table_name GROUP BY other_columns) 
  a);

Here table_name is the target table and other_columns are the columns whose values define a duplicate.

For example:

DELETE FROM user_department WHERE CTID NOT IN (
  SELECT RECID FROM 
    (SELECT MIN(CTID) AS RECID, ud.user_id, ud.department_id
      FROM user_department ud GROUP BY ud.user_id, ud.department_id) 
  a);
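
Before deleting for real, it can help to check which rows the statement would remove. One possible approach, sketched here with the single-level form of the query, runs the DELETE inside a transaction, lists the removed rows with RETURNING, and then rolls back. It assumes a PostgreSQL version in which min() accepts the tid type.

```sql
BEGIN;

-- RETURNING lists every row that this DELETE removed.
DELETE FROM user_department
WHERE  ctid NOT IN (
   SELECT min(ctid)
   FROM   user_department
   GROUP  BY user_id, department_id)
RETURNING user_id, department_id;

-- Undo the deletion; use COMMIT instead once the output looks right.
ROLLBACK;
```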

Consider using row_number() if you want to choose which row to keep based on a unique id column (or a timestamp). ctid reflects only where a row is physically stored, so it is not reliable when you want to keep, for example, only the most recent record.

WITH d AS (
   SELECT ctid AS c,
          row_number() OVER (PARTITION BY s ORDER BY id) AS rn
   FROM   t
)
DELETE FROM t
WHERE  ctid IN (SELECT c FROM d WHERE rn > 1);
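
To keep the most recent row instead of the lowest id, the same pattern can order by a timestamp in descending order. This is a sketch only: the t_demo table, its created_at column, and the sample rows are assumptions added for illustration.

```sql
-- Hypothetical table: s identifies duplicates, created_at decides which row survives.
CREATE TEMP TABLE t_demo (id int, s text, created_at timestamptz);

INSERT INTO t_demo VALUES
   (1, 'a', '2019-01-01'),
   (2, 'a', '2019-06-01'),  -- newest 'a' row
   (3, 'b', '2019-03-01');  -- only 'b' row

-- rn = 1 marks the newest row per value of s; every other row is deleted.
WITH d AS (
   SELECT ctid AS c,
          row_number() OVER (PARTITION BY s ORDER BY created_at DESC) AS rn
   FROM   t_demo
)
DELETE FROM t_demo
WHERE  ctid IN (SELECT c FROM d WHERE rn > 1);

-- Rows with id 2 and 3 remain.
SELECT id, s FROM t_demo ORDER BY id;
```

If two rows in a group can share the same timestamp, adding a tiebreaker such as id to the ORDER BY makes the choice of survivor deterministic.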

