Question
I am trying to move bytea data from one table to another and update the references in a single query.
To do that, I need the INSERT to return data from its source query that is not itself inserted.
INSERT INTO file_data (data)
select image from task_log where image is not null
RETURNING id as file_data_id, task_log.id as task_log_id
But I get an error for that query:
[42P01] ERROR: missing FROM-clause entry for table "task_log"
I want to do something like:
WITH inserted AS (
INSERT INTO file_data (data)
SELECT image FROM task_log WHERE image IS NOT NULL
RETURNING id AS file_data_id, task_log.id AS task_log_id
)
UPDATE task_log
SET task_log.attachment_id = inserted.file_data_id,
task_log.attachment_type = 'INLINE_IMAGE'
FROM inserted
WHERE inserted.task_log_id = task_log.id;
But I cannot get at all the data used for the insert: I can't return the id from the subselect.
I was inspired by this answer on how to do that with Common Table Expressions but I can't find a way to make it work.
Answer 1:
You need to get your table names and aliases right. Plus, the connection between the two tables is the column image (data in the new table file_data):
WITH inserted AS (
INSERT INTO file_data (data)
SELECT image
FROM task_log
WHERE image IS NOT NULL
RETURNING id, data -- can only reference target row
)
UPDATE task_log t
SET attachment_id = i.id
, attachment_type = 'INLINE_IMAGE'
FROM inserted i
WHERE t.image = i.data;
As explained in my old answer you referenced, image must be unique in task_log for this to work:
- Insert data and set foreign keys with Postgres
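A quick way to test that precondition before running the statement could look like the sketch below. It uses only the table and column names from the question. Comparing whole bytea values can be slow for large images, so grouping on md5(image) instead is a possible variation.

```sql
-- List image values that occur more than once in task_log.
-- An empty result means the join on t.image = i.data is unambiguous.
SELECT image, count(*) AS occurrences
FROM   task_log
WHERE  image IS NOT NULL
GROUP  BY image
HAVING count(*) > 1;
```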
I added a technique for disambiguating non-unique values to the referenced answer. Not sure if you'd want duplicate images in file_data, though.
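If duplicates should be kept as separate rows, one possible approach is to allocate the new ids before the insert, so that each source row already knows its target id. The following is an untested sketch and not necessarily the technique from the referenced answer. It assumes file_data.id is a serial or bigserial column, so that pg_get_serial_sequence() can find its sequence; the CTE names sel and ins are made up for illustration.

```sql
WITH sel AS (
   -- Assumption: file_data.id is backed by a sequence (serial / bigserial).
   -- One new id is drawn per source row, so duplicate images stay distinct.
   SELECT id AS task_log_id
        , image
        , nextval(pg_get_serial_sequence('file_data', 'id')) AS file_data_id
   FROM   task_log
   WHERE  image IS NOT NULL
   )
, ins AS (
   -- Insert with the pre-allocated ids instead of the column default.
   INSERT INTO file_data (id, data)
   SELECT file_data_id, image
   FROM   sel
   )
UPDATE task_log t
SET    attachment_id   = s.file_data_id
     , attachment_type = 'INLINE_IMAGE'
FROM   sel s
WHERE  t.id = s.task_log_id;
```

Here the join goes through task_log.id rather than the image value, so image does not need to be unique.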
In the RETURNING clause of an INSERT you can only reference columns from the inserted row. The manual:
The optional RETURNING clause causes INSERT to compute and return value(s) based on each row actually inserted (...) However, any expression using the table's columns is allowed.
Bold emphasis mine.
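To illustrate the rule, here is a small sketch using the tables from the question. The alias size_bytes is only an example name.

```sql
-- Allowed: expressions over columns of the target table file_data.
INSERT INTO file_data (data)
SELECT image
FROM   task_log
WHERE  image IS NOT NULL
RETURNING id, octet_length(data) AS size_bytes;

-- Not allowed: RETURNING task_log.id
-- task_log is only visible inside the SELECT, which leads to the error
-- from the question: missing FROM-clause entry for table "task_log"
```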
Fold duplicate source values
If you want distinct entries in the target table of the INSERT (file_data), all you need in this case is DISTINCT in the initial SELECT:
WITH inserted AS (
INSERT INTO file_data (data)
SELECT DISTINCT image -- fold duplicates
FROM task_log
WHERE image IS NOT NULL
RETURNING id, data -- can only reference target row
)
UPDATE task_log t
SET attachment_id = i.id
, attachment_type = 'INLINE_IMAGE'
FROM inserted i
WHERE t.image = i.data;
The resulting file_data.id is used multiple times in task_log. Be aware that multiple rows in task_log now point to the same image in file_data. Careful with updates and deletes ...
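Before updating or deleting a row in file_data, it may help to see which rows are shared. A minimal sketch, assuming the attachment_id column from the statements above:

```sql
-- file_data rows that are referenced by more than one task_log row.
SELECT attachment_id, count(*) AS referencing_rows
FROM   task_log
WHERE  attachment_id IS NOT NULL
GROUP  BY attachment_id
HAVING count(*) > 1;
```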
Answer 2:
I needed to replicate duplicates so I ended up adding a temp column for the id of the used data row.
alter table file_data add column task_log_id bigint;
-- insert & update data
alter table file_data drop column task_log_id;
The full move script was:
-- A new table for any file data
CREATE TABLE file_data (
id BIGSERIAL PRIMARY KEY,
data bytea
);
-- Move data from task_log to file_data
-- Create new columns to reference file_data
alter table task_log add column attachment_type VARCHAR(50);
alter table task_log add column attachment_id bigint REFERENCES file_data;
-- add a temp column for the task_id used for the insert
alter table file_data add column task_log_id bigint;
-- insert data into file_data and set references
with inserted as (
INSERT INTO file_data (data, task_log_id)
select image, id from task_log where image is not null
RETURNING id, task_log_id
)
UPDATE task_log
SET attachment_id = inserted.id,
attachment_type = 'INLINE_IMAGE'
FROM inserted
where inserted.task_log_id = task_log.id;
-- delete the temp column
alter table file_data drop column task_log_id;
-- delete task_log images
alter table task_log drop column image;
As this produces some dead data I ran a vacuum full afterwards to clean up.
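The cleanup step might look like the sketch below. Which tables were vacuumed is not stated above, so running it on both tables is an assumption. Note that VACUUM FULL rewrites the table under an exclusive lock and cannot be run inside a transaction block.

```sql
-- Assumption: both tables touched by the move script are cleaned up.
VACUUM FULL file_data;
VACUUM FULL task_log;
```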
But please let me repeat the warning from @ErwinBrandstetter:
Performance is much worse than for the method using a serial number I proposed in the linked answer. Adding & removing a column requires owner's privileges, a full table rewrite and exclusive locks on the table, which is poison for concurrent access.
Source: https://stackoverflow.com/questions/47202078/return-data-from-subselect-used-in-insert-in-a-common-table-expression