I have a CSV file with an integer column in which missing values are saved as "" (empty string).
I want to COPY them into a table as NULL values.
With Java code, I have tried
Since Postgres 9.4 you can use FORCE_NULL. For the listed columns, this causes a quoted empty string to be converted into NULL. Very handy, especially with CSV files.
The syntax is as follows:
COPY table FROM '/path/to/file.csv' WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL (columnname));
Further details are explained in the documentation: https://www.postgresql.org/docs/current/sql-copy.html
I assume you are aware that numeric data types have no concept of "empty string" (''). It's either a number or NULL (or 'NaN' for numeric, but not for integer et al.)
Looks like you exported from a string data type like text and had some actual empty strings in there, which are now represented as "" (" being the default QUOTE character in CSV format).
NULL would be represented by nothing, not even quotes. The manual:
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.
You cannot define "" to generally represent NULL since that already represents an empty string. That would be ambiguous.
To fix, I see two options:
1. Edit the CSV file / stream before feeding it to COPY and replace "" with nothing. Might be tricky if you have actual empty strings in there as well, or "" escaping a literal " inside strings.
2. (What I would do.) Import to an auxiliary temporary table with identical structure except for the integer column converted to text. Then INSERT (or UPSERT?) to the target table from there, converting the integer value properly on the fly:
-- empty temp table with identical structure
CREATE TEMP TABLE tbl_tmp AS TABLE tbl LIMIT 0;
-- ... except for the int / text column
ALTER TABLE tbl_tmp ALTER col_int TYPE text;
COPY tbl_tmp ...;
INSERT INTO tbl -- identical number and names of columns guaranteed
SELECT col1, col2, NULLIF(col_int, '')::int -- list all columns in order here
FROM tbl_tmp;
Temporary tables are dropped at the end of the session automatically. If you run this multiple times in the same session, either just truncate the existing temp table or drop it after each transaction.
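For the other option (editing the CSV stream before feeding it to COPY), here is a minimal Python sketch. The function name `unquote_empty_fields` and its parameters are hypothetical, made up for illustration. It assumes the default CSV convention where a literal quote inside a string is escaped by doubling it, and that the input fits in memory. It only touches fields in the chosen column positions that consist of exactly "", so genuine empty strings in text columns and "" escapes inside longer strings are left alone.

```python
def unquote_empty_fields(text, columns, delimiter=",", quote='"'):
    """Rewrite quoted empty fields ("") as unquoted empty fields.

    `columns` is a set of 0-based column positions to rewrite
    (hypothetical parameter, e.g. the position of the integer column).
    """
    out = []          # pieces of the rewritten CSV text
    field = []        # raw characters of the current field, quotes included
    col = 0           # 0-based position of the current field in its row
    in_quotes = False

    def flush():
        raw = "".join(field)
        # Exactly two quote characters means a quoted empty string.
        if col in columns and raw == quote * 2:
            raw = ""
        out.append(raw)
        field.clear()

    for ch in text:
        if ch == quote:
            # A doubled quote inside a quoted string toggles twice,
            # so the state is still correct afterwards.
            in_quotes = not in_quotes
            field.append(ch)
        elif ch == delimiter and not in_quotes:
            flush()
            out.append(ch)
            col += 1
        elif ch in "\r\n" and not in_quotes:
            flush()
            out.append(ch)
            if ch == "\n":
                col = 0
        else:
            field.append(ch)
    flush()
    return "".join(out)


sample = 'id;name;qty\n1;"";""\n2;"a ""b""";"5"\n'
# Only the third column (position 2) is the integer column here.
print(unquote_empty_fields(sample, columns={2}, delimiter=";"))
```

The rewritten text can then be fed to COPY in CSV format, where the unquoted empty field is read as NULL by default.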