psql import .csv - Double Quoted fields and Single Double Quote Values

Posted by 放荡痞女 on 2021-02-11 17:39:37

Question


Hello Stack Overflowers,

Weird question. I am having trouble importing a .csv file with psql's \copy command.

The .csv is comma-delimited, and fields that contain commas are wrapped in double quotes. The problem is that some fields contain a single double quote used as an inch mark. In the example below, psql treats the bottom two rows as one field.

I can't find a way to make this import correctly. I would rather not change the file itself and only adjust my psql command.

Ex:
number, number, description  (Headers)
123,124,"description, description"
123,124,description, TV 55"
123,124,description, TV 50"
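
One way to picture why the last two rows merge, as a simplified model and not PostgreSQL's actual parser: assume every double quote toggles an "inside quotes" state, and a newline only ends a record while outside quotes. The helper name `split_records` is made up for this illustration.

```python
def split_records(text, quote='"'):
    """Split text into records, treating newlines inside quotes as data.

    Simplified model (an assumption, not PostgreSQL's real code):
    each quote character flips the in_quote flag.
    """
    records, current, in_quote = [], [], False
    for ch in text:
        if ch == quote:
            in_quote = not in_quote
        if ch == "\n" and not in_quote:
            # Newline outside quotes: the record is complete.
            records.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        records.append("".join(current))
    return records


sample = (
    "number, number, description\n"
    '123,124,"description, description"\n'
    '123,124,description, TV 55"\n'
    '123,124,description, TV 50"\n'
)

records = split_records(sample)
print(len(records))      # 3 records for 4 lines
print(repr(records[2]))  # the two TV rows joined by an embedded newline
```

Under this model the inch mark on the 55" row opens a quoted section that is only closed by the inch mark on the 50" row, so the newline between them is read as data.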

Command Ex:
\copy table FROM 'C:\Users\Desktop\folder\file.csv' CSV HEADER
\copy table FROM 'C:\Users\Desktop\folder\file.csv' WITH CSV HEADER QUOTE '"' ESCAPE '\' 

I've noticed that saving the file from Excel fixes the issue. Excel formats the records like this:

number, number, description  (Headers)
123,124,"description, description"
123,124,"description, TV 55"""
123,124,"description, TV 50"""

I don't want to save from Excel, though, because it converts some of my numbers to scientific notation and drops leading zeros as soon as the file is opened.


Answer 1:


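It's an ugly hack, but you can import each line whole into a single-column table with \copy table from '/path/to/file' CSV quote e'\x01' delimiter e'\x02' and then fix it in SQL with regex functions. The quote and delimiter are set to control characters that should never appear in the file, so nothing gets split or unquoted during the import. This is only workable with reasonably small CSVs, since the data is stored twice: once in the single-column table and once in the fixed table.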

testdb=# create table import_data(t text);
CREATE TABLE
testdb=# \! cat /tmp/oof.csv
num0,num1,descrip
123,124,"description, description"
123,124,description, TV 55"
123,124,"description, TV 50""
testdb=# \copy import_data from /tmp/oof.csv csv header quote e'\x01' delimiter e'\x02'
COPY 3
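testdb=# CREATE TABLE fixed AS
SELECT
  -- the first two comma-separated pieces are the numeric columns
  (regexp_split_to_array(t, ','))[1] num1,
  (regexp_split_to_array(t, ','))[2] num2,
  regexp_replace(
        -- innermost: keep everything after the second comma;
        -- middle: strip the first pair of double quotes, if there is one
        regexp_replace(regexp_replace(t, '([^,]+,[^,]+),(.*)', '\2'),
                       '"(.*?)"', '\1'),
        -- outermost: as written, this pass returns its input unchanged
        '(.*)(")?', '\1\2') as descrip
FROM import_data;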
SELECT 3
testdb=# select * from fixed;
 num1 | num2 |         descrip          
------+------+--------------------------
 123  | 124  | description, description
 123  | 124  | description, TV 55"
 123  | 124  | description, TV 50"
(3 rows)
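
If touching the file turns out to be acceptable after all, the same idea can be sketched outside the database. This is a minimal sketch, assuming the first two fields never contain commas or quotes, and that a description wrapped in quotes has no doubled quotes inside it. The names `repair_line` and `repair_csv` are hypothetical. It rewrites each line into the doubled-quote form that Excel produced in the question, but handles every value as text, so leading zeros and long numbers are left alone.

```python
import csv
import io


def repair_line(line):
    """Split on the first two commas only; the rest is the description."""
    first, second, descrip = line.split(",", 2)
    # Strip one wrapping pair of quotes, mirroring the regex in the answer.
    if len(descrip) >= 2 and descrip.startswith('"') and descrip.endswith('"'):
        descrip = descrip[1:-1]
    return [first, second, descrip]


def repair_csv(text):
    """Return the text re-written as standard CSV with doubled quotes."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")  # doubles embedded quotes
    for line in text.splitlines():
        if line:  # skip blank lines
            writer.writerow(repair_line(line))
    return out.getvalue()


raw = (
    "num0,num1,descrip\n"
    '123,124,"description, description"\n'
    '123,124,description, TV 55"\n'
    '123,124,"description, TV 50""\n'
)

fixed = repair_csv(raw)
print(fixed)
```

The repaired text could then be saved to a new file and loaded with the plain \copy ... CSV HEADER command from the question.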


Source: https://stackoverflow.com/questions/60119223/psql-import-csv-double-quoted-fields-and-single-double-quote-values
