I'm running PostgreSQL 9.2.6 on OS X 10.6.8. I would like to import data from a CSV file with column headers into a database. I can do this with the COPY statement, but only if I first manually create a table with a column for each column in the CSV file. Is there any way to automatically create this table based on the headers in the CSV file?
I haven't used it, but pgLoader (https://pgloader.io/) is recommended by the pgfutter developers (see answer above) for more complicated problems. It looks very capable.
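I haven't tested this myself, but a minimal pgloader command file for a CSV import might look something like the sketch below; the file name, table name, and connection details are placeholders, and the exact clauses should be checked against the pgloader documentation:
LOAD CSV
     FROM 'myFile.csv'
     INTO postgresql://postgres:mySecretPassword@localhost:5432/myDatabase?myTable
     WITH csv header,
          fields optionally enclosed by '"',
          fields terminated by ',';
Save it as, say, myFile.load and run pgloader myFile.load. As far as I know, for plain CSV sources the target table must already exist; pgloader creates tables for you only with database sources such as MySQL.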
For a single table, I did it very simply and quickly online, through one of the many good converters that can be found on the web. Just google "convert csv to sql online" and choose one.
There is a very good tool that imports tables into Postgres from a CSV file. It is a command-line tool called pgfutter (with binaries for Windows, Linux, etc.). One of its big advantages is that it recognizes the attribute/column names as well.
The usage of the tool is simple. For example, if you'd like to import myCSVfile.csv:
pgfutter --db "myDatabase" --port "5432" --user "postgres" --pw "mySecretPassword" csv myCSVfile.csv
This will create a table (called myCSVfile) with the column names taken from the CSV file's header. Additionally, the data types will be identified from the existing data.
A few notes: The command pgfutter varies depending on the binary you use; e.g. it could be pgfutter_windows_amd64.exe (rename it if you intend to use this command frequently). The above command has to be executed in a command-line window (e.g. in Windows, run cmd and ensure pgfutter is accessible). If you'd like a different table name, add --table "myTable"; to select a particular database schema, use --schema "mySchema"; and in case you are accessing an external database, use --host "myHostDomain".
A more elaborate example of pgfutter to import myFile into myTable is this one:
pgfutter --host "localhost" --port "5432" --db "myDB" --schema "public" --table "myTable" --user "postgres" --pw "myPwd" csv myFile.csv
Most likely you will change a few data types (from text to numeric) after the import:
alter table myTable
alter column myColumn type numeric
using (trim(myColumn)::numeric);
You can't find anything in the COPY documentation because COPY cannot create a table for you. You need to do that before you can COPY to it.
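For example (a minimal sketch; the table definition, column types, and path are made up and have to match your actual CSV), you would first create the table by hand and then let COPY read the file:
create table myTable (
    id integer,
    name text,
    value numeric
);
copy myTable from '/path/to/myCSVfile.csv' with (format csv, header true);
The header option makes COPY skip the first line instead of trying to insert it as data.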
There is a second approach, which I found here (from mmatt). Basically, you call a function within Postgres (the last argument specifies the number of columns).
select load_csv_file('myTable','C:/MyPath/MyFile.csv',24)
Here is mmatt's function code, which I had to modify slightly, because I am working on the public schema. (copy&paste into PgAdmin SQL Editor and run it to create the function)
CREATE OR REPLACE FUNCTION load_csv_file(
target_table text,
csv_path text,
col_count integer)
RETURNS void AS
$BODY$
declare
iter integer; -- dummy integer to iterate columns with
col text; -- variable to keep the column name at each iteration
col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet
begin
set schema 'public';
create table temp_table ();
-- add just enough number of columns
for iter in 1..col_count
loop
execute format('alter table temp_table add column col_%s text;', iter);
end loop;
-- copy the data from csv file
execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);
iter := 1;
col_first := (select col_1 from temp_table limit 1);
-- update the column names based on the first row which has the column names
for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
loop
execute format('alter table temp_table rename column col_%s to %I', iter, col);
iter := iter + 1;
end loop;
-- delete the header row from the data
execute format('delete from temp_table where %I = %L', col_first, col_first);
-- change the temp table name to the name given as parameter, if not blank
if length(target_table) > 0 then
execute format('alter table temp_table rename to %I', target_table);
end if;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION load_csv_file(text, text, integer)
OWNER TO postgres;
Note: There is a common issue with importing text files related to encoding. The CSV file should be in UTF-8 format. However, sometimes the programs that try to do the encoding don't quite achieve this. I have overcome this issue by opening the file in Notepad++ and converting it to ANSI and back to UTF-8.
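Alternatively, COPY itself can be told the file's encoding instead of converting the file (a sketch; 'WIN1252' is just an example source encoding, adjust it to whatever your file actually uses):
copy myTable from '/path/to/myFile.csv' with (format csv, header true, encoding 'WIN1252');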
I achieved it with these steps:
1. Convert the file to UTF-8:
iconv -f ISO-8859-1 -t UTF-8 file.txt -o file.csv
2. Create a script that generates the SQL (saved as importar.py, the name used in step 3):
#!/usr/bin/env python3
import os
# pip install python-slugify
from slugify import slugify

origem = 'file.csv'
destino = 'file.sql'
arquivo = os.path.abspath(origem)

with open(origem, 'r') as f:
    # read the header line and slugify each cell into a valid column name
    header = f.readline().split(';')
    head_cells = []
    for cell in header:
        value = slugify(cell, separator="_")
        if value in head_cells:
            # de-duplicate repeated header names
            value = value + '_2'
        head_cells.append(value)

# one "column_name text" entry per header cell
fields = []
for cell in head_cells:
    fields.append(" {} text".format(cell))

# table name is the file name without extension
table = origem.split('.')[0]
sql = "create table {} (\n{}\n);".format(table, ",\n".join(fields))
sql += "\nCOPY {} FROM '{}' DELIMITER ';' CSV HEADER;".format(table, arquivo)
print(sql)

with open(destino, 'w') as d:
    d.write(sql)
3. Run the script with:
python3 importar.py
Optional: edit the generated SQL script to adjust the field types (all are text by default; see the sketch after these steps).
4. Load the SQL into the database:
sudo -H -u postgres bash -c "psql mydatabase < file.sql"
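For that optional step, the adjustment looks like the ALTER shown earlier in this thread (a sketch; my_column is a hypothetical column name, and the table name here is file because the script derives it from file.csv):
alter table file
alter column my_column type integer
using (trim(my_column)::integer);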