Can I automatically create a table in PostgreSQL from a csv file with headers?

Asked by 一整个雨季 on 2020-11-30 00:22

I'm running PostgreSQL 9.2.6 on OS X 10.6.8. I would like to import data from a CSV file with column headers into a database. I can do this with the COPY statement, but only if I first manually create a table with a column for each column in the CSV file. Is there a way to create the table automatically from the file's headers?

8 Answers
  • I haven't used it myself, but pgLoader (https://pgloader.io/) is recommended by the pgfutter developers (see the pgfutter answer below) for more complicated problems. It looks very capable.

  • 2020-11-30 00:47

    For a single table, I did it very simply and quickly through one of the many good CSV-to-SQL converters available on the web. Just google "convert csv to sql online" and pick one.

  • 2020-11-30 00:50

    There is a very good tool that imports tables into Postgres from a csv file: a command-line tool called pgfutter (with binaries for Windows, Linux, etc.). One of its big advantages is that it also recognizes the attribute/column names.

    The usage of the tool is simple. For example if you'd like to import myCSVfile.csv:

    pgfutter --db "myDatabase" --port "5432" --user "postgres" --pw "mySecretPassword" csv myCSVfile.csv
    

    This will create a table (called myCSVfile) with the column names taken from the csv file's header. Additionally the data types will be identified from the existing data.

    A few notes: the pgfutter command varies depending on the binary you use, e.g. it could be pgfutter_windows_amd64.exe (rename it if you intend to use the command frequently). The command has to be executed in a command-line window (e.g. on Windows, run cmd and ensure pgfutter is accessible on the PATH). If you'd like a different table name, add --table "myTable"; to select a particular database schema, use --schema "mySchema". If you are accessing an external database, use --host "myHostDomain".

    A more elaborate example of pgfutter to import myFile into myTable is this one:

    pgfutter --host "localhost" --port "5432" --db "myDB" --schema "public" --table "myTable" --user "postgres" --pw "myPwd" csv myFile.csv
    

    Most likely you will change a few data types (from text to numeric) after the import:

    alter table myTable
      alter column myColumn type numeric
        using (trim(myColumn)::numeric);
    
  • 2020-11-30 00:51

    You can't find anything in the COPY documentation, because COPY cannot create a table for you.
    You need to do that before you can COPY to it.
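
    Since COPY cannot create the table, the minimal two-step workflow looks like this (table name, columns, and path here are placeholders):

    ```sql
    -- 1. create the target table by hand, one column per CSV column
    CREATE TABLE my_table (
        id   integer,
        name text
    );

    -- 2. load the file; HEADER makes COPY skip the first line of the CSV
    COPY my_table FROM '/path/to/myFile.csv' WITH (FORMAT csv, HEADER true);
    ```

    Note that server-side COPY reads the file as the database server process; from psql, the client-side \copy variant takes the same options and reads the file from the client machine instead.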

  • 2020-11-30 00:52

    There is a second approach (originally from mmatt): you call a function within Postgres, where the last argument specifies the number of columns.

    select load_csv_file('myTable','C:/MyPath/MyFile.csv',24)
    

    Here is mmatt's function code, which I had to modify slightly because I am working in the public schema. (Copy & paste it into the PgAdmin SQL editor and run it to create the function.)

    CREATE OR REPLACE FUNCTION load_csv_file(
        target_table text,
        csv_path text,
        col_count integer)
      RETURNS void AS
    $BODY$
    
    declare
    
    iter integer; -- dummy integer to iterate columns with
    col text; -- variable to keep the column name at each iteration
    col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet
    
    begin
        set schema 'public';
    
        create table temp_table ();
    
        -- add just enough number of columns
        for iter in 1..col_count
        loop
            execute format('alter table temp_table add column col_%s text;', iter);
        end loop;
    
        -- copy the data from csv file
        execute format('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_path);
    
        iter := 1;
        col_first := (select col_1 from temp_table limit 1);
    
        -- update the column names based on the first row, which holds the header
        for col in execute format('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
        loop
            -- %I quotes the new name as an identifier, so headers with spaces or mixed case survive
            execute format('alter table temp_table rename column col_%s to %I', iter, trim(col));
            iter := iter + 1;
        end loop;
    
        -- delete the header row
        execute format('delete from temp_table where %I = %L', trim(col_first), col_first);
    
        -- change the temp table name to the name given as parameter, if not blank
        if length(target_table) > 0 then
            execute format('alter table temp_table rename to %I', target_table);
        end if;
    
    end;
    
    $BODY$
      LANGUAGE plpgsql VOLATILE
      COST 100;
    ALTER FUNCTION load_csv_file(text, text, integer)
      OWNER TO postgres;
    

    Note: there is a common encoding issue when importing text files. The csv file should be UTF-8 encoded, but the program that exports it does not always get this right. I have worked around the issue by opening the file in Notepad++ and converting it to ANSI and back to UTF-8.
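
    Alternatively, COPY itself accepts an ENCODING option (available since PostgreSQL 9.1), so a file in another encoding can be declared at load time instead of being re-encoded; table name and path here are placeholders:

    ```sql
    -- tell COPY the file's encoding rather than converting the file itself
    COPY my_table FROM '/path/to/myFile.csv'
    WITH (FORMAT csv, HEADER true, ENCODING 'LATIN1');
    ```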

  • 2020-11-30 00:55

    I achieved it with these steps:

    1. Convert the csv file to utf8
        iconv -f ISO-8859-1 -t UTF-8 file.txt -o file.csv
    
    2. Use this Python script to generate the SQL that creates the table and copies the data:
    #!/usr/bin/env python3
    import os
    # pip install python-slugify
    from slugify import slugify

    origem = 'file.csv'
    destino = 'file.sql'
    arquivo = os.path.abspath(origem)

    with open(origem, 'r') as f:
        # turn the header line into safe, unique column names
        header = f.readline().split(';')
        head_cells = []
        for cell in header:
            value = slugify(cell, separator="_")
            if value in head_cells:  # avoid duplicate column names
                value = value + '_2'
            head_cells.append(value)

    fields = []
    for cell in head_cells:
        fields.append(" {} text".format(cell))
    table = origem.split('.')[0]
    sql = "create table {} ( \n {} \n);".format(table, ",\n".join(fields))
    sql += "\n COPY {} FROM '{}' DELIMITER ';' CSV HEADER;".format(table, arquivo)

    print(sql)
    with open(destino, 'w') as d:
        d.write(sql)
    
    

    3. Run the script with

    python3 importar.py
    

    Optional: Edit the sql script to adjust the field types (all are text by default)

    4. Run the sql script. From the console:
    sudo -H -u postgres bash -c "psql mydatabase < file.sql" 
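
    The header-to-DDL idea in step 2 can be sketched with only the standard library; `create_table_sql` below is a hypothetical helper (the real script above uses slugify for more robust column names):

    ```python
    import csv
    import io

    def create_table_sql(table_name, csv_text, delimiter=','):
        """Build a CREATE TABLE statement with one text column per CSV header cell."""
        reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
        header = next(reader)  # first row = column names
        cols = ",\n".join(
            '    {} text'.format(cell.strip().lower().replace(' ', '_'))
            for cell in header
        )
        return 'create table {} (\n{}\n);'.format(table_name, cols)

    print(create_table_sql('people', 'Name;Birth Date;City', delimiter=';'))
    ```

    All columns come out as text, matching the script above; types can be adjusted afterwards with ALTER TABLE.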
    