问题
I'm trying to export data from Green-plum to a text file(client) with pipe delimiter using PSQL and \copy. In the output i see single slash is converted to double slash and tab is converted \t. Example N\A is converted to N\\A
So how to get just N\A instead N\\A and just spaces instead of \t ?
Note: i`m allowed to use only \copy. Since my file is huge im getting space issue while use SED or Perl for find and replace
回答1:
Assuming you don't have any "^" characters, you could use that as the escape character.
copy tpcds.call_center to stdout with delimiter '|' escape '^';
More on copy can be found here: https://www.postgresql.org/docs/8.2/static/sql-copy.html
This technique will be relatively slow and put a burden on the Master. If you used gpfdist instead, you could leverage the parallelism in the cluster and bypass the master. This solution is ideal for unloading large amounts of data.
First, start the gpfidst process:
[gpadmin@gpdbsne ~]$ gpfdist -p 8888 > gpfdist_8888.log 2>&1 < gpfdist_8888.log &
[1] 2255
Now, you can create the external table.
[gpadmin@gpdbsne ~]$ psql
SET
Timing is on.
psql (8.2.15)
Type "help" for help.
gpadmin=# create writable external table tpcds.et_call_center
(like tpcds.call_center)
location ('gpfdist://gpdbsne:8888/call_center.txt')
format 'text' (delimiter '|' escape '^');
NOTICE: Table doesn't have 'distributed by' clause, defaulting to distribution columns from LIKE table
CREATE EXTERNAL TABLE
Time: 18.681 ms
Now, you insert the data:
gpadmin=# insert into tpcds.et_call_center select * from tpcds.call_center;
INSERT 0 6
Time: 72.653 ms
gpadmin=# \q
Verify:
[gpadmin@gpdbsne ~]$ wc -l call_center.txt
6 call_center.txt
In my example, I used the hostname "gpdbsne" which is accessible to all segments in this cluster. Typically, Greenplum uses a private network for communication between segments so this hostname will need to be connected to the private network.
Since the writable external table is written to with SQL, you can use whatever transformation logic you want in the SQL so you can change tabs to spaces if you want. This eliminates the need for awk or sed for post processing the files. Copy can use SQL too but like I said, it is a slower than using writable external tables.
来源:https://stackoverflow.com/questions/38209612/greenplum-to-file-using-psql