Question
I am trying to export my Hive table as a CSV file from beeline. When I run the command !sql select * from database1 > /user/bob/output.csv
it gives me a syntax error.
I have successfully connected to the database at this point using the command below. The query outputs the correct results on the console.
beeline -u 'jdbc:hive2://[databaseaddress]' --outputformat=csv
Also, it is not clear to me where the file ends up. Should the path be a file path in HDFS?
Answer 1:
When the Hive version is at least 0.11.0, you can execute:
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/directoryWhereToStoreData'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
SELECT * FROM yourTable;
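from hive/beeline to store the table in a directory on the local filesystem. Note that when the statement is run through beeline, "local" means the filesystem of the HiveServer2 host, not the machine where beeline runs.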
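Hive writes the result of INSERT OVERWRITE LOCAL DIRECTORY as one or more part files (with names such as 000000_0) inside that directory, not as a single CSV file. If you need one file, something like the sketch below might help. It assumes the directory contains only the part files, and merge_hive_parts is a made-up helper name, not part of Hive:

```shell
# Concatenate the part files Hive wrote into a single CSV file.
# merge_hive_parts is a hypothetical helper; both arguments are placeholders.
merge_hive_parts() {
    src_dir=$1
    out_file=$2
    : > "$out_file"                    # start with an empty output file
    for part in "$src_dir"/*; do       # the glob skips hidden files such as .crc checksums
        [ -f "$part" ] || continue     # skip subdirectories and an unmatched glob
        cat "$part" >> "$out_file"
    done
}
```

You would call it as merge_hive_parts /tmp/directoryWhereToStoreData /tmp/yourTable.csv. Keep the output file outside the source directory, otherwise the loop would pick it up as well.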
Alternatively, with beeline, save your SELECT query in yourSQLFile.sql and run:
beeline -u 'jdbc:hive2://[databaseaddress]' --outputformat=csv2 -f yourSQLFile.sql > theFileWhereToStoreTheData.csv
This also stores the result in a file on the local filesystem, because the redirection is done by the shell on the machine where beeline runs.
From hive, to store the data somewhere in HDFS:
CREATE EXTERNAL TABLE output
LIKE yourTable
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 'hdfs://WhereDoYou/Like';
INSERT OVERWRITE TABLE output SELECT * from yourTable;
then you can collect the data into a single local file using (the last argument is the local destination file):
hdfs dfs -getmerge /WhereDoYou/Like theFileWhereToStoreTheData.csv
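This is another option to get the data using beeline only. Setting disable.quoting.for.sv=false keeps quoting enabled, so values that contain the delimiter are wrapped in double quotes: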
env HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline -u "jdbc:hive2://your.hive.server.address:10000/" --incremental=true --outputformat=csv2 -e "select * from yourdatabase.yourtable"
Tested on:
Connected to: Apache Hive (version 1.1.0-cdh5.10.1)
Driver: Hive JDBC (version 1.1.0-cdh5.10.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.10.1 by Apache Hive
Answer 2:
You can use this command to save the output in CSV format from beeline:
beeline -u 'jdbc:hive2://bigdataplatform-dev.nam.nsroot.net:10000/;principal=hive/bigdataplatform-dev.net@NAMUXDEV.NET;ssl=true' --outputformat=csv2 --verbose=false --fastConnect=true --silent=true -f $query_file>out.csv
Save your SQL query in the file that $query_file points to.
The result will be in out.csv.
I have a complete example here: hivehoney
Answer 3:
The following worked for me:
hive --silent=true --verbose=false --outputformat=csv2 -e "use <db_name>; select * from <table_name>" > table_name.csv
One advantage over using beeline is that you don't have to provide a hostname or user/password if you are running on a Hive node.
When some of the columns contain string values with commas, TSV (tab-separated) works better:
hive --silent=true --verbose=false --outputformat=tsv -e "use <db_name>; select * from <table_name>" > table_name.tsv
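If you still need a .csv file at the end, one possibility is to convert the TSV afterwards and quote only the fields that need it. This is just a sketch: tsv_to_csv is a made-up helper name, and it assumes the values contain no embedded tabs or newlines:

```shell
# Read tab-separated text on stdin and write CSV to stdout.
# tsv_to_csv is a hypothetical helper, not a Hive or beeline command.
tsv_to_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[",]/) {
                gsub(/"/, "\"\"", $i)   # double any embedded double quotes
                $i = "\"" $i "\""       # wrap the field in double quotes
            }
        }
        $1 = $1                          # force awk to rebuild the line with OFS
        print
    }'
}
```

For example: tsv_to_csv < table_name.tsv > table_name.csv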
Source: https://stackoverflow.com/questions/39577036/export-as-csv-in-beeline-hive