How do I output the results of a HiveQL query to CSV?

独厮守ぢ 2020-11-27 10:11

We would like to write the results of a Hive query to a CSV file. I thought the command should look like this:

insert overwrite directory '/home/output.csv'


        
18 Answers
  • 2020-11-27 10:53

    This shell command writes the query output in CSV format to output.txt, without the column headers.

    $ hive --outputformat=csv2 -f 'hivedatascript.hql' --hiveconf hive.cli.print.header=false > output.txt
    
  • 2020-11-27 10:54

    To cover the steps that follow after running the query: INSERT OVERWRITE LOCAL DIRECTORY '/home/lvermeer/temp' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' select books from table;

    In my case, the files generated under the temp folder are deflate-compressed and look like this:

    $ ls
    000000_0.deflate  
    000001_0.deflate  
    000002_0.deflate  
    000003_0.deflate  
    000004_0.deflate  
    000005_0.deflate  
    000006_0.deflate  
    000007_0.deflate
    

    Here's the command to decompress the deflate files and combine everything into one CSV file:

    hadoop fs -text "file:///home/lvermeer/temp/*" > /home/lvermeer/result.csv
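
If the hadoop client is not available on the machine where the files ended up, the same merge can be sketched in Python. This is only a sketch under one assumption: that each .deflate part is a plain zlib stream, which is what Hadoop's default codec normally writes. The function name merge_deflate_parts is made up for this example.

```python
import glob
import os
import zlib


def merge_deflate_parts(part_dir, out_path):
    """Decompress every *.deflate part in part_dir and append it to out_path."""
    # Sort so that 000000_0, 000001_0, ... are written in order.
    parts = sorted(glob.glob(os.path.join(part_dir, "*.deflate")))
    with open(out_path, "wb") as out:
        for part in parts:
            # Assumption: each part is a single zlib stream.
            decomp = zlib.decompressobj()
            with open(part, "rb") as src:
                # Read in 1 MiB chunks so large parts are not loaded at once.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(decomp.decompress(chunk))
            out.write(decomp.flush())
    return len(parts)
```

With the paths from this answer, the call would be merge_deflate_parts("/home/lvermeer/temp", "/home/lvermeer/result.csv"). If the parts were written with a different codec, zlib will raise an error instead of producing output.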
    
  • 2020-11-27 10:55
    hive  --outputformat=csv2 -e "select * from yourtable" > my_file.csv
    

    or

    hive  --outputformat=csv2 -e "select * from yourtable" > [your_path]/file_name.csv
    

    For TSV output, replace csv with tsv in the commands above (so csv2 becomes tsv2).

  • 2020-11-27 10:59

    I tried various options, but this is one of the simplest solutions if you are using Python pandas:

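    # Shell: keep only the '|'-delimited rows of the output
    hive -e 'select books from table' | grep "|" > temp.csv
    
    # Python: load the '|'-separated file
    import pandas as pd
    df = pd.read_csv("temp.csv", sep='|')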
    

    You can also use tr "|" "," to convert "|" to ","
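
A plain tr swap keeps the padding around each value and misreads values that already contain a comma. As a rough alternative, the rows can be re-encoded with Python's csv module. This sketch assumes every data row starts and ends with "|" and that no value itself contains "|"; table_to_csv is a hypothetical helper name.

```python
import csv
import io


def table_to_csv(table_text):
    """Convert '|'-bordered table rows into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in table_text.splitlines():
        line = line.strip()
        # Skip border lines such as +-----+ and anything that is not a full row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the empty strings produced by the leading and trailing '|'.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        writer.writerow(cells)
    return out.getvalue()
```

The csv writer quotes any field that contains a comma, so the result stays readable by pd.read_csv with its default separator.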

  • 2020-11-27 11:01

    If you want a CSV file, you can modify Lukas' solution as follows (assuming you are on a Linux box):

    hive -e 'select books from table' | sed 's/[[:space:]]\+/,/g' > /home/lvermeer/temp.csv
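
Note that the sed expression replaces every run of whitespace, so a value such as "War and Peace" would be split into three columns. A possible alternative, assuming the Hive CLI output is tab-separated (its default), is to split on tabs only and let Python's csv module handle quoting. The helper name tsv_to_csv is made up for this sketch.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Re-encode tab-separated text as CSV, quoting fields where needed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in tsv_text.splitlines():
        # Split on tabs only, so spaces inside a value are preserved.
        writer.writerow(line.split("\t"))
    return out.getvalue()
```

One way to use it is to save the hive -e output to a file first, read that file in Python, and write the returned string to temp.csv.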
    
  • 2020-11-27 11:01

    If you are using HUE, this is fairly simple as well. Go to the Hive editor in HUE, execute your Hive query, then either save the result file locally as XLS or CSV, or save it to HDFS.
