How to load a text file into a Hive table stored as sequence files

前端 未结 3 1597
忘掉有多难
忘掉有多难 2021-01-30 11:11

I have a hive table stored as a sequencefile.

I need to load a text file into this table. How do I load the data into this table?

相关标签:
3条回答
  • 2021-01-30 11:50

    You can load the text file into a textfile Hive table and then insert the data from this table into your sequencefile.

    Start with a tab delimited file:

    % cat /tmp/input.txt
    a       b
    a2      b2
    

    create a sequence file

    hive> create table test_sq(k string, v string) stored as sequencefile;
    

    try to load; as expected, this will fail:

    hive> load data local inpath '/tmp/input.txt' into table test_sq;
    

    But with this table:

    hive> create table test_t(k string, v string) row format delimited fields terminated by '\t' stored as textfile;
    

    The load works just fine:

    hive> load data local inpath '/tmp/input.txt' into table test_t;
    OK
    hive> select * from test_t;
    OK
    a       b
    a2      b2
    

    Now load into the sequence table from the text table:

    insert into table test_sq select * from test_t;
    

    Can also do load/insert with overwrite to replace all.

    0 讨论(0)
  • 2021-01-30 11:50

    You cannot directly create a table stored as a sequence file and insert text into it. You must do this:

    1. Create a table stored as text
    2. Insert the text file into the text table
    3. Do a CTAS to create the table stored as a sequence file.
    4. Drop the text table if desired

    Example:

    CREATE TABLE test_txt(field1 int, field2 string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
    
    LOAD DATA INPATH '/path/to/file.tsv' INTO TABLE test_txt;
    
    CREATE TABLE test STORED AS SEQUENCEFILE
    AS SELECT * FROM test_txt;
    
    DROP TABLE test_txt;
    
    0 讨论(0)
  • 2021-01-30 11:52

    The simple way is to create table as textfile and move the file to the appropriate location

    CREATE EXTERNAL TABLE mytable(col1 string, col2 string)
    row format delimited fields terminated by '|' stored as textfile;

    Copy the file to the HDFS Location where table is created.
    Hope this helps!!!

    0 讨论(0)
提交回复
热议问题