Remove surrounding quotes from fields while loading data into Hive

一生所求 2020-12-21 08:14

I want to load input data into a Hive table. The data is in the following format:

"153662";"0002241447";"0"
"153662";"000647036X";"0"
"153


        
2 Answers
  • 2020-12-21 08:22

    You can use the CSV SerDe (OpenCSVSerde) for this. It strips the quote character from each field as the row is read.

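    -- TABLE is a reserved word in Hive, so the table needs another name (quoted_input is a placeholder).
    -- OpenCSVSerde reads every column as STRING regardless of the declared type.
    CREATE TABLE quoted_input (a STRING, b STRING, c STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES
    (
        "separatorChar" = ";",
        "quoteChar"     = "\""
    )
    STORED AS TEXTFILE;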
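
    Because OpenCSVSerde exposes every column as a string, a numeric column needs a cast at query time. A minimal sketch, assuming the table above was created under the hypothetical name quoted_input and that the third field always holds an integer:

    ```sql
    -- The SerDe has already stripped the quotes, so only the type needs converting.
    SELECT a, b, CAST(c AS BIGINT) AS c
    FROM quoted_input
    LIMIT 10;
    ```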
    
  • 2020-12-21 08:49

    There are several ways to do this:

    1. Use the CSV SerDe.
    2. Use the regex SerDe with a pattern such as "\"(.*)\"\;\"(.*)\"\;\"(.*)\"" (one capture group per column).
    3. Load the data into an external table, then strip the double quotes:

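    -- Staging table: the fields still carry their surrounding quotes.
    CREATE EXTERNAL TABLE source (a STRING, b STRING, c STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\;'
    LOCATION 'xyz';

    -- Strip the quotes and alias each column, otherwise the new table gets _c0, _c1, _c2.
    CREATE TABLE destination AS
    SELECT REGEXP_REPLACE(a, '"', '') AS a,
           REGEXP_REPLACE(b, '"', '') AS b,
           CAST(REGEXP_REPLACE(c, '"', '') AS BIGINT) AS c
    FROM source;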
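
    Option 2 gives only the pattern, so a full table definition might look like the sketch below. It assumes Hive's built-in org.apache.hadoop.hive.serde2.RegexSerDe, a hypothetical table name quoted_input_regex, and fields that never contain an embedded double quote, which is why it uses [^\"]* in place of the greedy .* shown in the list. The semicolons are escaped as \; in the same way as the rest of this answer.

    ```sql
    -- Each capture group maps to one column, in order.
    -- The pattern must match the whole line. Lines that do not match yield NULL columns.
    CREATE TABLE quoted_input_regex (a STRING, b STRING, c STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES
    (
        "input.regex" = "\"([^\"]*)\"\;\"([^\"]*)\"\;\"([^\"]*)\""
    )
    STORED AS TEXTFILE;
    ```

    RegexSerDe only deserializes, so a table like this is suitable for reading the raw file but not as the target of an INSERT.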
