How to handle fields enclosed within quotes (CSV) when importing data from S3 into DynamoDB using EMR/Hive

梦毁少年i 2020-12-28 17:14

I am trying to use EMR/Hive to import data from S3 into DynamoDB. My CSV file has fields that are enclosed in double quotes and separated by commas. While creating exter

7 answers
  •  小蘑菇 (original poster)
    2020-12-28 17:59

    Add the csv-serde-0.9.1.jar file to your Hive session and use its SerDe in the table definition; see http://illyayalovyy.github.io/csv-serde/

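    add jar /path/to/jar_file;
    
    -- separatorChar is "," because the fields in the question are comma-separated;
    -- quoteChar is the double quote that encloses each field.
    -- skip.header.line.count skips the first line; drop it if the files have no header row.
    create external table emrS3_import_1(col1 string, col2 string, col3 string, col4 string)
    row format serde 'com.bizo.hive.serde.csv.CSVSerde'
    with serdeproperties
    (
      "separatorChar" = ",",
      "quoteChar" = "\""
    )
    stored as textfile
    location 's3://emrTest/folder'
    tblproperties("skip.header.line.count"="1");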
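
    Once the S3-backed table parses the quoted fields correctly, the remaining step in the question is copying the rows into DynamoDB. A minimal sketch of that step follows, assuming the EMR DynamoDB storage handler is available on the cluster. The names ddb_import_1 and MyDynamoTable are hypothetical placeholders, and the sketch assumes the DynamoDB table already exists with col1 as its key attribute.

    ```sql
    -- Hypothetical names: ddb_import_1 (Hive table) and MyDynamoTable (DynamoDB table).
    -- Assumes MyDynamoTable already exists and that col1 is its hash key.
    create external table ddb_import_1(col1 string, col2 string, col3 string, col4 string)
    stored by 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    tblproperties
    (
      "dynamodb.table.name" = "MyDynamoTable",
      "dynamodb.column.mapping" = "col1:col1,col2:col2,col3:col3,col4:col4"
    );

    -- Copy every row parsed by the CSV SerDe table into DynamoDB.
    insert overwrite table ddb_import_1
    select col1, col2, col3, col4 from emrS3_import_1;
    ```

    The column mapping pairs each Hive column with a DynamoDB attribute name; here they are kept identical for simplicity. Because the CSV SerDe reads every column as a string, any numeric attributes would need an explicit cast in the select.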
    
