Loading JSON file with serde in Cloudera

长发绾君心 2021-01-25 21:03

I am trying to work with a JSON file that has this bag structure:

{
   "user_id": "kim95",
   "type": "Book",
   "title": "Modern Database Systems: The O

2 Answers
  •  野的像风
    2021-01-25 21:54

    Hive does not have built-in support for JSON, so to use JSON with Hive we need a third-party JAR such as Hive-JSON-Serde: https://github.com/rcongiu/Hive-JSON-Serde

    You have a couple of issues with the CREATE TABLE statement. It should look like this:

    CREATE EXTERNAL TABLE IF NOT EXISTS serd (
    user_id string, type string, title string, year string, publisher string,
    authors array<struct<name:string>>, source string)
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    LOCATION...
    

    The JSON records you are using must each be kept on a single line, like this:

    {"user_id": "kim95", "type": "Book", "title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.", "year": "1995", "publisher": "ACM Press and Addison-Wesley", "authors": [{"name":"null"}], "source": "DBLP"} 
    {"user_id": "marshallo79", "type": "Book", "title": "Inequalities: Theory of Majorization and Its Application.", "year": "1979", "publisher": "Academic Press","authors": [{"name":"Albert W. Marshall"},{"name":"Ingram Olkin"}], "source": "DBLP"}
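
If your file is pretty-printed, with each record spread over several lines, something like the following could flatten it before you load it into the table location. This is only a minimal sketch: the helper name `to_json_lines` is made up for illustration, and it assumes the file is either a sequence of JSON objects one after another or a single top-level JSON array. The sample data below is shortened, not your real file.

```python
import json


def to_json_lines(text):
    """Return the JSON records found in `text`, one compact record per line.

    Hypothetical helper: accepts concatenated (possibly pretty-printed)
    JSON objects, or a single top-level JSON array of objects.
    """
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        # A top-level array is unpacked so each element gets its own line.
        records = obj if isinstance(obj, list) else [obj]
        for record in records:
            lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Shortened, made-up sample in the multi-line layout.
sample = """{
   "user_id": "kim95",
   "type": "Book"
}
{
   "user_id": "marshallo79",
   "type": "Book"
}"""

print(to_json_lines(sample))
```

Write the result to a new file and point the table's LOCATION at the directory that contains it. Reading the whole file into memory is fine for small inputs; for very large files a streaming approach would be a better fit.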
    

    After downloading the project from GitHub, you need to compile it, which produces a JAR. Add this JAR to the Hive session before running the CREATE TABLE statement.

    Hope it helps...!!!
