Hive: parsing JSON

遥遥无期 2021-02-04 04:11

I am trying to get some values out of nested JSON for millions of rows (5 TB+ table). What is the most efficient way to do this?

Here is an example:

{\"c         


        
4 answers
  •  遇见更好的自我
    2021-02-04 04:45

    You can use get_json_object:

     select get_json_object(fieldname, '$.country'), 
            get_json_object(fieldname, '$.data.ad.s') from ... 
    

    You will get better performance with json_tuple when extracting several keys, since it parses the JSON string only once, but I found a way to get values out of JSON nested inside JSON. To format your table, you can use something like this:

     from table t
     lateral view explode(split(regexp_replace(get_json_object(ln, '$.data.ad.s'), '\\[|\\]', ''), ',')) tb1 as s

    The code above turns the "array" into a column, one row per element.
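
    For the json_tuple route mentioned above, here is a minimal sketch of how nested values might be reached by chaining lateral views. It assumes the JSON follows the same paths used earlier ($.country and $.data.ad.s); the table name events and the aliases j1, j2, j3, data_json and ad_json are hypothetical, not from the original question.

    ```sql
    -- Hypothetical names: table `events`, JSON string column `fieldname`.
    -- json_tuple only reads top-level keys, so each nesting level gets its own lateral view.
    SELECT j1.country,
           j3.s
    FROM events e
    LATERAL VIEW json_tuple(e.fieldname, 'country', 'data') j1 AS country, data_json
    LATERAL VIEW json_tuple(j1.data_json, 'ad') j2 AS ad_json
    LATERAL VIEW json_tuple(j2.ad_json, 's') j3 AS s;
    ```

    Each json_tuple call returns the nested object as a JSON string, which the next call then parses. Whether this beats get_json_object on a given table is worth measuring on a sample first.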

    For more: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

    I hope this helps.
