Hive select data into an array of structs

暖寄归人 2020-12-17 04:19

I am trying to figure out a way in Hive to select data from a flat source and output it as an array of named structs. Here is an example of what I am looking for...

3 answers
  • 2020-12-17 04:57

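    You can also use a workaround: build one "key:value" string per row and collect those strings. Note that this returns an array of strings per house_id, not an array of structs.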

    select collect_list(full_name) full_name_list from (
        select 
            concat_ws(',', 
                concat("first_name:",first_name), 
                concat("last_name:",last_name)
                ) full_name, 
            house_id
        from house) a 
    group by house_id
    
  • 2020-12-17 05:02

    I would use the Brickhouse jar. Its collect is a much better implementation and accepts complex data types.

    Query:

    add jar /path/to/jar/brickhouse-0.7.1.jar;
    create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
    
    select house_id
      , collect(named_struct("first_name", first_name, "last_name", last_name))
    from db.table
    group by house_id
    

    Output:

    1   [{"first_name":"bob","last_name":"jones"}, {"first_name":"jenny","last_name":"jones"}]
    2   [{"first_name":"sally","last_name":"johnson"}]
    3   [{"first_name":"john","last_name":"smith"},{"first_name":"barb","last_name":"smith"}]
    
  • 2020-12-17 05:05

    You can try it using PySpark or Scala Spark. Spark SQL allows both primitive and non-primitive data types in aggregates, i.e., you can do collect_set(named_struct(...)).
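
    A minimal Spark SQL sketch of that idea, assuming the data is registered as a table or view named house with columns house_id, first_name and last_name (names borrowed from the other answers in this thread, so adjust them to your schema):

    ```sql
    -- Assumption: a table or view `house` with house_id, first_name, last_name.
    -- collect_set drops duplicate structs; use collect_list to keep duplicates.
    SELECT house_id,
           collect_set(
               named_struct('first_name', first_name,
                            'last_name',  last_name)
           ) AS residents  -- hypothetical alias for the array<struct> column
    FROM house
    GROUP BY house_id;
    ```

    The same statement can be passed as a string to spark.sql(...) from PySpark or Scala. The order of elements inside each array is not guaranteed.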
