I'm having trouble selecting into an ARRAY of STRUCTS in Hive.
My source table looks like this:
+-------------+--+
| field |
+-------------+--
What you seem to be looking for is a way to collect the structs into an array. Hive comes with two aggregate functions for collecting values into arrays: collect_set and collect_list. However, those functions only work to create arrays of basic (primitive) types.
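For comparison, here is a minimal sketch of the built-in aggregate on a primitive column. It assumes the new_answers table and the fieldlabel column that appear in the query later in this answer; the alias field_labels is only illustrative.

```sql
-- collect_list keeps duplicates; collect_set would drop them.
select id
     , collect_list(fieldlabel) as field_labels  -- array<string>
from new_answers
group by id;
```

This gives one array of strings per id, which is fine for a single column but loses the pairing between a label and its other fields. That pairing is what collecting structs preserves.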
The jar for the brickhouse project (https://github.com/klout/brickhouse/wiki/Downloads) provides a number of functions, including one that can collect complex types. First, add the jar to your Hive session:
add jar hdfs://path/to/your/jars/brickhouse-0.6.0.jar;
Then you can register the collect function using whatever name you like:
create temporary function collect_struct as 'brickhouse.udf.collect.CollectUDAF';
The following query:
select id
     , collect_struct(
         named_struct(
           "field_id", fieldid,
           "field_label", fieldlabel,
           "field_type", fieldtype,
           "answer_id", answer_id)) as answers
     , unitname
from new_answers
group by id, unitname
;
produces the following result:
id answers unitname
1 [{"field_id":175877,"field_label":"Comment","field_type":"COMMENT","answer_id":8990947803}] Location1
2 [{"field_id":47824,"field_label":"Language","field_type":"MULTIPLE_CHOICE","answer_id":8990950069},{"field_id":48187,"field_label":"Language Type","field_type":"MULTIPLE_CHOICE","answer_id":8990950070},{"field_id":47829,"field_label":"Trans #","field_type":"TEXT","answer_id":8990950071}] Location2
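If you want to confirm that answers is a real array of structs rather than a string, one option is to explode it back into rows. This is only a sketch: it assumes the brickhouse jar is added and collect_struct is registered as above, and the aliases t, ex, and a are illustrative.

```sql
-- Rebuild the array per id, then flatten it again with a lateral view.
select t.id
     , a.field_id    -- struct fields are read with dot notation
     , a.answer_id
from (
  select id
       , collect_struct(
           named_struct(
             "field_id", fieldid,
             "answer_id", answer_id)) as answers
  from new_answers
  group by id
) t
lateral view explode(t.answers) ex as a;
```

Each element of the array comes back as one row, so the row count per id should match the number of structs shown in the answers column.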