By importing a JSON file with repeated records in BigQuery, you can create a table with nested repeated fields.
With the introduction of BigQuery Standard SQL, there is an easy way to deal with records.
Try the query below. Don't forget to uncheck the Use Legacy SQL
checkbox under Show Options.
WITH YourTable AS (
SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u1' AS userid UNION ALL
SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u2' AS userid UNION ALL
SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u3' AS userid UNION ALL
SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u4' AS userid UNION ALL
SELECT 'a2' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u1' AS userid UNION ALL
SELECT 'a2' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u2' AS userid
)
SELECT item, ARRAY_AGG(STRUCT(click_time, userid)) AS clicks
FROM YourTable
GROUP BY item
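To check that the nested result holds what you expect, one option is to flatten it back with UNNEST. The following is only a sketch in Standard SQL: the Nested CTE name and the c alias are made up for illustration, and the sample rows are a shortened copy of the ones above.

```sql
-- Sketch: build the nested rows, then expand them again with UNNEST.
-- "Nested" and "c" are illustrative names, not part of the original query.
WITH YourTable AS (
  SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u1' AS userid UNION ALL
  SELECT 'a1' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u2' AS userid UNION ALL
  SELECT 'a2' AS item, '2016-03-03 19:52:23 UTC' AS click_time, 'u1' AS userid
),
Nested AS (
  -- One row per item, with a repeated record (ARRAY of STRUCT) of clicks
  SELECT item, ARRAY_AGG(STRUCT(click_time, userid)) AS clicks
  FROM YourTable
  GROUP BY item
)
-- The comma join with UNNEST yields one output row per array element
SELECT item, c.click_time, c.userid
FROM Nested, UNNEST(clicks) AS c
ORDER BY item, userid
```

If the nesting worked, this should return one row per original click, which makes it a quick round-trip check before writing the nested result to a table.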
Assume you have flattened data in your table:
item  click_time               userid
a1    2016-03-03 19:52:23 UTC  u1
a1    2016-03-03 19:52:23 UTC  u2
a1    2016-03-03 19:52:23 UTC  u3
a1    2016-03-03 19:52:23 UTC  u4
a2    2016-03-03 19:52:23 UTC  u1
a2    2016-03-03 19:52:23 UTC  u2
The BigQuery (Legacy SQL) query below does what you ask for.
Please note: you need to write the result to a table with the 'Allow Large Results' option checked and the 'Flatten Results' option unchecked.
SELECT *
FROM JS(
( // input table
SELECT item, NEST(CONCAT(STRING(click_time), ',', STRING(userid))) AS clicks
FROM YourTable
GROUP BY item
),
item, clicks, // input columns
"[ // output schema
{'name': 'item', 'type': 'STRING'},
{'name': 'clicks', 'type': 'RECORD',
'mode': 'REPEATED',
'fields': [
{'name': 'click_time', 'type': 'STRING'},
{'name': 'userid', 'type': 'STRING'}
]
}
]",
"function(row, emit) { // function
var c = [];
for (var i = 0; i < row.clicks.length; i++) {
var x = row.clicks[i].split(','); // [click_time, userid]
var t = {click_time: x[0],
userid: x[1]};
c.push(t);
}
emit({item: row.item, clicks: c});
}"
)
result is expected as below