How do I create an Athena table from a nested JSON file? This is my sample JSON file. I only need selected key-value pairs, such as roofCondition and garageStalls.
First of all, you posted an incorrect version of the JSON document. The correct version should look like this:
{"reportId":"7bc7fa76-bf53-4c21-85d6-118f6a8f4244", "reportOrderedTS":"1529996028730", "createdTS":"1530304910154", "report":{"summaryElements": [{"value": "GOOD", "key": "roofCondition"},{"value": "98", "key": "storiesConfidence"},{"value": "0", "key": "garageStalls"}], "elements": [{"source": "xyz", "imageId": "0xxx_png", "modelVersion": "1.21.0", "key": "pool"},{"source": "xyz", "imageId": "0111_png", "value": "GOOD", "modelVersion": "1.36.0", "key": "roofCondition", "confidence": "49"}] }, "status":"Success", "reportReceivedTS":"1529996033830" }
Yes, you can query nested JSON in Athena. One way to do this is to create the following table:
CREATE EXTERNAL TABLE example(
`reportId` string,
`reportOrderedTS` bigint,
`createdTS` bigint,
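`report` struct<
`summaryElements`: array<struct<`value`: string, `key`: string>>,
`elements`: array<struct<`source`: string, `imageId`: string, `value`: string, `modelVersion`: string, `key`: string, `confidence`: string>>>,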
`status` string,
`reportReceivedTS` bigint
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://example'
This is an example query:
select reportid, reportorderedts, createdts,
summaryelements.value, summaryelements.key as summary_key, elements.source, elements.key as element_key
from example
cross join UNNEST(report.summaryelements) as t1(summaryelements)
cross join UNNEST(report.elements) as t2(elements)
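Since the question asks only for selected keys such as roofCondition and garageStalls, here is a minimal sketch of one way to pull them out as columns, one row per report. It assumes the table is named example as above, that each element of summaryElements has string fields key and value, and that no key is repeated within a single report (the map constructor raises an error on duplicate keys). The CTE name summary and the column name kv are made up for illustration.

```sql
-- Build a key -> value map from report.summaryElements, then read
-- only the keys of interest from it.
WITH summary AS (
    SELECT reportid,
           map(
               transform(report.summaryelements, e -> e.key),   -- map keys
               transform(report.summaryelements, e -> e.value)  -- map values
           ) AS kv
    FROM example
)
SELECT reportid,
       element_at(kv, 'roofCondition') AS roofcondition,  -- NULL if the key is absent
       element_at(kv, 'garageStalls')  AS garagestalls
FROM summary
```

The key names inside the data keep their original case ('roofCondition', not 'roofcondition'), because they are values in the JSON rather than column names. Unlike the UNNEST query, this does not multiply rows, so each report stays on a single row.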
Useful links:
https://docs.aws.amazon.com/athena/latest/ug/flattening-arrays.html
https://docs.aws.amazon.com/athena/latest/ug/rows-and-structs.html