Turning JSON blob into BQ friendly format with JQ

Submitted by 我只是一个虾纸丫 on 2021-02-16 21:24:15

Question


To be upfront, I have next to no experience with JSON, jq, or much of anything on the Java side. I've spent a lot of time trying to use the jq command-line tool to format a test blob of data so that I can easily feed it into Google BigQuery.

{
    "total_items": 848,
    "page_count": 34,
    "items": [
        {
            "landing_id": "708d9e3eb106820f98162d879198774b",
            "token": "708d9e3eb106820f98162d879198774b",
            "response_id": "708d9e3eb106820f98162d879198774b",
            "landed_at": "2019-02-12T01:58:02Z",
            "submitted_at": "2019-02-12T01:58:31Z",
            "metadata": {
                "user_agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3560.98 Safari/537.36",
                "platform": "other",
                "referer": "https://test.typeform.com/to/LTYE9W?prefilled_answer=8&email=test.x20a@gmail.com",
                "network_id": "35b9eae170",
                "browser": "default"
            },
            "answers": [
                {
                    "field": {
                        "id": "fX64BkjuxYy1",
                        "type": "opinion_scale",
                        "ref": "97f8e18ad06a02e6"
                    },
                    "type": "number",
                    "number": 8
                },
                {
                    "field": {
                        "id": "lYeFxbL67g8B",
                        "type": "multiple_choice",
                        "ref": "78d09e15-7d42-49ec-9f9d-004bf7d3058a"
                    },
                    "type": "choices",
                    "choices": {
                        "labels": [
                            "Experience"
                        ]
                    }
                },
                {
                    "field": {
                        "id": "D3ubKSVfNnlY",
                        "type": "multiple_choice",
                        "ref": "684cb3bd-09cb-4f27-8e7d-baef6a09f787"
                    },
                    "type": "choices",
                    "choices": {
                        "labels": [
                            "Condition"
                        ]
                    }
                },
                {
                    "field": {
                        "id": "UccviSuUQPio",
                        "type": "yes_no",
                        "ref": "ed7e0d9c-5b62-4b0f-9395-54a53d125711"
                    },
                    "type": "boolean",
                    "boolean": false
                }
            ],
            "hidden": {
                "email": "test.x20a@gmail.com"
            }
        }
    ]
}

I've been following a tutorial, but with no success, and it's getting incredibly frustrating.

Let's assume I want all fields, but I want to get rid of the top part that includes total_items and page_count. So essentially, I want to keep everything beginning with landing_id. I apologize for not going further into my previous attempts to give you all a baseline, but I just haven't gotten anywhere.


Answer 1:


For the given example:

 jq -c '.items[]' lala.json > lala.jq.json
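
For readers without jq at hand, the same transformation can be sketched in Python. This is only an illustration of what the jq filter above does, not part of the original answer: the function name items_to_ndjson is hypothetical, and the sample blob is a trimmed version of the one in the question.

```python
import json


def items_to_ndjson(blob: str) -> str:
    """Mimic `jq -c '.items[]'`: emit one compact JSON object per line."""
    data = json.loads(blob)
    # Keeping only "items" drops the top-level wrapper (total_items, page_count).
    return "".join(
        json.dumps(item, separators=(",", ":"), ensure_ascii=False) + "\n"
        for item in data["items"]
    )


if __name__ == "__main__":
    # Trimmed sample based on the blob in the question.
    sample = (
        '{"total_items": 848, "page_count": 34, "items": ['
        '{"landing_id": "708d9e3eb106820f98162d879198774b",'
        ' "hidden": {"email": "test.x20a@gmail.com"}}]}'
    )
    print(items_to_ndjson(sample), end="")
```

Nested objects such as metadata and answers are kept as they are; only the outer wrapper is removed and each item is written on its own line.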

Then you can load into BigQuery:

 bq load --source_format NEWLINE_DELIMITED_JSON --autodetect fh-bigquery:deleting.testjson lala.jq.json

After that, the table is ready to be queried.

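Note that the answer from user "peak" is missing both -c and []. The -c flag prints each result compactly on a single line, and [] iterates over the items array so that every item becomes a separate output. Together they produce one JSON object per line, which is what the NEWLINE_DELIMITED_JSON source format requires.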
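
To see why both pieces matter, here is a small Python sketch that checks whether a piece of text is newline-delimited JSON made of objects. The helper name is_ndjson_of_objects is hypothetical, blank lines are ignored as a choice of this sketch, and the "pretty array" case is only an assumption about what a filter without -c and [] (something like jq .items) would print.

```python
import json


def is_ndjson_of_objects(text: str) -> bool:
    """Return True if every non-empty line is a complete JSON object."""
    for line in text.split("\n"):
        if not line.strip():
            continue  # blank lines are ignored in this sketch
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            return False
        if not isinstance(value, dict):
            return False
    return True


if __name__ == "__main__":
    doc = {"items": [{"landing_id": "a"}, {"landing_id": "b"}]}
    # Assumed shape without -c and []: one pretty-printed array over many lines.
    pretty_array = json.dumps(doc["items"], indent=2)
    # Shape with -c and []: one compact object per line.
    per_item = "\n".join(
        json.dumps(item, separators=(",", ":")) for item in doc["items"]
    )
    print(is_ndjson_of_objects(pretty_array))
    print(is_ndjson_of_objects(per_item))
```

The pretty-printed array fails the check because its first line is a lone opening bracket, while the per-item output passes.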



Source: https://stackoverflow.com/questions/54657231/turning-json-blob-into-bq-friendly-format-with-jq
