How can I use the BigQuery REST API from the command line?

后端 未结 1 1008
孤独总比滥情好
孤独总比滥情好 2021-01-05 16:30

Attempting to make a plain GET request to one of the BigQuery REST APIs gives an error that looks like this:

curl https://www.googleapis.com/bigquery/v2/proj         


        
相关标签:
1条回答
  • 2021-01-05 16:59

    As a disclaimer, when working from the command-line, using the bq tool will usually be sufficient, or for more complex use cases, the BigQuery client libraries enable programming with BigQuery from multiple languages. It can still be useful sometimes to make plain requests to the REST APIs to see how certain APIs work at a low level, however.

    First, make sure that you have installed the Google Cloud SDK. This should include the gcloud and bq command-line tools. If you haven't already, authorize your account by running this command from your terminal:

    gcloud auth login
    

    This should prompt you to log in and then give you an access code that you can paste into your terminal. (The exact process may change over time).

    Now let's try a query using the BigQuery REST API, calling the jobs.query method. Modify this script with your own project name, which you can find from the Google Cloud Console, then paste the script into your terminal:

    PROJECT="YOUR_PROJECT_NAME"
    QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
    REQUEST="{\"kind\":\"bigquery#queryRequest\",\"useLegacySql\":false,\"query\":$QUERY}"
    echo $REQUEST | \
      curl -X POST -d @- -H "Content-Type: application/json" \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries
    

    If it worked, you should see output that looks like this:

    {
     "kind": "bigquery#queryResponse",
     "schema": {
      "fields": [
       {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
       },
       {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
       }
      ]
     },
     "jobReference": {
      "projectId": "<your project ID>",
      "jobId": "<your job ID>"
     },
     "totalRows": "1",
     "rows": [
      {
       "f": [
        {
         "v": "1"
        },
        {
         "v": "foo"
        }
       ]
      }
     ],
     "totalBytesProcessed": "0",
     "jobComplete": true,
     "cacheHit": false
    }
    

    If you haven't set up the bq command-line tool, you can use bq init from your terminal to do so. Once you have, you can try running the same query using it:

    bq query --use_legacy_sql=False "SELECT 1 AS x, 'foo' AS y;"
    

    You can also see the REST API requests that the bq tool makes by passing the --apilog= option:

    bq --apilog= query --use_legacy_sql=False "SELECT [1, 2, 3] AS x;"
    

    Now let's try an example using the jobs.insert method instead of the query API. Run this script, replacing YOUR_PROJECT_NAME with your project name:

    PROJECT="YOUR_PROJECT_NAME"
    QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
    REQUEST="{\"configuration\":{\"query\":{\"useLegacySql\":false,\"query\":${QUERY}}}}"
    echo $REQUEST | \
    curl -X POST -d @- -H "Content-Type: application/json" \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs
    

    Unlike the query API, which returned a response immediately, you will see a result that looks similar to this:

    {
     "kind": "bigquery#job",
     "etag": "\"<etag string>\"",
     "id": "<project name>:<job ID>",
     "selfLink": "https://www.googleapis.com/bigquery/v2/projects/<project name>/jobs/<job ID>",
     "jobReference": {
      "projectId": "<project name>",
      "jobId": "<job ID>"
     },
     "configuration": {
      "query": {
       "query": "SELECT 1 AS x, 'foo' AS y;",
       "destinationTable": {
        "projectId": "<project name>",
        "datasetId": "<anonymous dataset>",
        "tableId": "<anonymous table>"
       },
       "createDisposition": "CREATE_IF_NEEDED",
       "writeDisposition": "WRITE_TRUNCATE",
       "useLegacySql": false
      }
     },
     "status": {
      "state": "RUNNING"
     },
     "statistics": {
      "creationTime": "<timestamp millis>",
      "startTime": "<timestamp millis>"
     },
     "user_email": "<your email address>"
    }
    

    Notice the status:

     "status": {
      "state": "RUNNING"
     },
    

    If you want to check on the job now, you can use the jobs.get method. Similar to before, run this from your terminal, using the job ID from the output in the previous step:

    PROJECT="YOUR_PROJECT_NAME"
    JOB_ID="YOUR_JOB_ID"
    curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID
    

    If the query is done, you'll get a response that indicates as much:

    ...
    "status": {
     "state": "DONE"
    },
    ...
    

    Finally, we can make a request to fetch the query results, also using the REST API.

    curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries/$JOB_ID
    

    The output will look similar to when we used the jobs.query method above:

    {
     "kind": "bigquery#getQueryResultsResponse",
     "etag": "\"<etag string>\"",
     "schema": {
      "fields": [
       {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
       },
       {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
       }
      ]
     },
     "jobReference": {
      "projectId": "<project ID>",
      "jobId": "<job ID>"
     },
     "totalRows": "1",
     "rows": [
      {
       "f": [
        {
         "v": "1"
        },
        {
         "v": "foo"
        }
       ]
      }
     ],
     "totalBytesProcessed": "0",
     "jobComplete": true,
     "cacheHit": true
    }
    
    0 讨论(0)
提交回复
热议问题