Python: How to update a value in Google BigQuery in less than 40 seconds?

Submitted by 穿精又带淫゛_ on 2021-02-19 03:43:07

Question


I have a table in Google BigQuery that I access and modify in Python using the pandas functions read_gbq and to_gbq. The problem is that appending 100,000 rows takes about 150 seconds, while appending a single row takes about 40 seconds. I would like to update a value in the table rather than append a row. Is there a way to do this from Python that is very fast, or at least faster than 40 seconds?


Answer 1:


I'm not sure whether pandas can do this, but the google-cloud-bigquery client library certainly can.

Install it (pip install --upgrade google-cloud-bigquery) and run something like:

import os
import uuid

from google.cloud import bigquery

# Point the client at a service-account key file (placeholder path).
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_credentials.json'

bq_client = bigquery.Client()

# A unique job ID makes the job easy to find later.
job_id = str(uuid.uuid4())
query = """UPDATE `dataset.table` SET field_1 = '3' WHERE field_2 = '1'"""

# client.query() uses standard SQL by default, which DML statements require.
job = bq_client.query(query, job_id=job_id)
job.result()  # block until the UPDATE has finished

For me, this operation takes about 2 seconds on average.
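If the values to write come from Python variables, it may be safer to pass them as query parameters than to paste them into the SQL string. The following is a minimal sketch under that assumption, not part of the original answer: the helper names build_update_sql and run_update are hypothetical, both columns are assumed to be STRING, and the table and column names are taken on trust because identifiers cannot be parameterized.

```python
def build_update_sql(table, set_field, where_field):
    """Return an UPDATE statement with named placeholders instead of inlined values."""
    # Table and column names are interpolated directly, so they must come
    # from trusted code, never from user input.
    return (
        f"UPDATE `{table}` "
        f"SET {set_field} = @new_value "
        f"WHERE {where_field} = @match_value"
    )


def run_update(client, table, set_field, where_field, new_value, match_value):
    """Run the UPDATE and return the number of rows it modified."""
    # Imported here so the SQL builder above works without the library installed.
    from google.cloud import bigquery

    # Assumption: both columns are STRING; adjust the types to your schema.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("new_value", "STRING", new_value),
            bigquery.ScalarQueryParameter("match_value", "STRING", match_value),
        ]
    )
    sql = build_update_sql(table, set_field, where_field)
    job = client.query(sql, job_config=job_config)
    job.result()  # wait for the DML statement to finish
    return job.num_dml_affected_rows


# Example call (needs valid credentials and an existing table):
# rows = run_update(bigquery.Client(), "dataset.table", "field_1", "field_2", "3", "1")
```

Checking the returned row count is a cheap way to confirm that the WHERE clause matched what you expected.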

As a side note, keep BigQuery's quotas on DML operations in mind: make sure you know when DML statements are appropriate and whether they fit your use case.
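One way to issue fewer DML statements, if several values need changing at once, is to fold them into a single UPDATE with a CASE expression. This is only a sketch of that idea and not something the original answer proposes: build_batched_update is a hypothetical helper, and whether batching helps depends on the limits that apply to your project.

```python
def build_batched_update(table, set_field, where_field, changes):
    """Build one UPDATE covering every match_value -> new_value pair in `changes`.

    Returns the SQL text and a dict mapping parameter names to values.
    """
    if not changes:
        raise ValueError("changes must not be empty")

    params = {}
    whens = []
    keys = []
    for i, (match_value, new_value) in enumerate(changes.items()):
        params[f"k{i}"] = match_value
        params[f"v{i}"] = new_value
        whens.append(f"WHEN @k{i} THEN @v{i}")
        keys.append(f"@k{i}")

    # The WHERE clause restricts the statement to the listed keys, so rows
    # that match no WHEN branch are left untouched.
    sql = (
        f"UPDATE `{table}` "
        f"SET {set_field} = CASE {where_field} {' '.join(whens)} END "
        f"WHERE {where_field} IN ({', '.join(keys)})"
    )
    return sql, params


sql, params = build_batched_update(
    "dataset.table", "field_1", "field_2", {"1": "3", "2": "4"}
)
print(sql)
```

Each entry in the returned params dict would then be wrapped in a bigquery.ScalarQueryParameter and passed through a QueryJobConfig when the statement is submitted.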



Source: https://stackoverflow.com/questions/45003276/python-how-to-update-a-value-in-google-bigquery-in-less-than-40-seconds
