Reading batches of data from BigQuery into Datalab

耗尽温柔 · submitted 2019-12-05 22:13:51

If your purpose is to visualize the data, would sampling be better than loading a small batch?

You can sample your data like this:

import google.datalab.bigquery as bq
df = bq.Query(sql='SELECT image_url, label FROM coast.train WHERE rand() < 0.01').execute().result().to_dataframe()

Or use a convenience class:

from google.datalab.ml import BigQueryDataSet
sampled_df = BigQueryDataSet(table='myds.mytable').sample(1000)

Have you tried just iterating over the table? The Table object is iterable and uses a paged fetcher to pull rows from the BigQuery table, so it effectively streams the data. The default page size is 1024.
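The paged-iteration pattern above can be sketched like this. This is a minimal, self-contained illustration of how a paged fetcher streams rows; the `fetch_page` function and the in-memory table are stand-ins, not the actual datalab API (with a real table you would simply write `for row in bq.Table('coast.train'): ...`):

```python
import itertools

def paged_rows(fetch_page, page_size=1024):
    """Yield rows one at a time, fetching one page at a time.

    `fetch_page(start, count)` is a hypothetical stand-in for
    BigQuery's paged fetcher; datalab's Table iterator works
    along these lines.
    """
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            return
        yield from page
        start += len(page)

# Fake table of 3000 rows standing in for a BigQuery table.
rows = [{'image_url': 'img%d.png' % i, 'label': i % 2} for i in range(3000)]
fetch = lambda start, count: rows[start:start + count]

# Only the pages needed for the first five rows are fetched.
first_five = list(itertools.islice(paged_rows(fetch), 5))
```

Because the iterator is lazy, consuming only a few rows never loads the whole table, which is what makes iterating practical for large BigQuery tables.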
