What are the pros and cons of loading data directly into Google BigQuery vs going through Cloud Storage first?

Posted by 匆匆过客 on 2020-01-13 19:13:06

Question


Also, is there anything wrong with doing transforms/joins directly within BigQuery? I'd like to minimize the number of components and steps involved for a data warehouse I'm setting up (simple transaction and inventory data for a chain of retail stores.)


Answer 1:


Loading data via Cloud Storage is the fastest (and the cheapest) way. Loading directly can be done from an app, using streaming inserts, which add some additional cost.
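As a rough illustration of the two ingestion paths, the sketch below wraps a batch load from Cloud Storage and a streaming insert. It assumes the `google-cloud-bigquery` Python client, with `client` being a `bigquery.Client`; the bucket, table and row values in the comments are hypothetical.

```python
def load_from_gcs(client, gcs_uri, table_id, job_config=None):
    """Batch-load a file from Cloud Storage into a table via a load job."""
    job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
    job.result()  # block until the load job finishes; raises on failure
    return job


def stream_rows(client, table_id, rows):
    """Stream JSON-serialisable rows into a table (a billed streaming insert)."""
    errors = client.insert_rows_json(table_id, rows)  # empty list means success
    if errors:
        raise RuntimeError(f"streaming insert failed: {errors}")
    return len(rows)


# Hypothetical usage (needs `pip install google-cloud-bigquery` and credentials):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   config = bigquery.LoadJobConfig(
#       source_format=bigquery.SourceFormat.CSV,
#       skip_leading_rows=1,
#       autodetect=True,
#   )
#   load_from_gcs(client, "gs://my-bucket/transactions.csv",
#                 "my-project.retail.transactions", config)
#   stream_rows(client, "my-project.retail.transactions",
#               [{"store_id": 1, "sku": "A-100", "quantity": 2}])
```

Taking the client as a parameter keeps the two paths easy to compare and to exercise with a stub before pointing them at a real project.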

As for transformations: if what you plan or need to do can be done in BigQuery, you should do it in BigQuery :) It is the best and fastest way of doing ETL. But you should take into account the cost of running queries (if you are not paying Google for slots, it could be $5 per 1 TB scanned).
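To get a feel for the query cost mentioned above, here is a minimal sketch of the arithmetic. The $5 per TB rate is the figure quoted in this answer and may be out of date; the function name and the decimal definition of a terabyte are assumptions made for illustration.

```python
PRICE_PER_TB_USD = 5.0   # rate quoted in the answer; check current pricing
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabyte, for simplicity


def estimate_query_cost(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Return the approximate on-demand cost in USD for a query."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# A query scanning 250 GB would cost roughly a quarter of the per-TB rate.
cost = estimate_query_cost(250 * 10 ** 9)
print(f"~${cost:.2f}")
```

Since the bill depends on bytes scanned rather than rows returned, selecting only the columns you need is the simplest way to keep this number down.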

Another good option for complex ETL is Dataflow, which gives you more flexibility, but it can become expensive very quickly.




Answer 2:


Well, if you go through GCS it means you are not streaming your data. Loading from a file into BQ is free, and files can be up to 5 TB in size. Being free and supporting large files is sometimes an advantage. On the other hand, streaming is real-time, and going through GCS is not.

If you want to stream data directly into BQ tables, that has a cost. The price for streaming is $0.01 per 200 MB (as of June 2018), so around $50 for 1 TB.
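The $50 figure follows directly from the per-200 MB rate. A small sketch of that calculation, assuming a decimal terabyte (1,000,000 MB) and using the June 2018 price from this answer as the default; the function name is made up for illustration:

```python
def streaming_cost_usd(megabytes, price_per_200mb=0.01):
    """Approximate streaming-insert cost in USD for a given volume in MB."""
    return megabytes / 200 * price_per_200mb


one_tb_in_mb = 1_000_000  # assumption: decimal units
print(f"Streaming 1 TB: ~${streaming_cost_usd(one_tb_in_mb):.2f}")
print("Batch load of the same file from GCS: $0.00 (load jobs are free)")
```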

As for transformations, they can be done with SQL if you can express the task that way. Otherwise you have plenty of options; most of the time people use Dataflow to transform things. See the linked tutorial for an advanced example.
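For the retail case in the question, a join done entirely inside BigQuery might look like the sketch below. The table and column names (`transactions`, `inventory`, `daily_sales`, `sold_at`, `on_hand` and so on) are invented for illustration, and `client` is assumed to be a `bigquery.Client` from the `google-cloud-bigquery` package.

```python
def build_daily_sales_sql(dataset):
    """Build a statement that joins transactions to inventory in BigQuery."""
    return f"""
CREATE OR REPLACE TABLE `{dataset}.daily_sales` AS
SELECT
  t.store_id,
  DATE(t.sold_at) AS sale_date,
  t.sku,
  SUM(t.quantity) AS units_sold,
  ANY_VALUE(i.on_hand) AS units_on_hand
FROM `{dataset}.transactions` AS t
JOIN `{dataset}.inventory` AS i
  ON i.store_id = t.store_id AND i.sku = t.sku
GROUP BY 1, 2, 3
""".strip()


def run_transform(client, sql):
    """Run the statement as a query job and wait for it to finish."""
    job = client.query(sql)
    job.result()  # raises if the query fails
    return job


# Hypothetical usage:
#   from google.cloud import bigquery
#   run_transform(bigquery.Client(), build_daily_sales_sql("retail"))
```

Writing the result with `CREATE OR REPLACE TABLE ... AS SELECT` keeps the whole transform in one SQL statement, so no extra component is needed beyond something to schedule it.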

Also look into:
Cloud Dataprep - Data Preparation and Data Cleansing
Google Data Studio: Easily Build Custom Reports and Dashboards

Also an advanced example:

Performing ETL from a Relational Database into BigQuery



Source: https://stackoverflow.com/questions/51065570/what-are-the-pros-and-cons-of-loading-data-directly-into-google-bigquery-vs-goin
