Is there a way to continuously pipe data from Azure Blob into BigQuery?

Submitted by Deadly on 2019-12-23 04:59:42

Question


I have a bunch of files in Azure Blob storage, and new ones arrive constantly. Is there a way to first move all the data I already have in Blob storage over to BigQuery, and then keep a script or job running so that all new data gets sent to BigQuery as well?


Answer 1:


BigQuery supports querying data directly from these external data sources: Google Cloud Bigtable, Google Cloud Storage, and Google Drive. Azure Blob storage is not one of them. As Adam Lydick mentioned, as a workaround you could copy the files from Azure Blob storage to Google Cloud Storage (or another external data source that BigQuery supports).

To copy data from Azure Blob storage to Google Cloud Storage, you can run WebJobs (or Azure Functions). A blob-triggered WebJob can invoke a function whenever a blob is created or updated; inside that function you can read the blob content and upload it to Google Cloud Storage.

Note: you can install the Google.Cloud.Storage library to perform common operations from client code. And this blog post explains how to use the Google.Cloud.Storage SDK in Azure Functions.
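The copy step described above could look roughly like the following. This is only a sketch in Python rather than the .NET library named in the answer; it assumes the google-cloud-storage package, Google credentials available to the function (for example via GOOGLE_APPLICATION_CREDENTIALS), and a blob-trigger binding named myblob. The bucket name and the function names are placeholders, not anything from the original answer.

```python
def copy_blob_to_gcs(blob_name, data, gcs_bucket):
    """Upload the bytes of one Azure blob to GCS under the same name.

    `gcs_bucket` is expected to behave like google.cloud.storage.Bucket.
    Returns the gs:// URI of the uploaded object.
    """
    gcs_blob = gcs_bucket.blob(blob_name)
    gcs_blob.upload_from_string(data)
    return f"gs://{gcs_bucket.name}/{blob_name}"


def main(myblob):
    """Hypothetical Azure Functions entry point for a blob trigger.

    `myblob` is the input stream bound by the trigger; it exposes
    `.name` and `.read()`.
    """
    # Imported lazily so the helper above has no hard dependency.
    from google.cloud import storage  # pip install google-cloud-storage

    bucket = storage.Client().bucket("my-gcs-bucket")  # assumed bucket name
    # The name reported by the trigger is reused unchanged as the GCS object name.
    return copy_blob_to_gcs(myblob.name, myblob.read(), bucket)
```

Keeping the upload in a small helper that takes the bucket as an argument makes it easy to exercise without touching either cloud.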




Answer 2:


I'm not aware of anything out-of-the-box (on Google's infrastructure) that can accomplish this.

I'd probably set up a tiny VM to:

  • Scan your Azure blob storage looking for new content.
  • Copy new content into GCS (or local disk).
  • Kick off a LOAD job periodically to add the new data to BigQuery.
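The three steps above can be sketched as a small polling loop. This is a rough illustration, not a hardened implementation: it assumes the azure-storage-blob, google-cloud-storage, and google-cloud-bigquery packages, CSV input, and placeholder names for the connection string, container, bucket, and table. The set of already-copied names is kept in memory only, so a restart would copy everything again.

```python
import time


def sync_once(container, gcs_bucket, seen):
    """Copy blobs not yet in `seen` from Azure to GCS; return the new gs:// URIs.

    `container` should behave like azure.storage.blob.ContainerClient and
    `gcs_bucket` like google.cloud.storage.Bucket.
    """
    new_uris = []
    for props in container.list_blobs():
        if props.name in seen:
            continue
        data = container.download_blob(props.name).readall()
        gcs_bucket.blob(props.name).upload_from_string(data)
        seen.add(props.name)
        new_uris.append(f"gs://{gcs_bucket.name}/{props.name}")
    return new_uris


def load_into_bigquery(bq_client, uris, table_id, job_config=None):
    """Run one load job for a batch of URIs and wait for it to finish."""
    if not uris:
        return None
    job = bq_client.load_table_from_uri(uris, table_id, job_config=job_config)
    job.result()  # blocks until the job completes; raises on failure
    return job


def run_forever(poll_seconds=300):
    """Hypothetical entry point for the small VM; all names are placeholders."""
    from azure.storage.blob import ContainerClient
    from google.cloud import bigquery, storage

    container = ContainerClient.from_connection_string(
        "<azure-connection-string>", container_name="my-container"
    )
    bucket = storage.Client().bucket("my-gcs-bucket")
    bq = bigquery.Client()
    config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,  # assumes CSV files
        autodetect=True,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    seen = set()
    while True:
        uris = sync_once(container, bucket, seen)
        load_into_bigquery(bq, uris, "my-project.my_dataset.my_table", config)
        time.sleep(poll_seconds)
```

The first pass through the loop also handles the initial backfill, since every existing blob counts as new. For anything long-running, the `seen` set would need to be persisted somewhere durable.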

If you used GCS instead of Azure Blob Storage, you could eliminate the VM and just have a Cloud Function that is triggered on new items being added to your GCS bucket (assuming your blob is in a form that BigQuery knows how to read). I presume this is part of an existing solution that you'd prefer not to modify though.
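For the GCS-triggered variant, a Cloud Function might look something like this. It is a sketch that assumes a Python background function bound to the bucket's object-finalize event, the google-cloud-bigquery package, and CSV files; the table ID and function names are placeholders.

```python
TABLE_ID = "my-project.my_dataset.my_table"  # assumed destination table


def build_gcs_uri(event):
    """Build the gs:// URI for the object described by a GCS event payload."""
    return f"gs://{event['bucket']}/{event['name']}"


def gcs_to_bigquery(event, context):
    """Hypothetical background function run when an object is finalized in GCS."""
    # Imported lazily so build_gcs_uri stays dependency-free.
    from google.cloud import bigquery

    client = bigquery.Client()
    config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,  # assumes CSV files
        autodetect=True,
    )
    job = client.load_table_from_uri(
        build_gcs_uri(event), TABLE_ID, job_config=config
    )
    job.result()  # wait so that failures surface in the function logs
```

Each new object triggers its own load job here, so for very frequent small files batching (as in the VM approach) may be the better fit.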



Source: https://stackoverflow.com/questions/44806345/is-there-a-way-to-continuously-pipe-data-from-azure-blob-into-bigquery
