Triggering a Dataflow job when new files are added to Cloud Storage


Question


I'd like to trigger a Dataflow job when new files are added to a Storage bucket in order to process and add new data into a BigQuery table. I see that Cloud Functions can be triggered by changes in the bucket, but I haven't found a way to start a Dataflow job using the gcloud node.js library.

Is there a way to do this using Cloud Functions or is there an alternative way of achieving the desired result (inserting new data to BigQuery when files are added to a Storage bucket)?


Answer 1:


This is supported in Apache Beam starting with version 2.2. See "Watching for new files matching a filepattern" in the Apache Beam documentation.




Answer 2:


This post may help with triggering Dataflow pipelines from App Engine or Cloud Functions:

https://cloud.google.com/blog/big-data/2016/04/scheduling-dataflow-pipelines-using-app-engine-cron-service-or-cloud-functions
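The pattern the post describes boils down to: stage a Dataflow template, then have a Cloud Function (fired by the Storage object-change event) call the Dataflow REST API's `projects.templates.launch` method. A minimal sketch of building that request; the project, template path, and parameter names (`inputFile`, `outputTable`) are illustrative assumptions that must match whatever ValueProviders the template pipeline declares:

```python
def build_template_launch_request(project, template_gcs_path, job_name,
                                  input_file, output_table):
    """Build the URL and JSON body for a Dataflow `projects.templates.launch` call.

    A Cloud Function triggered by a Storage event would take `input_file`
    from the event payload and POST this body with an OAuth access token.
    All argument values here are placeholders.
    """
    url = (
        "https://dataflow.googleapis.com/v1b3/projects/"
        f"{project}/templates:launch?gcsPath={template_gcs_path}"
    )
    body = {
        "jobName": job_name,
        "parameters": {
            # Runtime parameters passed to the template; the names depend
            # on how the template pipeline declared its ValueProviders.
            "inputFile": input_file,
            "outputTable": output_table,
        },
    }
    return url, body
```

The function stays thin: it only maps the Storage event to a template launch, and all the processing logic lives in the pre-staged Dataflow template.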



Source: https://stackoverflow.com/questions/36365058/triggering-a-dataflow-job-when-new-files-are-added-to-cloud-storage
