What is the difference between Google Cloud Dataflow and Google Cloud Dataproc?

孤街浪徒 2021-01-31 01:43

I am using Google Cloud Dataflow to implement an ETL data warehouse solution.

Looking into Google Cloud's offerings, it seems Dataproc can also do the same thing.

It

5 Answers

囚心锁ツ (original poster)

2021-01-31 02:32

Same reason as why Dataproc offers both Hadoop and Spark: sometimes one programming model is the best fit for the job, sometimes the other. Likewise, in some cases the best fit for the job is the Apache Beam programming model, offered by Dataflow.

In many cases, a big consideration is that one already has a codebase written against a particular framework and just wants to deploy it on Google Cloud. So even if, say, the Beam programming model is superior to Hadoop, someone with a lot of Hadoop code might still choose Dataproc for the time being rather than rewrite their code in Beam to run on Dataflow.

The differences between Spark and Beam programming models are quite large, and there are a lot of use cases where each one has a big advantage over the other. See https://cloud.google.com/dataflow/blog/dataflow-beam-and-spark-comparison .
