What is the difference between Google Cloud Dataflow and Google Cloud Dataproc?

孤街浪徒 2021-01-31 01:43

I am using Google Cloud Dataflow to implement an ETL data warehouse solution.

Looking at Google Cloud's offerings, it seems Cloud Dataproc can also do the same thing.

It

5 answers
  •  闹比i (original poster)
    2021-01-31 02:29

    Cloud Dataproc and Cloud Dataflow can both be used for data processing, and there’s overlap in their batch and streaming capabilities. You can decide which product is a better fit for your environment.

    Cloud Dataproc is good for environments dependent on specific Apache big data components:

    • Tools/packages
    • Pipelines
    • Skill sets of existing resources

    Cloud Dataflow is typically the preferred option for greenfield environments:

    • Less operational overhead
    • Unified approach to developing batch and streaming pipelines
    • Uses Apache Beam
    • Supports pipeline portability across Cloud Dataflow, Apache Spark, and Apache Flink as runtimes

    See more details here: https://cloud.google.com/dataproc/

    Pricing comparison:

    • Dataproc

    • Dataflow

    If you want to calculate and compare the cost of other GCP resources, please refer to the pricing calculator: https://cloud.google.com/products/calculator/
