Workflow scheduling on GCP Dataproc cluster
Question:

I have some complex Oozie workflows to migrate from on-prem Hadoop to GCP Dataproc. The workflows consist of shell scripts, Python scripts, Spark-Scala jobs, Sqoop jobs, etc.

I have come across some potential solutions that could meet my workflow scheduling needs:

1. Cloud Composer
2. Dataproc Workflow Templates with Cloud Scheduler
3. Installing Oozie on a Dataproc auto-scaling cluster

Please let me know which option would be the most efficient in terms of performance, cost, and migration complexity.

Answer 1:

All 3