How to run 2 EMR Spark steps concurrently?

太阳男子 2021-01-05 08:56

I am trying to run 2 steps concurrently in EMR. However, the first step always runs while the second stays pending.

Part of my YARN configuration is as follows:

5 answers
  •  心在旅途
    2021-01-05 09:43

    • Is it possible to run the steps concurrently, or only serially?

      • AWS support confirmed that multiple steps cannot run in parallel (concurrently). Steps run serially, so what you are seeing (i.e. the second job in a pending state) is expected.
    • Are there any tips, or anything specific to do, to run the jobs concurrently?

      • You can simply put both spark-submit commands in a bash script and run that script. You might lose some direct debugging info on the AWS web console (which, in my opinion, is already slow), but you can still see that debugging info on the Spark history server.

    On your local Mac, you can run multiple YARN applications in parallel because you are submitting the applications to YARN directly. In EMR, the YARN/Spark applications are submitted through AWS's internal `command-runner.jar`, which does a bunch of extra logging, bootstrapping, etc. so that the `emr step` info can be shown on the web console.
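
The bash-script suggestion above could look roughly like the sketch below. It is only an illustration, not something from AWS: the helper name `run_concurrently` and the S3 job paths in the commented usage are hypothetical. Each command is started in the background, then the script waits for all of them and reports failure if any one failed.

```shell
#!/usr/bin/env bash
# Minimal sketch: run several commands at the same time and wait for all of them.
# `run_concurrently` is a hypothetical helper name, not part of EMR or Spark.

run_concurrently() {
  local pids=() failed=0 cmd pid

  # Start every command in the background and remember its PID.
  for cmd in "$@"; do
    bash -c "$cmd" &
    pids+=("$!")
  done

  # Wait for each one; remember whether any of them exited non-zero.
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=1
  done

  return "$failed"
}

# Hypothetical usage (the bucket and script names are placeholders):
# run_concurrently \
#   "spark-submit --master yarn --deploy-mode cluster s3://my-bucket/job_a.py" \
#   "spark-submit --master yarn --deploy-mode cluster s3://my-bucket/job_b.py"
```

Run this way, the EMR console sees a single step, so the per-job status has to be read from the Spark history server, as noted above. Whether both applications actually execute side by side still depends on YARN having enough free resources for both.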
