How to run 2 EMR Spark steps concurrently?

太阳男子 2021-01-05 08:56

I am trying to run 2 steps concurrently in EMR. However, the first step always runs while the second one stays pending.

Part of my YARN configuration is as follows:

5 answers
  •  抹茶落季
    2021-01-05 09:37

    There are 2 deploy modes for running a Spark application on YARN in AWS EMR:

    • Client
    • Cluster

    If you use client mode, only one step will be in the running state at a given time. However, there is an option that lets you run more than one step concurrently.

    Try submitting your step in the mode below: spark-submit --master yarn --deploy-mode cluster --executor-memory 1G --num-executors 2 --driver-memory 1g --executor-cores 2 --conf spark.yarn.submit.waitAppCompletion=false --class WordCount.word.App /home/hadoop/word.jar
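The command above can be wrapped in a small helper so that several jobs are submitted one after another without waiting for any of them. This is only a sketch: the function name `submit_detached` and the `DRY_RUN` switch are made up for illustration, and the second call reuses the same class and JAR because the question does not say what the second step runs.

```sh
#!/bin/sh
# Sketch: submit Spark jobs to YARN in cluster mode without waiting for completion.
# submit_detached and DRY_RUN are hypothetical names, not part of Spark or EMR.

submit_detached() {
  main_class="$1"
  jar_path="$2"
  # Rebuild the positional parameters as the full spark-submit command line,
  # using the same flags as the command in the answer.
  set -- spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --executor-memory 1G \
    --num-executors 2 \
    --driver-memory 1g \
    --executor-cores 2 \
    --conf spark.yarn.submit.waitAppCompletion=false \
    --class "$main_class" \
    "$jar_path"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command, so the sketch is safe to run anywhere.
    echo "$*"
  else
    # With DRY_RUN=0 the command is actually executed (requires spark-submit).
    "$@"
  fi
}

# First job, taken from the answer.
submit_detached WordCount.word.App /home/hadoop/word.jar
# Second job: same class and JAR reused purely as a placeholder.
submit_detached WordCount.word.App /home/hadoop/word.jar
```

Run as is, it prints the two commands; setting `DRY_RUN=0` on a cluster node would execute them, and each call should return as soon as YARN accepts the application.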

    1. Instead of letting AWS EMR decide the memory allocation, try defining it yourself. Refer to this link: http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/
    2. spark.yarn.submit.waitAppCompletion=false: In YARN cluster mode, this controls whether the client waits to exit until the application completes. If set to true, the client process stays alive and reports the application's status. Otherwise, the client process exits after submission. Because the client exits right after submitting, EMR treats the step as finished and starts the next one while the first application is still running on YARN.
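To check whether two such jobs can fit on the cluster at the same time, the memory flags from the command above can be turned into a rough per-job estimate. The sketch below assumes Spark's default memory overhead (10% of the heap, with a 384 MB minimum) and ignores YARN rounding container sizes up to its allocation increment, so treat the numbers as a lower bound. The function name `container_mb` is made up for illustration.

```sh
#!/bin/sh
# Rough estimate of the YARN memory one job requests with the flags above.
# Assumes the default overhead: max(384 MB, 10% of heap). container_mb is a hypothetical helper.

container_mb() {
  heap_mb="$1"
  overhead_mb=$(( heap_mb / 10 ))
  if [ "$overhead_mb" -lt 384 ]; then
    overhead_mb=384
  fi
  echo $(( heap_mb + overhead_mb ))
}

num_executors=2                     # --num-executors 2
executor_mb=$(container_mb 1024)    # --executor-memory 1G
driver_mb=$(container_mb 1024)      # --driver-memory 1g (driver runs in YARN in cluster mode)

per_job_mb=$(( num_executors * executor_mb + driver_mb ))
echo "one job: ${per_job_mb} MB, two concurrent jobs: $(( 2 * per_job_mb )) MB"
```

If the memory YARN has available is below the two-job figure, the second application would be expected to wait in the queue no matter how it was submitted.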

    Hope this helps.
