How to configure a monopolistic FIFO application queue in YARN?

孤城傲影 2021-01-13 00:09

I need to disable parallel execution of YARN applications in a Hadoop cluster. Currently YARN runs with its default settings, so several jobs can run in parallel. I see no advantage in this

2 Answers
  • 2021-01-13 00:31

    1) Change Scheduler to FairScheduler

    Most Hadoop distributions use CapacityScheduler by default (Cloudera uses FairScheduler as its default scheduler). Add this property to yarn-site.xml:

    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    

    2) Set default Queue

    By default, the Fair Scheduler creates a queue per user. For example, if three different users submit jobs, three individual queues are created and the resources are shared among those three queues. Disable this behavior by adding this property to yarn-site.xml:

    <property>
      <name>yarn.scheduler.fair.user-as-default-queue</name>
      <value>false</value>
    </property>
    

    This ensures that all jobs submitted without an explicit queue name go into a single default queue.
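
    If some clients still name a queue explicitly when submitting, a placement policy in the Fair Scheduler allocation file (fair-scheduler.xml) should route everything into one queue regardless. The following is a minimal sketch, assuming the target queue is root.default; per the Fair Scheduler documentation, once a placement policy is present the user-as-default-queue property is ignored.

    <allocations>
      <queuePlacementPolicy>
        <!-- Only rule: place every application in root.default, -->
        <!-- whatever queue the client asked for (assumed queue name) -->
        <rule name="default" queue="default" />
      </queuePlacementPolicy>
    </allocations>

    Because there is no "specified" rule ahead of it, the requested queue name is never consulted.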

    3) Restrict Maximum Applications

    Now that all jobs are limited to one default queue, restrict the maximum number of applications that can run in that queue to 1.

    Create a file named fair-scheduler.xml under $HADOOP_CONF_DIR and add these entries:

    <allocations>
       <queueMaxAppsDefault>1</queueMaxAppsDefault>
    </allocations>
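
    As an alternative to the global default, the limit can be set on the queue itself, together with a FIFO ordering policy. This is a sketch only, assuming the queue is named default and that no other queues need the limit; maxRunningApps and schedulingPolicy are per-queue elements of the Fair Scheduler allocation file.

    <allocations>
      <!-- root.default: receives applications submitted without a queue name -->
      <queue name="default">
        <!-- Run at most one application at a time; later submissions wait -->
        <maxRunningApps>1</maxRunningApps>
        <!-- Prefer earlier-submitted applications when handing out containers -->
        <schedulingPolicy>fifo</schedulingPolicy>
      </queue>
    </allocations>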
    

    Also, add this property to yarn-site.xml, where $HADOOP_CONF_DIR stands for the path of your configuration directory:

    <property>
      <name>yarn.scheduler.fair.allocation.file</name>
      <value>$HADOOP_CONF_DIR/fair-scheduler.xml</value>
    </property>
    

    Restart YARN services after adding these properties.


    When multiple applications are submitted, the application that is ACCEPTED first becomes the active application, and the rest are queued as pending applications. The pending applications remain in the ACCEPTED state until the RUNNING application is FINISHED. The active application is allowed to use all the available resources.

    Reference: Hadoop: Fair Scheduler

  • 2021-01-13 00:51

    As I understand your question, the settings above may not be enough on their own. Try the allocation file below with your existing setup; it may give you a solution.

    <allocations>
      <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
    
      <!-- Replace each YOUR_QUEUE_NAME placeholder with your own queue name -->
      <queue name="YOUR_QUEUE_NAME">
        <weight>40</weight>
        <!-- Applications in this queue are scheduled in FIFO order -->
        <schedulingPolicy>fifo</schedulingPolicy>
      </queue>
    
      <queue name="YOUR_QUEUE_NAME">
        <weight>60</weight>
        <!-- Child queues nested under the parent queue -->
        <queue name="YOUR_QUEUE_NAME" />
        <queue name="YOUR_QUEUE_NAME" />
      </queue>
    
      <queuePlacementPolicy>
        <rule name="specified" create="false" />
        <rule name="primaryGroup" create="false" />
        <rule name="default" queue="YOUR_QUEUE_NAME" />
      </queuePlacementPolicy>
    </allocations>
    