Configure scheduling

Configure YARN scheduler

This page explains how to configure the two YARN schedulers available in Hadoop Eco: Capacity Scheduler and Fair Scheduler.

Scheduler types

Capacity Scheduler

Capacity Scheduler is YARN's default scheduler. It manages cluster resources through a tree of queues, with a share of capacity allocated to each queue.

Configuration keys for Capacity Scheduler

| Configuration key | Description |
| --- | --- |
| yarn.scheduler.capacity.maximum-applications | Maximum number of applications that can be in the pending or running state at the same time. |
| yarn.scheduler.capacity.maximum-am-resource-percent | Maximum share of cluster resources that can be used to run Application Masters (AMs), given as a fraction (for example, 0.1 for 10%). |
| yarn.scheduler.capacity.root.queues | Comma-separated names of the child queues under the root queue. |
| yarn.scheduler.capacity.root.[queue_name].maximum-am-resource-percent | Maximum share of the queue's resources that AMs can use. Overrides the cluster-wide value for this queue. |
| yarn.scheduler.capacity.root.[queue_name].capacity | Capacity allocated to the queue, as a percentage. The capacities of sibling queues must add up to 100. |
| yarn.scheduler.capacity.root.[queue_name].user-limit-factor | Multiple of the queue's capacity that a single user can use. The result still cannot exceed maximum-capacity. |
| yarn.scheduler.capacity.root.[queue_name].maximum-capacity | Upper limit on the capacity the queue can use, as a percentage. |
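
As a rough illustration of how capacity, user-limit-factor, and maximum-capacity interact, the ceiling for a single user in a queue can be sketched as follows. This is a simplified upper bound: it assumes the cluster has idle resources and ignores other limits, such as other active users in the same queue. The figures are the ones used for the prd and stg queues in the example configuration on this page.

```latex
\[
\text{single-user ceiling} = \min\bigl(\text{user-limit-factor} \times \text{capacity},\ \text{maximum-capacity}\bigr)
\]
\[
\text{stg: } \min(2 \times 20\%,\ 30\%) = 30\%
\qquad
\text{prd: } \min(1 \times 80\%,\ 100\%) = 80\%
\]
```

Under these assumptions, a single user in stg can grow beyond the queue's 20% share but stops at 30%, while a single user in prd stays within the queue's 80% share.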
Configure capacity scheduler
<configuration>

<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
</property>

<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.1</value>
</property>

<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>prd,stg</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.capacity</name>
<value>80</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.capacity</name>
<value>20</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.user-limit-factor</name>
<value>1</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.user-limit-factor</name>
<value>2</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.maximum-capacity</name>
<value>30</value>
</property>

</configuration>
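
Because queues form a tree, a queue such as prd can itself be split into child queues. The fragment below is a minimal sketch of that idea. The child queue names batch and adhoc and their percentages are hypothetical and are not part of the configuration above. Child capacities are percentages of the parent queue and add up to 100 at each level.

```xml
<!-- Hypothetical child queues under root.prd; names and values are illustrative only -->
<property>
<name>yarn.scheduler.capacity.root.prd.queues</name>
<value>batch,adhoc</value>
</property>

<!-- 60% of prd; with prd at 80% of the cluster, this is 48% of the cluster -->
<property>
<name>yarn.scheduler.capacity.root.prd.batch.capacity</name>
<value>60</value>
</property>

<!-- 40% of prd; with prd at 80% of the cluster, this is 32% of the cluster -->
<property>
<name>yarn.scheduler.capacity.root.prd.adhoc.capacity</name>
<value>40</value>
</property>
```

These properties would sit inside the same configuration element as the other queue properties.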

Fair Scheduler

Fair Scheduler shares resources so that, over time, every running job receives a roughly equal share. When a new job is submitted to a queue, the scheduler rebalances resources so that they are spread evenly across all jobs in that queue.

Configuration keys for Fair Scheduler

| Configuration key | Description |
| --- | --- |
| yarn.scheduler.fair.allocation.file | Path to the Fair Scheduler allocation (queue configuration) file. |
| yarn.scheduler.fair.user-as-default-queue | Whether to use the submitting user's name as the queue name when no queue is specified. If false, such jobs go to the default queue. |
| yarn.scheduler.fair.preemption | Whether to enable preemption. |
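
These keys are set in yarn-site.xml. A minimal sketch follows. The file path is a placeholder chosen for illustration, and the true and false values are examples rather than recommendations.

```xml
<!-- Path is a hypothetical placeholder; point it at your own allocation file -->
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>

<!-- false: jobs submitted without a queue name go to the default queue, not a per-user queue -->
<property>
<name>yarn.scheduler.fair.user-as-default-queue</name>
<value>false</value>
</property>

<!-- true: allow the scheduler to preempt containers -->
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
```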
Fair Scheduler configuration example
<?xml version="1.0"?>
<allocations>
<queue name="dev">
<minResources>10000 mb,10vcores</minResources>
<maxResources>60000 mb,30vcores</maxResources>
<maxRunningApps>50</maxRunningApps>
<maxAMShare>1.0</maxAMShare>
<weight>2.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
</queue>

<queue name="prd">
<minResources>10000 mb,10vcores</minResources>
<maxResources>60000 mb,30vcores</maxResources>
<maxRunningApps>100</maxRunningApps>
<maxAMShare>0.1</maxAMShare>
<weight>2.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
<queue name="sub_prd">
<aclSubmitApps>charlie</aclSubmitApps>
<minResources>5000 mb,0vcores</minResources>
</queue>
</queue>


<user name="sample_user">
<maxRunningApps>30</maxRunningApps>
</user>
<userMaxAppsDefault>5</userMaxAppsDefault>

<queueMaxAMShareDefault>0.2</queueMaxAMShareDefault>

<queuePlacementPolicy>
<rule name="specified"/>
<rule name="primaryGroup" create="false"/>
<rule name="default" queue="dev"/>
</queuePlacementPolicy>
</allocations>

Change scheduler

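The default scheduler in Hadoop Eco is Capacity Scheduler. To switch to Fair Scheduler, set yarn.resourcemanager.scheduler.class in yarn-site.xml to the Fair Scheduler class and restart the service. Define the property only once, using one of the two values shown below.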

Scheduler change example
<!-- Capacity Scheduler -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- Fair Scheduler -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

Update scheduler configuration

You can update the scheduler's queue configuration without restarting the ResourceManager. After modifying the XML configuration file, run the following command:

Update scheduler configuration example
yarn rmadmin -refreshQueues