Configure scheduling

Configure YARN scheduler

Below are the methods for configuring the two main YARN schedulers: the Capacity Scheduler and the Fair Scheduler.

Define scheduler types

Capacity Scheduler

The Capacity Scheduler is the default scheduler in YARN. It manages YARN resources by declaring a tree structure of queues and allocating available resources to each queue.

Capacity Scheduler configuration keys
Configuration key | Description
yarn.scheduler.capacity.maximum-applications | The maximum number of applications that can be in the pending or running state at the same time.
yarn.scheduler.capacity.maximum-am-resource-percent | The maximum share of cluster resources that can be allocated to Application Masters (AMs), specified as a fraction (0.1 = 10%).
yarn.scheduler.capacity.root.queues | Names of the child queues registered under the root queue, separated by commas.
yarn.scheduler.capacity.root.[queue_name].maximum-am-resource-percent | The maximum share of the queue's resources that AMs can use. Overrides the cluster-wide value for this queue.
yarn.scheduler.capacity.root.[queue_name].capacity | The queue's guaranteed capacity, as a percentage of its parent queue. The capacities of sibling queues must add up to 100.
yarn.scheduler.capacity.root.[queue_name].user-limit-factor | The multiple of the queue's capacity that a single user can use. The result cannot exceed maximum-capacity.
yarn.scheduler.capacity.root.[queue_name].maximum-capacity | The upper limit, as a percentage, on the resources the queue can use, including idle capacity borrowed from other queues.
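
How capacity, user-limit-factor, and maximum-capacity interact is easier to see with numbers. The following is a simplified worked example using the prd and stg values from the configuration on this page. It assumes a single active user in the queue and idle capacity elsewhere in the cluster, and it ignores other limits such as minimum-user-limit-percent.

```latex
\[
\text{single-user ceiling} = \min(\text{capacity} \times \text{user-limit-factor},\ \text{maximum-capacity})
\]
\[
\text{stg: } \min(20\% \times 2,\ 30\%) = 30\%
\qquad
\text{prd: } \min(80\% \times 1,\ 100\%) = 80\%
\]
```

Under these assumptions, a single user in stg is stopped by maximum-capacity before reaching twice the queue's capacity, while a single user in prd cannot grow beyond the queue's own 80% even though the queue as a whole may reach 100%.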
Capacity Scheduler Configuration
<configuration>

<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
</property>

<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.1</value>
</property>

<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>prd,stg</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.capacity</name>
<value>80</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.capacity</name>
<value>20</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.user-limit-factor</name>
<value>1</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.user-limit-factor</name>
<value>2</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.maximum-capacity</name>
<value>100</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.stg.maximum-capacity</name>
<value>30</value>
</property>

</configuration>
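
Because queues form a tree, a queue such as prd can itself have child queues. The fragment below is a minimal sketch of that idea, not part of the original configuration: the child queue names batch and adhoc and their capacities are hypothetical. The properties would go inside the same configuration element as the other queue settings.

```xml
<!-- Sketch only: "batch" and "adhoc" are hypothetical child queues of prd -->
<property>
<name>yarn.scheduler.capacity.root.prd.queues</name>
<value>batch,adhoc</value>
</property>

<!-- Child capacities are percentages of the parent queue (prd) and add up to 100 -->
<property>
<name>yarn.scheduler.capacity.root.prd.batch.capacity</name>
<value>70</value>
</property>

<property>
<name>yarn.scheduler.capacity.root.prd.adhoc.capacity</name>
<value>30</value>
</property>
```

With prd at 80% of the cluster, these values would give batch a guaranteed 56% and adhoc a guaranteed 24% of total cluster resources.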

Fair Scheduler

The Fair Scheduler aims to give all running jobs an equal share of resources over time. When a new job is submitted to a queue, the scheduler rebalances resources so that they are divided evenly across the running jobs.

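Fair Scheduler configuration key | Description
yarn.scheduler.fair.allocation.file | The path to the allocation file, an XML file that defines the queues and their properties.
yarn.scheduler.fair.user-as-default-queue | Whether to use the submitting user's name as the queue name when no queue is specified. If false, such applications go to the shared default queue.
yarn.scheduler.fair.preemption | Whether to enable preemption, which reclaims resources from queues that are running above their fair share.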
Fair Scheduler Configuration Example
<?xml version="1.0"?>
<allocations>
<queue name="dev">
<minResources>10000 mb,10vcores</minResources>
<maxResources>60000 mb,30vcores</maxResources>
<maxRunningApps>50</maxRunningApps>
<maxAMShare>1.0</maxAMShare>
<weight>2.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
</queue>

<queue name="prd">
<minResources>10000 mb,10vcores</minResources>
<maxResources>60000 mb,30vcores</maxResources>
<maxRunningApps>100</maxRunningApps>
<maxAMShare>0.1</maxAMShare>
<weight>2.0</weight>
<schedulingPolicy>fair</schedulingPolicy>
<queue name="sub_prd">
<aclSubmitApps>charlie</aclSubmitApps>
<minResources>5000 mb,0vcores</minResources>
</queue>
</queue>

<user name="sample_user">
<maxRunningApps>30</maxRunningApps>
</user>
<userMaxAppsDefault>5</userMaxAppsDefault>

<queueMaxAMShareDefault>0.2</queueMaxAMShareDefault>

<queuePlacementPolicy>
<rule name="specified"/>
<rule name="primaryGroup" create="false"/>
<rule name="default" queue="dev"/>
</queuePlacementPolicy>
</allocations>
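
The keys in the Fair Scheduler table are set in yarn-site.xml, not in the allocation file itself. The fragment below is a minimal sketch of how they might be set. The file path is an assumption for illustration and the boolean values are examples only, so adjust both to your cluster.

```xml
<!-- Sketch only: the path below is a hypothetical location for the allocation file -->
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>

<!-- false: applications submitted without a queue name are not placed in a per-user queue -->
<property>
<name>yarn.scheduler.fair.user-as-default-queue</name>
<value>false</value>
</property>

<!-- true: allow the scheduler to preempt containers to restore fair shares -->
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
```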

Change scheduler

The default scheduler for Hadoop Eco is the Capacity Scheduler. To switch to the Fair Scheduler, modify the yarn-site.xml configuration and restart the service.

Scheduler Change Example
<!-- Capacity Scheduler (default) -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- Fair Scheduler: set this value instead to switch schedulers -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

Update scheduler configuration

Changes to individual queue settings can be applied while the ResourceManager is running, without a restart. After editing the settings in the XML file, run the following command to apply the changes.

Scheduler Configuration Update Example
yarn rmadmin -refreshQueues
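
As an illustration of a change that could be applied this way, the sketch below shifts ten percentage points of guaranteed capacity from prd to stg. It assumes the Capacity Scheduler is in use and that the queue settings live in capacity-scheduler.xml. The new values are hypothetical.

```xml
<!-- Sketch only: edited values in capacity-scheduler.xml before running the refresh command -->
<property>
<name>yarn.scheduler.capacity.root.prd.capacity</name>
<value>70</value>
</property>

<!-- Sibling capacities still add up to 100, and stg stays within its maximum-capacity of 30 -->
<property>
<name>yarn.scheduler.capacity.root.stg.capacity</name>
<value>30</value>
</property>
```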