Hadoop Eco security examples

YARN is the unified resource management platform for Hadoop: a framework for job scheduling and cluster resource management. By default, YARN opens port 8088 and accepts job submissions through REST API calls on that port. Jobs submitted this way run as the user dr.who by default.
ACLs are not checked during this process, so an external attacker who can reach an exposed port 8088 can perform malicious actions, such as executing commands on the server or downloading malicious scripts.

Image User: Default job executed as dr.who
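
To make the exposure described above concrete, the following is a minimal sketch of a reachability check that you could run from a machine outside the cluster network. It only opens a TCP connection and builds the URL of the ResourceManager cluster-info REST resource; it does not submit anything. The function names and the address 203.0.113.10 (a documentation-only IP) are assumptions for illustration, not part of Hadoop Eco.

```python
import socket

# Default ResourceManager port named in the text above.
DEFAULT_RM_PORT = 8088


def cluster_info_url(host: str, port: int = DEFAULT_RM_PORT) -> str:
    """Build the URL of the ResourceManager cluster-info REST resource."""
    return f"http://{host}:{port}/ws/v1/cluster/info"


def is_port_reachable(host: str, port: int = DEFAULT_RM_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, or timed out: treat as not reachable.
        return False


if __name__ == "__main__":
    node_ip = "203.0.113.10"  # hypothetical public IP of a cluster node
    if is_port_reachable(node_ip):
        print("Port 8088 is reachable; REST endpoint:", cluster_info_url(node_ip))
    else:
        print("Port 8088 is not reachable from this machine.")
```

If the port is reachable from an untrusted network, one of the measures described in the rest of this page is needed.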

info

Any Hadoop Eco cluster running in a public cloud is exposed to this kind of malicious external attack.
Therefore, either restrict REST API calls entirely or apply appropriate security measures when using the REST API.

Restrict REST API calls

To defend against malicious external attacks, you can restrict job execution through the REST API. Hadoop Eco version 2.10.1 and later provide an option that controls job submissions through the REST API. By default, Hadoop Eco sets this option to true, which allows REST API calls.

  1. To prevent REST API calls, modify the yarn.webapp.enable-rest-app-submissions setting to false in the /etc/hadoop/conf/yarn-site.xml file and restart the resource manager.

    Restrict REST API job execution
    <property>
    <name>yarn.webapp.enable-rest-app-submissions</name>
    <value>false</value>
    </property>
    Parameter: yarn.webapp.enable-rest-app-submissions
    Description: Whether to allow REST API calls
    - true (default): Allows calls
    - false: Disallows calls
  2. When this option is set to false, the following message appears if you try to execute a job via the REST API.

    Image Job execution restriction option
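
As a way to confirm the setting after editing the file, the sketch below reads a Hadoop-style configuration document and reports whether REST submissions are allowed. It is an illustration only: the helper names are hypothetical, and a missing entry is treated as true because that is the default stated above.

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "yarn.webapp.enable-rest-app-submissions"


def get_property(xml_text, name):
    """Return the value of a <property> entry by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def rest_submissions_allowed(xml_text):
    """Treat a missing entry as allowed, matching the stated default (true)."""
    value = get_property(xml_text, PROPERTY_NAME)
    return value is None or value.lower() == "true"


SAMPLE = """<configuration>
  <property>
    <name>yarn.webapp.enable-rest-app-submissions</name>
    <value>false</value>
  </property>
</configuration>"""

print(rest_submissions_allowed(SAMPLE))  # False
```

To check a real cluster, pass the contents of /etc/hadoop/conf/yarn-site.xml (read in binary mode, for example with `open(path, "rb").read()`) instead of SAMPLE. The file on disk only takes effect after the resource manager has been restarted.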

Security measures when using REST API

If users must use the REST API for job execution, apply one of the following methods to avoid security vulnerabilities.

Method 1: Keep port 8088 private

The first method is to keep port 8088 closed to the outside. Hadoop Eco does not use public IPs by default, so as long as no public IP is assigned to the VM, external access is not possible.

  1. Go to KakaoCloud Console > Analytics > Hadoop Eco.
  2. In the Cluster List, select the cluster where you want to execute the job.
  3. Click the Node List tab and check the node list for the selected cluster.
  4. Verify that no Public IP is assigned to any node in the node list.

Method 2: Modify inbound rules

The second method is to modify the inbound rules. Replace the default rule, which is open to 0.0.0.0/0, with one that admits only the machine you will connect from: in the security group's inbound rules, set the packet source to the public IP of that host.

  1. Go to KakaoCloud Console > Analytics > Hadoop Eco.

  2. In the Cluster List, select the cluster where you want to execute the job.

  3. Click the Node List tab and check the node list for the selected cluster.

  4. Select the Node instance name from the node list.

  5. On the instance detail page of the selected node, select the Security tab.

  6. In the Inbound tab, check the Packet source.

  7. Click Applied security group to go to the Security tab of the VPC.

  8. In the inbound rules tab, click Manage inbound rules to change the policy.

    Image Check Packet source in the Inbound tab
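
The rule change above can be reasoned about with a small sketch that uses Python's standard ipaddress module: it flags a packet source that is open to every address and checks whether a given client IP falls inside the allowed sources. The function names and the example addresses (documentation-only ranges) are assumptions for illustration; the sketch does not read or modify any security group.

```python
import ipaddress
from typing import Iterable


def is_open_to_world(cidr: str) -> bool:
    """True if the CIDR covers every address, for example 0.0.0.0/0."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def source_allowed(client_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """True if client_ip falls inside at least one allowed packet source."""
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )


# Hypothetical packet sources: the default open rule and a single-host rule.
print(is_open_to_world("0.0.0.0/0"))                          # True
print(is_open_to_world("203.0.113.10/32"))                    # False
print(source_allowed("203.0.113.10", ["203.0.113.10/32"]))    # True
print(source_allowed("198.51.100.7", ["203.0.113.10/32"]))    # False
```

A /32 source admits exactly one IPv4 address, which is the narrowest rule you can set for a single client host.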

Method 3: Change resource manager port

The third method is to change the resource manager port. Use it only if you absolutely must expose the resource manager externally, because changing the port may cause operations such as scaling to fail.
Modify the following property in yarn-site.xml, and then restart the resource manager to apply the new access port. The value takes the form host:port; set the port to the one you want to use.

<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>127.0.0.1:8088</value>
</property>