Configure security features
Hadoop Eco can apply security features using Apache Ranger and Kerberos.
HDE-1.0.1 does not support security features. Components that do not support Kerberos may not function properly once it is enabled, and security features may cause issues when integrating with Data Catalog.
Kerberos
Kerberos is a user authentication protocol developed by MIT.
When Kerberos is enabled, a KDC (Key Distribution Center) is installed and Kerberos authentication is applied to Hadoop, Hive, and HBase.
After that, you must authenticate with Kerberos before using Hadoop, Hive, or HBase. A keytab is created at '/etc/hadoopeco.keytab', and you can list the default users (principals) it contains.
# Check default users
klist -kt /etc/hadoopeco.keytab
# Authentication method
kinit -kt /etc/hadoopeco.keytab hdfs/{host_name}@{realm_name}
For instructions on installing Kerberos, see the Installation and integration section.
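The kinit command in this section takes a principal of the form service/host@REALM. As a minimal sketch, the helpers below assemble that string and wrap the call. The function names hde_principal and hde_kinit are illustrative and not part of Hadoop Eco; the default keytab path is the one given above.

```shell
# Build a Kerberos service principal: {service}/{host_name}@{realm_name}.
# $1 = service (e.g. hdfs), $2 = host name, $3 = realm name
hde_principal() {
    printf '%s/%s@%s\n' "$1" "$2" "$3"
}

# Authenticate with a keytab, then print the resulting ticket cache.
# $4 (optional) overrides the default keytab path.
hde_kinit() {
    keytab="${4:-/etc/hadoopeco.keytab}"
    kinit -kt "$keytab" "$(hde_principal "$1" "$2" "$3")" && klist
}

# Example (placeholders as in the commands above; run on a cluster node):
# hde_kinit hdfs {host_name} {realm_name}
```

The exact host name and realm to pass can be read from the output of klist -kt /etc/hadoopeco.keytab.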
Change component ports
When Kerberos is configured, the default ports of certain components are changed. The list is as follows:
HDE Version | Component | Original Port | Changed Port | Remarks |
---|---|---|---|---|
HDE-1.x | HDFS Namenode | 50070 | 50470 | Access via HTTPS |
HDE-1.x | HDFS SecondaryNamenode | 50090 | 50091 | |
HDE-1.x | HDFS Datanode | 50075 | 50475 | Access via HTTPS |
HDE-2.x | HDFS Namenode | 9870 | 9871 | Access via HTTPS |
HDE-2.x | HDFS SecondaryNamenode | 9868 | 9869 | |
HDE-2.x | HDFS Datanode | 9864 | 9867 | Access via HTTPS |
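When scripts need to reach these web endpoints, the port table can be encoded as a small lookup. The sketch below is one possible way to do that; the function name hde_secure_port and its argument spelling are assumptions made for illustration, and the port values are copied from the table.

```shell
# Print the port a component uses once Kerberos is configured.
# $1 = HDE major version (1 or 2)
# $2 = namenode | secondarynamenode | datanode
hde_secure_port() {
    case "$1:$2" in
        1:namenode)          echo 50470 ;;
        1:secondarynamenode) echo 50091 ;;
        1:datanode)          echo 50475 ;;
        2:namenode)          echo 9871 ;;
        2:secondarynamenode) echo 9869 ;;
        2:datanode)          echo 9867 ;;
        *) echo "unknown version/component: $1 $2" >&2; return 1 ;;
    esac
}

# Example: HTTPS URL of the NameNode web UI on an HDE-2.x cluster.
# {host_name} is a placeholder for the master node address.
# echo "https://{host_name}:$(hde_secure_port 2 namenode)/"
```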
Ranger
Hadoop Eco uses Apache Ranger to apply ACLs and to audit access.
The default ID for Ranger is admin, and the password is the admin password set during cluster creation. The password must satisfy the following rules:
- At least 8 characters
- At least one uppercase letter, one lowercase letter, and one number
- The special characters \, ', ", and ` are not supported
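The rules above can be checked before a cluster is created. The following is a sketch only: ranger_password_ok is a hypothetical helper, and it assumes that the unsupported characters are exactly backslash, single quote, double quote, and backtick.

```shell
# Return 0 if the password satisfies the rules listed above, 1 otherwise.
ranger_password_ok() {
    pw="$1"
    # At least 8 characters
    [ "${#pw}" -ge 8 ] || return 1
    # At least one uppercase letter, one lowercase letter, and one number
    case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac
    case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac
    case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac
    # Assumed unsupported set: backslash, single quote, double quote, backtick
    case "$pw" in *\\*|*\'*|*\"*|*\`*) return 1 ;; esac
    return 0
}

# Example:
# ranger_password_ok 'rangerAdmin1' && echo "password accepted"
```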
Access Ranger
Ranger is installed on the master node in a single-node setup, or on master node 3 in a high-availability setup, and is accessible on port 6080.
- Check the security group of the VM where Ranger is installed and add the public IP.
- Log in with admin and rangerAdmin1.
- By default, the HDFS, YARN, and Hadoop SQL plugins are supported.
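Besides the web UI, access can be checked from a terminal. The sketch below assumes that port 6080 serves plain HTTP and that the Ranger public REST API (v2) is enabled on the cluster; the helper names are illustrative, and the host and password arguments are values you supply.

```shell
# Base URL of the Ranger Admin UI and REST API (port 6080 per this section).
# $1 = address of the node where Ranger is installed
ranger_base_url() {
    printf 'http://%s:6080\n' "$1"
}

# List the services registered in Ranger using HTTP basic authentication.
# $1 = Ranger host, $2 = admin password set during cluster creation
ranger_list_services() {
    curl -sS -u "admin:$2" -H 'Accept: application/json' \
        "$(ranger_base_url "$1")/service/public/v2/api/service"
}

# Example (placeholders; requires the security group rule described above):
# ranger_list_services {ranger_host} {admin_password}
```

A JSON list in the response suggests that both the network path and the credentials are working; a timeout usually points to the security group rule.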
Ranger policies
After you modify a policy in Ranger, each plugin agent must fetch the updated policy before it takes effect.
In Plugin Status, you can confirm that the policy has been downloaded and is in the Active state, and then verify that operations are executed according to it.
After modifying an HDFS policy, you can confirm in the Access Enforcer field that ranger-acl is applied and that user access is restricted.
Installation and integration
Installation method
To install Kerberos and Ranger when creating a Hadoop Eco cluster, add the following to the Cluster Configuration Settings in step 3. In kerberos-setting, enabled specifies whether Kerberos is enabled, passwd is the KDC password, and realm is the default realm name. In ranger-setting, enabled specifies whether Ranger is enabled.
{
  "configurations":
  [
    {
      "classification": "kerberos-setting",
      "properties":
      {
        "enabled": true,
        "passwd": "bigadmin",
        "realm": "HADOOP.ECO"
      }
    },
    {
      "classification": "ranger-setting",
      "properties":
      {
        "enabled": true
      }
    }
  ]
}
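If the configuration is produced by a script rather than typed by hand, a heredoc is one simple way to fill in the realm and KDC password. This is a sketch under assumptions: hde_security_config is a hypothetical helper, and the values are substituted without JSON escaping, so they must not contain double quotes or backslashes.

```shell
# Print the cluster configuration JSON that enables Kerberos and Ranger.
# $1 = realm name, $2 = KDC password (no " or \ characters; not escaped)
hde_security_config() {
    cat <<EOF
{
  "configurations": [
    {
      "classification": "kerberos-setting",
      "properties": { "enabled": true, "passwd": "$2", "realm": "$1" }
    },
    {
      "classification": "ranger-setting",
      "properties": { "enabled": true }
    }
  ]
}
EOF
}

# Example:
# hde_security_config HADOOP.ECO bigadmin > security-config.json
```

Piping the output through a JSON parser, for example python3 -m json.tool, is a quick way to confirm the result is well-formed before pasting it into the console.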
Integration method
Ranger relies on Kerberos to authenticate users.
To verify that policies are enforced, add users on each node and generate a Kerberos keytab for them.
Add users
To authenticate with Kerberos, add the user to the desired group on the master node and to the hadoop group on each worker node.
Then refresh the NameNode and ResourceManager so that they pick up the new user and group mappings.
# master node
groupadd {group_name}
useradd {user_name} -g {group_name}
usermod -G {group_name} {user_name}
# worker node
useradd {user_name} -g hadoop
usermod -G hadoop {user_name}
# Apply to NameNode
kinit -kt /etc/hadoopeco.keytab hdfs/{host_name}@{realm_name}
hdfs dfsadmin -refreshServiceAcl
hdfs dfsadmin -refreshUserToGroupsMappings
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
# Apply to ResourceManager
kinit -kt /etc/hadoopeco.keytab yarn/{host_name}@{realm_name}
yarn rmadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshUserToGroupsMappings
yarn rmadmin -refreshAdminAcls
yarn rmadmin -refreshServiceAcl
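After the refresh commands, the hdfs groups command can be used to see which groups the NameNode resolves for a user. The helper below is a sketch that parses one line of that output, assuming the usual "user : group1 group2" layout; groups_line_has is an illustrative name, not a Hadoop command.

```shell
# Return 0 if the "user : group1 group2 ..." line on stdin lists group $1.
groups_line_has() {
    # Drop everything up to the colon, split on spaces, match the whole word.
    sed 's/^[^:]*://' | tr ' ' '\n' | grep -qxF -- "$1"
}

# Example (run on a cluster node with a valid Kerberos ticket):
# hdfs groups {user_name} | groups_line_has hadoop && echo "mapping refreshed"
```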
Keytab issuance and login
Register the user's principal in the KDC and issue a keytab.
# Log in to kadmin
sudo kadmin.local
# Register user and issue keytab in kadmin CLI
addprinc -randkey {user_name}@{realm_name}
xst -k {keytab_file_name} {user_name}@{realm_name}
After issuing the keytab, verify that the file has been created at the specified location.
# Check the list of users registered in the keytab
klist -kt {keytab_file_name}
# Log in with the account in the keytab
kinit -kt {keytab_file_name} {user_name}@{realm_name}
# Check the authenticated user
klist
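The interactive kadmin steps above can also be run non-interactively with the -q option of kadmin.local, which executes a single query and exits. The sketch below chains registration, keytab export, and a listing of the result; user_principal and issue_keytab are hypothetical helper names.

```shell
# Build a user principal: {user_name}@{realm_name}.
user_principal() {
    printf '%s@%s\n' "$1" "$2"
}

# Register the principal, export its keytab, and list the keytab entries.
# $1 = user name, $2 = realm name, $3 = keytab file name
issue_keytab() {
    principal="$(user_principal "$1" "$2")"
    sudo kadmin.local -q "addprinc -randkey $principal" &&
    sudo kadmin.local -q "xst -k $3 $principal" &&
    sudo klist -kt "$3"
}

# Example (placeholders as in the commands above; run on the KDC node):
# issue_keytab {user_name} {realm_name} {keytab_file_name}
```

Because the keytab is written by a process running under sudo, it is owned by root; the user typically needs read access to it before running kinit -kt.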
Verify functionality
After registering the user and running a task, you can confirm that the HDFS, YARN, and Hadoop SQL ACLs are applied.
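For HDFS, one way to check enforcement from the command line is to run a listing as the test user and look for a denial. The sketch below assumes that a denied request prints an error containing the text "Permission denied"; is_access_denied is an illustrative helper and {path} is a placeholder for a path covered by the policy.

```shell
# Return 0 if the text on stdin contains an HDFS "Permission denied" error.
is_access_denied() {
    grep -q 'Permission denied'
}

# Example (run as the test user after kinit):
# hdfs dfs -ls {path} 2>&1 | is_access_denied && echo "access restricted"
```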