Manage cluster
A cluster is a collection of nodes provisioned as virtual machines. This topic describes how to manage clusters in the Hadoop Eco service.
View cluster list
You can check the list of currently created clusters and their basic information.
- Go to KakaoCloud console > Analytics > Hadoop Eco.
- In the Cluster menu, view the cluster list.
Category | Description |
---|---|
Cluster filter | Query specific clusters through filters or search by keyword. - Selected filter items operate with an AND condition, while keyword searches operate with an OR condition. |
Name | The user-defined cluster name. |
Status | The status of the cluster. - Clusters with a Terminated status (including those deleted due to errors) are displayed for 90 days. - For detailed descriptions of each status, refer to Cluster lifecycle. |
Type | Core Hadoop, HBase, Trino, Dataflow |
Node count | The total number of instances used in the cluster configuration. |
Open API key status | Status of the Open API key. |
Creation date | The date the cluster was created. |
Uptime | The duration the cluster has been operating. |
More options | - Clone cluster: clone a cluster regardless of its status. - Issue Open API key: issue an API key (visible when the Open API status is Available). - Adjust worker node count: increase or decrease the number of worker nodes. - Delete cluster: delete the cluster. |
View cluster details
You can check detailed information about the cluster, node details, node list, action logs, and monitoring data.
- In the Cluster menu, select the cluster for which you want to view details.
- Check the information on the cluster detail page.
Category | Description |
---|---|
Cluster status | The status of the cluster. |
Total instance uptime | Total uptime of the instances. |
Creator | The user who requested cluster creation. |
Creation date | The date the cluster was created. |
Check cluster information
You can review cluster tasks, task scheduling, and service integration details.
Category | Description |
---|---|
Cluster information | Overall cluster-related information. |
Cluster detailed settings (optional) | Detailed cluster settings - HDFS replication factor - HDFS block size - Cluster configuration settings |
Service integration (optional) | Service integration details - Monitoring agent installation status - Data Catalog integration - MySQL database name - MySQL database ID |
Task scheduling (optional) | Task scheduling details - Task type - Task completion actions - Executable file - Scheduling log file storage |
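To make the two detailed settings above concrete, the sketch below estimates how they shape storage for a single file. The function name and the defaults (128 MiB block size, replication factor 3) are Hadoop's common defaults used for illustration, not values read from the console:

```python
import math

def hdfs_footprint(file_size_bytes: int,
                   block_size_bytes: int = 128 * 1024 * 1024,  # common HDFS default
                   replication_factor: int = 3):               # common HDFS default
    """Estimate block count and raw capacity used by one HDFS file.

    A file is split into ceil(size / block_size) blocks, and each
    block is stored replication_factor times across the datanodes.
    """
    blocks = math.ceil(file_size_bytes / block_size_bytes)
    raw_bytes = file_size_bytes * replication_factor
    return blocks, raw_bytes

# A 1 GiB file with the defaults: 8 blocks, 3 GiB of raw capacity.
blocks, raw = hdfs_footprint(1024 ** 3)
print(blocks, raw)  # → 8 3221225472
```

Raising the replication factor improves fault tolerance but multiplies raw capacity use; a larger block size reduces NameNode metadata for big files.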
Node information
You can view information about master and worker nodes.
Category | Description |
---|---|
Node type | Node type (master or worker) |
Instance name | Click the instance name to navigate to the detailed Virtual Machine page and use available actions. |
Instance ID | The ID of the node instance. |
Type | The type of the node instance. |
Private IP | The private IP of the node instance. |
Public IP | The public IP of the node instance. |
Status | The status of the node instance. |
Check monitoring
You can view detailed monitoring metrics for each node in the Hadoop Eco cluster, including HDFS, YARN, Namenode, Resource Manager, Nodes, and HBase.
Monitoring for the Trino cluster type will be supported in the future.
- In the Cluster menu, select the cluster you want to monitor.
- Click the Monitoring tab and check the monitoring metrics of the selected cluster.
Category | Description |
---|---|
Data period | The period for monitoring metrics. - Period: 1 hour (default) / 3 hours / 12 hours / 1 day / 7 days |
Query items | Metrics to monitor. - Default items (without agent installation): HDFS / YARN / Namenode / ResourceManager / HBase / Kafka - With monitoring agent installed: default items / Nodes - All items are queried by default, and multiple items can be selected. |
Auto-refresh interval | Set auto-refresh intervals. - Interval: No auto-refresh (default) / 10 seconds / 30 seconds / 1 minute / 5 minutes |
Manual refresh | Click to refresh monitoring results manually. |
Monitoring data provided
Category | Monitoring data | Description |
---|---|---|
HDFS | HDFS usage (%) | HDFS usage percentage. |
 | Active datanodes (count) | Number of active datanodes. |
YARN | YARN memory usage (Bytes) | Total available memory and memory in use. |
 | YARN CPU usage (%) | Total vCore usage and vCores in use. |
 | Active nodemanagers (count) | Number of active nodemanagers. |
 | Active applications (count) | Number of active applications. |
Namenode | Heap size (Bytes) | Total available heap memory and heap memory in use. |
ResourceManager | Heap size (Bytes) | Total available heap memory and heap memory in use. |
Nodes | Node CPU usage (%) | CPU usage per node instance (available if the agent is installed). |
 | Node memory usage (%) | Memory usage per node instance (available if the agent is installed). |
HBase | HMaster heap size (Bytes) | Total available heap memory and heap memory in use (HBase cluster type only). |
 | Active region servers (count) | Number of active region servers (HBase cluster type only). |
Kafka | Active brokers (count) | Number of active Kafka brokers (Dataflow cluster type only). |
 | Created topics (count) | Number of created topics (Dataflow cluster type only). |
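Several of the metrics above (HDFS usage, CPU and memory usage) are percentages derived from raw used/capacity byte counts. A minimal sketch of that arithmetic, with hypothetical sample values rather than data from a real cluster:

```python
def usage_percent(used_bytes: int, capacity_bytes: int) -> float:
    """Turn raw used/total byte counts into a percentage metric
    like 'HDFS usage (%)' or 'Node memory usage (%)'."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used_bytes / capacity_bytes

# Hypothetical sample: 1.5 TiB used out of 10 TiB of datanode capacity.
print(round(usage_percent(int(1.5 * 2**40), 10 * 2**40), 1))  # → 15.0
```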
Check action logs
If you created the Hadoop Eco cluster using an Open API key, you can check the action logs.
- In the Cluster menu, select the cluster to view action logs.
- Click the Action logs tab and review the action log details.
Category | Description |
---|---|
Action log filter | Search action logs by selecting filter attributes or entering keywords. |
Request ID | The request ID of the cluster. |
Status | The status of the cluster. |
Task result | Result of the cluster task. |
User | Email of the cluster user. |
Uptime / Instance uptime | Cluster operating time. - Hovering over the time displays the cluster request time. |
Action log details | Click the icon to view detailed information about each action log. - Worker node count: Number of configured worker nodes. - Worker volume size: Volume size of the worker node. - HDFS replication factor: Number of HDFS (Hadoop Distributed File System) replications. - HDFS block size: HDFS block size. - User task options: User-specified task options. |
Clone cluster
You can clone a cluster with the same configuration.
Clusters can be cloned regardless of their status, but only clusters whose data is still retained (up to 90 days after creation) can be cloned. Clusters running unsupported legacy versions cannot be cloned.
- In the Cluster menu, select the [More] icon for the cluster you want to clone > Clone cluster.
- In the Clone cluster popup, verify the information of the cluster to be cloned, choose whether to replicate task scheduling settings, and click [Confirm].
- Task scheduling settings are only visible if the original cluster is Core Hadoop.
Category | Description |
---|---|
Cluster configuration | Cluster version/type/availability |
Cluster availability | Provides Standard and High availability types for operational stability. |
Master node settings | Node instance type, disk volume, hostname |
Worker node settings | Node instance type, disk volume |
Delete cluster
You can delete clusters that are no longer in use.
Deleted cluster resources are fully released after termination and cannot be recovered. HDFS data will also be completely deleted along with the cluster resources and cannot be restored.
- In the Cluster menu, select the [More] icon for the cluster you want to delete > Delete cluster.
- In the Delete cluster popup, verify the cluster to be deleted, enter "permanent delete", and click [Delete].
API Key
Get new API key
If the cluster type is Core Hadoop, you can issue a Hadoop Eco cluster API key under the following conditions.
Conditions for issuing Hadoop Eco cluster API key when cluster type is Core Hadoop
- Task scheduling is activated as Hive or Spark.
- The Hadoop Eco cluster status is Terminated(User Command) or Terminated(User).
When the Open API status is active, two cluster status values may appear:
- Pending: The state where Hadoop Eco creation can be requested after Open API activation.
- Processing: The state where Hadoop Eco creation and job scheduling are in progress after Open API activation.
- In the Cluster menu, select the cluster for which you want to issue an API key. You can filter by Open API status or search for the desired cluster.

Open API status | Description |
---|---|
Not available | Open API authentication not displayed |
Available | Open API authentication displayed |
In progress | Open API authentication not selectable |

- Click the [More] icon for the cluster > Open API authentication.
- In the Open API authentication popup, click [Issue].
- Copy the issued API key and store it securely.
The API key cannot be viewed again later, so be sure to copy it from the Open API authentication popup and store it securely. If lost, you can reissue a new key by clicking the [Reissue] button.
- If there is an existing cluster with the same name, an Open API key cannot be issued.
- If Open API cluster creation is no longer needed, click the [Delete] button to delete the API key.
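Because the key cannot be viewed again after the popup closes, scripts that call the Open API should read it from an environment variable or secret store rather than hard-coding it. A minimal shell sketch; the variable name and header name below are illustrative placeholders, not part of the KakaoCloud API:

```shell
# Illustrative variable name -- populate it from wherever your secret manager keeps the key.
export HADOOP_ECO_API_KEY="paste-the-issued-key-here"

# Fail fast if the key was never set before any API call is attempted.
: "${HADOOP_ECO_API_KEY:?Hadoop Eco Open API key is not set}"

# Keys of this kind are typically sent as an HTTP request header;
# the header name below is a placeholder, not the documented one.
AUTH_HEADER="Hadoop-Eco-Api-Key: ${HADOOP_ECO_API_KEY}"
echo "$AUTH_HEADER"
```

Rotating the key via [Reissue] then only requires updating the stored secret, not every script.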
Reissue API key
You can reissue an API key in the Hadoop Eco service.
When an Open API key is reissued, the existing API key will no longer function, and API calls must be updated with the newly issued key.
- In the Cluster menu, click the [More] icon for the target cluster > Open API authentication.
- In the Open API authentication popup, click [Reissue].
- In the Reissue Open API key popup, click [Reissue].
Delete API key
You can delete an API key. Deleting an API key will also remove the Action logs tab from the cluster details page.
- In the Cluster menu, click the [More] icon for the target cluster > Open API authentication.
- In the Open API authentication popup, click [Delete].
- In the Delete Open API key popup, click [Delete].
Adjust worker node count
You can adjust the number of worker nodes.
You can only adjust the node count if the cluster status is Running.
- In the Cluster menu, click the [More] icon for the target cluster > Adjust worker node count.
- In the Adjust worker node count popup, review the existing node instance type and count, enter a new worker node count different from the current one, and click [Save].
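The constraints above (the cluster must be Running, and the new count must differ from the current one) can be sketched as a small validation helper. The function name and the assumed minimum of one worker node are illustrative, not documented limits:

```python
def validate_worker_count(current: int, requested: int,
                          cluster_status: str, min_workers: int = 1) -> None:
    """Raise ValueError if a worker-count adjustment request is invalid."""
    if cluster_status != "Running":
        raise ValueError("node count can only be adjusted while the cluster is Running")
    if requested == current:
        raise ValueError("enter a value different from the current worker node count")
    if requested < min_workers:
        raise ValueError(f"at least {min_workers} worker node is required")

# Scaling from 3 to 5 workers on a Running cluster passes validation.
validate_worker_count(current=3, requested=5, cluster_status="Running")
```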