This section describes the metrics provided by KakaoCloud's Monitoring service.
These are the primary system resource metrics commonly collected from Virtual Machine, GPU, and Bare Metal Server instances, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer, Metric Export
- Alert Center: Metric-based Alert Policy Configuration
- CPU & Memory
- Disk I/O & Capacity
- Network
- GPU Exclusive Metrics
| Metric Name | Description | Unit |
|---|---|---|
| cpu_usage | Total CPU utilization | % |
| cpu_usage_user | CPU utilization (user process) | % |
| cpu_usage_system | CPU utilization (system kernel) | % |
| cpu_usage_iowait | CPU utilization (I/O wait) | % |
| cpu_usage_per_core | CPU utilization per core | % |
| mem_usage | Total memory utilization | % |
| mem_used | Amount of memory in use | bytes(IEC) |
| mem_buffered | Memory usage (buffer) | bytes(IEC) |
| mem_cached | Memory usage (cache) | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| disk_used_percent | Disk utilization | % |
| disk_used | Disk usage | bytes(IEC) |
| disk_read_bytes_persec | Bytes read from disk per second | bytes/s(IEC) |
| disk_write_bytes_persec | Bytes written to disk per second | bytes/s(IEC) |
| disk_read_iops | Disk read operations completed per second (IOPS) | count/s |
| disk_write_iops | Disk write operations completed per second (IOPS) | count/s |
| disk_inodes_usage | Disk inode utilization | % |
| disk_free | Available disk capacity | bytes(IEC) |
| disk_total | Total disk capacity | bytes(IEC) |
| disk_inodes_free | Number of available inodes | count |
| disk_inodes_total | Total number of reserved inodes | count |
| disk_inodes_used | Number of inodes in use | count |
| Metric Name | Description | Unit |
|---|---|---|
| network_rx_bytes_persec | Bytes received per second on the network interface | bytes/s(IEC) |
| network_tx_bytes_persec | Bytes sent per second on the network interface | bytes/s(IEC) |
| network_rx_packets_persec | Packets received per second on the network interface | packets/s |
| network_tx_packets_persec | Packets sent per second on the network interface | packets/s |
| Metric Name | Description | Unit |
|---|---|---|
| nvidia_smi_utilization_gpu | GPU core utilization | % |
| nvidia_smi_memory_used | Memory in use per GPU | MiB(IEC) |
| nvidia_smi_memory_free | Available memory per GPU | MiB(IEC) |
| nvidia_smi_memory_total | Total memory per GPU | MiB(IEC) |
| nvidia_smi_power_draw | Power consumption per GPU | watt |
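Several units in the tables above carry an (IEC) suffix. Assuming this denotes IEC binary prefixes (1 KiB = 1,024 bytes), a raw byte value such as mem_used could be rendered for display roughly as sketched below. The helper name `format_iec` is hypothetical and is not part of any KakaoCloud API.

```python
def format_iec(num_bytes: float) -> str:
    """Render a byte count using IEC binary prefixes (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


# Hypothetical mem_used sample of 8 * 2**30 bytes.
print(format_iec(8589934592))  # 8.00 GiB
```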
The mem_buffered, mem_cached, and disk_inodes_usage metrics are collected and provided only on Linux.
The nvidia_smi metrics are collected only from servers equipped with a GPU.
- GPU Instance Library Compatibility: If you update the NVIDIA library on a GPU instance, you must check compatibility with the CUDA version. If they are incompatible, the monitoring agent may fail to collect NVIDIA metrics.
- Network Alert Policy: When you configure an Alert Center policy using the network_rx_bytes_persec metric, the policy applies to all network interfaces. For multi-NIC instances, an alert is triggered if any connected interface exceeds the configured threshold.
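The multi-NIC behavior described above can be sketched as follows. The function name, the interface names, and the dictionary shape are hypothetical; the sketch only illustrates the "any interface exceeds the threshold" rule, not how Alert Center evaluates policies internally.

```python
from typing import Mapping


def alert_fires(rx_bytes_persec_by_nic: Mapping[str, float], threshold: float) -> bool:
    """Return True if any interface's receive rate exceeds the threshold."""
    return any(rate > threshold for rate in rx_bytes_persec_by_nic.values())


# Hypothetical sample: two interfaces and a threshold of 100 MiB/s.
samples = {"eth0": 20 * 1024**2, "eth1": 150 * 1024**2}
print(alert_fires(samples, 100 * 1024**2))  # True, because eth1 is above the threshold
```

Under this rule, a quiet primary interface does not suppress an alert caused by a busy secondary interface, so thresholds should be chosen with the busiest expected interface in mind.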
Libvirt Metrics
These are the main resource metrics for virtualization-based servers collected in the Libvirt environment, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- Alert Center: Metric-based Alert Policy Configuration
- CPU & Memory
- Disk I/O & Capacity
- Network
| Metric Name | Description | Unit |
|---|---|---|
| libvirt_domain_info_cpu_time_seconds_total | Total CPU time used | count |
| libvirt_domain_info_virtual_cpus | Number of CPU cores | count |
| Metric Name | Description | Unit |
|---|---|---|
| libvirt_domain_block_stats_read_bytes_total | Bytes read from disk | bytes(IEC) |
| libvirt_domain_block_stats_write_bytes_total | Bytes written to disk | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| libvirt_domain_interface_stats_receive_bytes_total | Bytes received on the network interface | bytes(IEC) |
| libvirt_domain_interface_stats_transmit_bytes_total | Bytes sent on the network interface | bytes(IEC) |
| libvirt_domain_interface_stats_receive_packets_total | Packets received on the network interface | packets |
| libvirt_domain_interface_stats_transmit_packets_total | Packets sent on the network interface | packets |
| libvirt_domain_interface_stats_receive_drops_total | Number of packets not received on the network interface | packets |
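The Libvirt metrics whose names end in _total appear to follow the common convention of cumulative counters, so a per-second rate or a utilization figure has to be derived from two samples. Below is a minimal sketch under that assumption. The function names and the sampling interval are hypothetical, and libvirt_domain_info_cpu_time_seconds_total is assumed to be expressed in seconds, as its name suggests.

```python
from typing import Optional


def counter_rate(prev_value: float, curr_value: float, interval_seconds: float) -> Optional[float]:
    """Per-second rate between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards (for example after a restart); skip this interval.
        return None
    return delta / interval_seconds


def cpu_utilization_percent(prev_cpu_seconds: float, curr_cpu_seconds: float,
                            interval_seconds: float, virtual_cpus: int) -> Optional[float]:
    """Average CPU utilization across all vCPUs over the sampling interval."""
    rate = counter_rate(prev_cpu_seconds, curr_cpu_seconds, interval_seconds)
    if rate is None:
        return None
    # rate is CPU-seconds consumed per wall-clock second; divide by the vCPU count.
    return rate / virtual_cpus * 100.0


# Hypothetical samples taken 60 seconds apart on a 2-vCPU domain.
print(cpu_utilization_percent(100.0, 160.0, 60.0, 2))  # 50.0
```

The same `counter_rate` helper applies to the byte and packet counters, for example to turn libvirt_domain_interface_stats_receive_bytes_total into bytes per second.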
Burstable Instance Exclusive Metrics
The following metrics are collected only from t1i family instances with the Burstable option applied (excluding the t1i.medium.dns.default type).
| Metric Name | Description | Unit |
|---|---|---|
| cpu_credit_usage | Accumulated CPU credit usage; the amount of credit consumed when CPU usage exceeds baseline performance | count |
| cpu_credit_balance | Remaining CPU credit balance for the instance, accrued when operating below baseline performance | count |
Kubernetes Engine Metrics
These are the primary cluster, node, and pod resource metrics collected in the Kubernetes Engine environment, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- Cluster
- Node
- Pod & Container
| Metric Name | Description | Unit |
|---|---|---|
| cluster_autoscaler_node_group_min_count | Minimum number of nodes during node group autoscaling | count |
| cluster_autoscaler_node_group_max_count | Maximum number of nodes during node group autoscaling | count |
| cluster_autoscaler_node_group_target_count | Target number of nodes during node group autoscaling | count |
| node_count | Current number of nodes | count |
| Metric Name | Description | Unit |
|---|---|---|
| kube_node_status_allocatable | Amount of resources allocatable to pods on the node | none |
| kube_node_status_capacity | Total resource capacity of the node | none |
| node_cpu_seconds_total | Time used by node CPU in each mode | count |
| node_filesystem_avail_bytes | Available filesystem capacity in the non-root user area of the node | bytes(IEC) |
| node_filesystem_size_bytes | Total size of the node filesystem | bytes(IEC) |
| node_memory_Active_bytes | Memory currently in use but reusable (Active) | bytes(IEC) |
| node_memory_Buffers_bytes | Memory used by the kernel for disk I/O buffers | bytes(IEC) |
| node_memory_Cached_bytes | Memory used for filesystem cache | bytes(IEC) |
| node_memory_MemAvailable_bytes | Memory immediately available for new processes | bytes(IEC) |
| node_memory_MemFree_bytes | Memory currently unallocated | bytes(IEC) |
| node_memory_MemTotal_bytes | Total memory capacity of the node | bytes(IEC) |
| node_memory_SReclaimable_bytes | Reclaimable memory within the Slab cache | bytes(IEC) |
| node_network_receive_bytes_total | Total bytes received by the network device | bytes(IEC) |
| node_network_transmit_bytes_total | Total bytes sent by the network device | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| kube_pod_container_info | Basic information about the container within the pod | none |
| kube_pod_container_resource_limits | Resource limit (Limit) set for the container | none |
| kube_pod_container_resource_requests | Requested resource value (Request) of the container | none |
| kube_pod_container_status_running | Whether the container status is Running | count |
| kube_pod_container_status_terminated | Whether the container status is Terminated | count |
| kube_pod_info | Information about the pod | none |
| container_cpu_usage_seconds_total | Total CPU time consumed by the container | count |
| container_memory_working_set_bytes | Memory used by the container that cannot be reclaimed by the OS | bytes(IEC) |
| container_network_receive_bytes_total | Total network bytes received by the container | bytes(IEC) |
| container_network_transmit_bytes_total | Total network bytes sent by the container | bytes(IEC) |
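A common way to derive node memory utilization from the node metrics above is to compare node_memory_MemAvailable_bytes with node_memory_MemTotal_bytes. The sketch below assumes both values were sampled at the same time; the function name is hypothetical.

```python
def node_memory_utilization_percent(mem_total_bytes: float, mem_available_bytes: float) -> float:
    """Share of node memory that is not immediately available to new processes."""
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    used_bytes = mem_total_bytes - mem_available_bytes
    return 100.0 * used_bytes / mem_total_bytes


# Hypothetical node with 16 GiB total and 4 GiB available.
print(node_memory_utilization_percent(16 * 2**30, 4 * 2**30))  # 75.0
```

Using MemAvailable rather than MemFree avoids counting reclaimable cache as used memory, which usually gives a more realistic picture of memory pressure.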
Load Balancing Metrics
These are the main metrics for monitoring the traffic and connection status of Load Balancer resources, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer, Metric Export
- Alert Center: Metric-based Alert Policy Configuration
| Metric Name | Description | Unit |
|---|---|---|
| lb_bytes_in_persec | Inbound traffic per second (received bytes) | bytes/s(IEC) |
| lb_bytes_out_persec | Outbound traffic per second (sent bytes) | bytes/s(IEC) |
| lb_connections_persec | Number of connections created per second | count/s |
| lb_current_connections | Number of currently maintained connections | count |
| lb_healthy_host_count | Number of healthy (connectable) targets | count |
| lb_unhealthy_host_count | Number of unhealthy (unconnectable) targets | count |
MySQL Metrics
These are the main metrics for monitoring the storage, network, query, and connection status of MySQL instances, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer, Metric Export
- Alert Center: Metric-based Alert Policy Configuration
- Memory
- Disk I/O & Capacity
- Network
- Query & Transaction
- Connection & Session Status
- InnoDB & Binary Log
| Metric Name | Description | Unit |
|---|---|---|
| mem_swap_total | Total swap memory | bytes(IEC) |
| mem_swap_cached | Cached swap memory | bytes(IEC) |
| mem_swap_free | Available swap memory | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| mysql_logstorage_disk_write_bytes_persec | Bytes written per second on the log storage disk | bytes/s(IEC) |
| mysql_defaultstorage_disk_write_bytes_persec | Bytes written per second on the default storage disk | bytes/s(IEC) |
| mysql_logstorage_disk_read_bytes_persec | Bytes read per second on the log storage disk | bytes/s(IEC) |
| mysql_defaultstorage_disk_read_bytes_persec | Bytes read per second on the default storage disk | bytes/s(IEC) |
| mysql_logstorage_disk_write_iops | Write operations completed per second on the log storage disk | count/s |
| mysql_defaultstorage_disk_write_iops | Write operations completed per second on the default storage disk | count/s |
| mysql_logstorage_disk_read_iops | Read operations completed per second on the log storage disk | count/s |
| mysql_defaultstorage_disk_read_iops | Read operations completed per second on the default storage disk | count/s |
| mysql_logstorage_disk_used | Log storage disk usage | bytes(IEC) |
| mysql_defaultstorage_disk_used | Default storage disk usage | bytes(IEC) |
| mysql_logstorage_disk_used_percent | Log storage disk utilization | % |
| mysql_defaultstorage_disk_used_percent | Default storage disk utilization | % |
| mysql_logstorage_disk_inodes_usage | Log storage inode utilization | % |
| mysql_defaultstorage_disk_inodes_usage | Default storage inode utilization | % |
| mysql_defaultstorage_disk_free | Available capacity on the default storage disk | bytes(IEC) |
| mysql_defaultstorage_disk_total | Total capacity of the default storage disk | bytes(IEC) |
| mysql_logstorage_disk_free | Available capacity on the log storage disk | bytes(IEC) |
| mysql_logstorage_disk_total | Total capacity of the log storage disk | bytes(IEC) |
| mysql_defaultstorage_disk_inodes_free | Number of available inodes on the default storage disk | count |
| mysql_defaultstorage_disk_inodes_total | Total number of inodes on the default storage disk | count |
| mysql_defaultstorage_disk_inodes_used | Number of inodes in use on the default storage disk | count |
| mysql_logstorage_disk_inodes_free | Number of available inodes on the log storage disk | count |
| mysql_logstorage_disk_inodes_total | Total number of inodes on the log storage disk | count |
| mysql_logstorage_disk_inodes_used | Number of inodes in use on the log storage disk | count |
| Metric Name | Description | Unit |
|---|---|---|
| mysql_network_rx_bytes_persec | Bytes received per second on the network interface | bytes/s(IEC) |
| mysql_network_tx_bytes_persec | Bytes sent per second on the network interface | bytes/s(IEC) |
| mysql_network_rx_packets_persec | Packets received per second on the network interface | packets/s |
| mysql_network_tx_packets_persec | Packets sent per second on the network interface | packets/s |
| Metric Name | Description | Unit |
|---|---|---|
| mysql_query_persec | Queries per second (QPS) | count/s |
| mysql_com_insert_count | Number of INSERT queries performed over 5 minutes | count |
| mysql_com_select_count | Number of SELECT queries performed over 5 minutes | count |
| mysql_com_update_count | Number of UPDATE queries performed over 5 minutes | count |
| mysql_com_delete_count | Number of DELETE queries performed over 5 minutes | count |
| mysql_com_commit_count | Number of COMMIT queries performed over 5 minutes | count |
| mysql_slow_query_count | Number of slow queries performed over 5 minutes | count |
| Metric Name | Description | Unit |
|---|---|---|
| mysql_connections_count | Number of current connections | count |
| mysql_max_connections_count | Maximum number of allowed connections | count |
| mysql_connection_usage_percent | Ratio of current connections to maximum connections | % |
| mysql_instance_status | Instance status | count |
| mysql_instance_group_status | Instance group status | count |
| mysql_uptime | Instance uptime | duration |
| Metric Name | Description | Unit |
|---|---|---|
| mysql_innodb_buffer_pool_read_requests | Total Buffer Pool read requests | count |
| mysql_innodb_buffer_pool_reads | Read requests that could not be served from the Buffer Pool and were read directly from disk | count |
| mysql_innodb_buffer_cache_hit_ratio | Buffer Pool cache hit ratio | % |
| mysql_innodb_row_lock_current_waits | Number of row locks currently being waited for | count |
| mysql_innodb_row_lock_time | Total time spent acquiring row locks | milliseconds |
| mysql_binary_size_bytes | Binary Log size | bytes(IEC) |
| mysql_binary_files_count | Number of Binary Log files | count |
| mysql_variables_max_binlog_size | Maximum Binary Log size setting | bytes(IEC) |
| mysql_replication_lag | Binlog replication delay time | seconds |
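For reference, the buffer pool hit ratio is conventionally derived from the two InnoDB read counters above. Per the MySQL definition of Innodb_buffer_pool_reads, mysql_innodb_buffer_pool_reads counts logical reads that had to be fetched from disk, while mysql_innodb_buffer_pool_read_requests counts all logical read requests. The sketch below uses that conventional formula. Whether mysql_innodb_buffer_cache_hit_ratio is computed in exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def buffer_pool_hit_ratio_percent(read_requests: int, disk_reads: int) -> Optional[float]:
    """Percentage of logical read requests served without going to disk."""
    if read_requests <= 0:
        # No read activity yet, so the ratio is undefined.
        return None
    return 100.0 * (read_requests - disk_reads) / read_requests


# Hypothetical counters: 1000 logical read requests, 50 of which went to disk.
print(buffer_pool_hit_ratio_percent(1000, 50))  # 95.0
```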
PostgreSQL Metrics
These are the main metrics for monitoring the disk, network, connection, and transaction status of PostgreSQL instances, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer, Metric Export
- Alert Center: Metric-based Alert Policy Configuration
- Disk I/O & Capacity
- Network
- Connection & Session Status
- Replication & Transaction Status
- Performance & Cache
| Metric Name | Description | Unit |
|---|---|---|
| pg_defaultstorage_disk_read_bytes_persec | Bytes read per second on the default storage disk | bytes/s(IEC) |
| pg_defaultstorage_disk_write_bytes_persec | Bytes written per second on the default storage disk | bytes/s(IEC) |
| pg_defaultstorage_disk_read_iops | Read operations completed per second on the default storage disk | count/s |
| pg_defaultstorage_disk_write_iops | Write operations completed per second on the default storage disk | count/s |
| pg_defaultstorage_disk_used | Default storage disk usage | bytes(IEC) |
| pg_defaultstorage_disk_used_percent | Default storage disk utilization | % |
| pg_defaultstorage_disk_inodes_usage | Default storage inode utilization | % |
| pg_defaultstorage_disk_free | Available capacity on the default storage disk | bytes(IEC) |
| pg_defaultstorage_disk_total | Total capacity of the default storage disk | bytes(IEC) |
| pg_defaultstorage_disk_inodes_free | Number of available inodes on the default storage disk | count |
| pg_defaultstorage_disk_inodes_total | Total number of inodes on the default storage disk | count |
| pg_defaultstorage_disk_inodes_used | Number of inodes in use on the default storage disk | count |
| pg_logstorage_disk_read_bytes_persec | Bytes read per second on the log storage disk | bytes/s(IEC) |
| pg_logstorage_disk_write_bytes_persec | Bytes written per second on the log storage disk | bytes/s(IEC) |
| pg_logstorage_disk_read_iops | Read operations completed per second on the log storage disk | count/s |
| pg_logstorage_disk_write_iops | Write operations completed per second on the log storage disk | count/s |
| pg_logstorage_disk_used | Log storage disk usage | bytes(IEC) |
| pg_logstorage_disk_used_percent | Log storage disk utilization | % |
| pg_logstorage_disk_inodes_usage | Log storage inode utilization | % |
| pg_logstorage_disk_free | Available capacity on the log storage disk | bytes(IEC) |
| pg_logstorage_disk_total | Total capacity of the log storage disk | bytes(IEC) |
| pg_logstorage_disk_inodes_free | Number of available inodes on the log storage disk | count |
| pg_logstorage_disk_inodes_total | Total number of inodes on the log storage disk | count |
| pg_logstorage_disk_inodes_used | Number of inodes in use on the log storage disk | count |
| Metric Name | Description | Unit |
|---|---|---|
| pg_network_rx_bytes_persec | Bytes received per second on the network interface | bytes/s(IEC) |
| pg_network_tx_bytes_persec | Bytes sent per second on the network interface | bytes/s(IEC) |
| pg_network_rx_packets_persec | Packets received per second on the network interface | packets/s |
| pg_network_tx_packets_persec | Packets sent per second on the network interface | packets/s |
| Metric Name | Description | Unit |
|---|---|---|
| pg_total_connections | Total number of connections | count |
| pg_active_connections | Number of active connections | count |
| pg_active_transactions | Number of active transactions | count |
| pg_lock_sessions | Number of sessions experiencing locks | count |
| pg_total_deadlocks | Number of deadlocks that have occurred | count |
| Metric Name | Description | Unit |
|---|---|---|
| pg_replication_lag | Replication delay time | seconds |
| pg_temp_file_ratio | Temporary file ratio relative to total transactions | % |
| pg_temp_file_ratio_per_group | Temporary file ratio per instance group | % |
| pg_xid_age | XID Age of a specific process | count |
| pg_xid_age_per_group | Vacuum XID Age per instance group | count |
| Metric Name | Description | Unit |
|---|---|---|
| pg_buffer_hit_ratio | Buffer hit ratio | % |
MemStore Metrics
These are the main metrics for monitoring the memory, network, replication, and CPU usage status of MemStore instances, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer, Metric Export
- Alert Center: Metric-based Alert Policy Configuration
- CPU & Memory
- Disk I/O & Capacity
- Network
- Connection & Replication Status
- Keys & Cache Statistics
| Metric Name | Description | Unit |
|---|---|---|
| memstore_used_cpu_sys | Total system CPU usage | count |
| memstore_used_cpu_sys_main_thread | System CPU usage of the main thread | count |
| memstore_used_cpu_user | Total user CPU usage | count |
| memstore_used_cpu_user_main_thread | User CPU usage of the main thread | count |
| memstore_memory_usage | Total memory utilization | % |
| memstore_used_memory | Size of memory used by MemStore | bytes(IEC) |
| memstore_used_memory_peak | Peak memory usage | bytes(IEC) |
| memstore_used_memory_peak_perc | Peak usage ratio relative to total memory | % |
| memstore_used_memory_dataset | Memory used for actual data storage | bytes(IEC) |
| memstore_used_memory_dataset_perc | Memory ratio used for actual data storage | % |
| memstore_used_memory_overhead | Overhead memory required for internal data structure management | bytes(IEC) |
| memstore_used_memory_lua | Memory used for Lua script execution | bytes(IEC) |
| memstore_allocator_allocated | Memory allocated to the allocator (including internal fragmentation) | bytes(IEC) |
| memstore_allocator_active | Active memory in the allocator (including external fragmentation) | bytes(IEC) |
| memstore_allocator_resident | Resident memory managed by the allocator | bytes(IEC) |
| memstore_allocator_rss_bytes | RSS memory size | bytes(IEC) |
| memstore_allocator_frag_bytes | Difference between active memory and allocated memory | bytes(IEC) |
| memstore_allocator_frag_ratio | Ratio of active memory to allocated memory | % |
| memstore_allocator_rss_ratio | Ratio of resident memory to active memory | % |
| memstore_mem_fragmentation_bytes | Difference between used resident memory and allocated memory | bytes(IEC) |
| memstore_mem_fragmentation_ratio | Ratio of used resident memory to allocated memory | % |
| memstore_rss_overhead_bytes | Difference between process RSS and allocator resident memory | bytes(IEC) |
| memstore_rss_overhead_ratio | Ratio between process RSS and allocator resident memory | % |
| memstore_total_system_memory | Total memory of the system where MemStore is running | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| disk_free | Available disk capacity | bytes(IEC) |
| disk_total | Total disk capacity | bytes(IEC) |
| disk_inodes_free | Number of available inodes | count |
| disk_inodes_used | Number of inodes in use | count |
| disk_inodes_total | Total number of reserved inodes | count |
| Metric Name | Description | Unit |
|---|---|---|
| memstore_instantaneous_input_kbps | Network input rate per second | KiB/s(IEC) |
| memstore_instantaneous_output_kbps | Network output rate per second | KiB/s(IEC) |
| memstore_total_net_input_bytes | Total network input bytes | bytes(IEC) |
| memstore_total_net_output_bytes | Total network output bytes | bytes(IEC) |
| memstore_instantaneous_ops_per_sec | Number of commands processed per second | count |
| memstore_cmdstat_calls_persec | Command calls per second | count/s |
| memstore_total_commands_processed | Total number of commands processed | count |
| memstore_total_reads_processed | Total number of read events processed | count |
| memstore_total_writes_processed | Total number of write events processed | count |
| memstore_io_threaded_reads_processed | Number of read events processed by I/O threads | count |
| memstore_io_threaded_writes_processed | Number of write events processed by I/O threads | count |
| Metric Name | Description | Unit |
|---|---|---|
| memstore_connected_slaves | Number of connected Replicas | count |
| memstore_replication_lag | Replica replication delay time | seconds |
| memstore_clients | Number of currently connected clients | count |
| memstore_maxclients | Maximum connectable clients | count |
| memstore_client_ratio | Ratio of currently connected clients to maximum clients | % |
| memstore_blocked_clients | Number of blocked clients (e.g., BLPOP commands) | count |
| memstore_cluster_connections | Number of sockets used on the cluster bus | count |
| memstore_cluster_enabled | Cluster enabled status | count |
| memstore_pubsub_channels | Number of Pub/Sub channels | count |
| memstore_pubsub_patterns | Number of Pub/Sub patterns | count |
| memstore_uptime | Instance uptime | seconds |
| Metric Name | Description | Unit |
|---|---|---|
| memstore_keyspace_hits | Number of key hits | count |
| memstore_keyspace_misses | Number of key misses | count |
| memstore_keyspace_hitrate_percent | Key hit rate | % |
| memstore_evicted_keys | Number of keys removed due to memory limits | count |
| memstore_expired_keys | Number of expired keys | count |
| memstore_lazyfree_pending_objects | Number of objects pending release by Lazy Free | count |
| memstore_lazyfreed_objects | Number of objects released by Lazy Free | count |
| memstore_lru_clock | Internal time value for the LRU algorithm | count |
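memstore_keyspace_hits and memstore_keyspace_misses look like cumulative counters, so a hit rate for a recent window (rather than since startup) can be derived from two samples. The sketch below assumes cumulative semantics; the function name is hypothetical, and it is not necessarily how memstore_keyspace_hitrate_percent is computed.

```python
from typing import Optional


def window_hit_rate_percent(prev_hits: int, prev_misses: int,
                            curr_hits: int, curr_misses: int) -> Optional[float]:
    """Key hit rate over the interval between two counter samples."""
    hits = curr_hits - prev_hits
    misses = curr_misses - prev_misses
    if hits < 0 or misses < 0:
        # Counters went backwards (for example after a restart); skip this window.
        return None
    lookups = hits + misses
    if lookups == 0:
        # No key lookups in the window, so the rate is undefined.
        return None
    return 100.0 * hits / lookups


# Hypothetical samples: 900 new hits and 100 new misses between the two readings.
print(window_hit_rate_percent(1000, 200, 1900, 300))  # 90.0
```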
Burstable Instance Exclusive Metrics
The following metrics are only collected from t1i family instances with the Burstable option applied.
| Metric Name | Description | Unit |
|---|---|---|
| cpu_credit_usage | CPU credit usage | count |
| cpu_credit_balance | CPU credit balance | count |
Hadoop Eco Metrics
These are the main HBase, HDFS, Yarn, and Kafka related system metrics collected in the Hadoop Eco environment, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- HBase
- HDFS (NameNode)
- Yarn (ResourceManager)
- Kafka
| Metric Name | Description | Unit |
|---|---|---|
| HBase_Master_JvmMetrics_MemHeapMaxM | Maximum JVM heap memory size of HBase Master | MB |
| HBase_Master_JvmMetrics_MemHeapUsedM | JVM heap memory usage of HBase Master | MB |
| HBase_Master_Server_numDeadRegionServers | Number of dead (unhealthy) Region Servers | count |
| HBase_Master_Server_numRegionServers | Number of running (healthy) Region Servers | count |
| Metric Name | Description | Unit |
|---|---|---|
| Hadoop_NameNode_JvmMetrics_MemHeapMaxM | Maximum JVM heap memory size of NameNode | MB |
| Hadoop_NameNode_JvmMetrics_MemHeapUsedM | JVM heap memory usage of NameNode | MB |
| Hadoop_NameNode_JvmMetrics_GcTimeMillis | GC execution time of NameNode JVM | count |
| Hadoop_NameNode_FSNamesystem_CapacityTotal | Total HDFS storage capacity | bytes(IEC) |
| Hadoop_NameNode_FSNamesystem_CapacityUsed | Used HDFS storage capacity | bytes(IEC) |
| Hadoop_NameNode_FSNamesystem_CapacityRemaining | Remaining HDFS storage capacity | bytes(IEC) |
| Hadoop_NameNode_FSNamesystem_CapacityUsedNonDFS | Capacity used for non-DFS purposes (logs, temporary files, etc.) | bytes(IEC) |
| Hadoop_NameNode_FSNamesystem_NumLiveDataNodes | Number of healthy (live) DataNodes | count |
| Hadoop_NameNode_FSNamesystem_NumDeadDataNodes | Number of dead (unhealthy) DataNodes | count |
| Hadoop_NameNode_FSNamesystem_StaleDataNodes | Number of DataNodes with halted status updates | count |
| Hadoop_NameNode_FSNamesystem_CorruptBlocks | Number of corrupted blocks | count |
| Hadoop_NameNode_FSNamesystem_TotalLoad | Number of currently active client connections | count |
| Metric Name | Description | Unit |
|---|---|---|
| Yarn_ResourceManager_JvmMetrics_MemHeapMaxM | Maximum JVM heap memory size of ResourceManager | MB |
| Yarn_ResourceManager_JvmMetrics_MemHeapUsedM | JVM heap memory usage of ResourceManager | MB |
| Yarn_ResourceManager_JvmMetrics_GcTimeMillis | GC execution time of ResourceManager JVM | count |
| Yarn_ResourceManager_ClusterMetrics_NumActiveNMs | Number of healthy (active) NodeManagers | count |
| Yarn_ResourceManager_ClusterMetrics_NumShutdownNMs | Number of shut down NodeManagers | count |
| Yarn_ResourceManager_QueueMetrics_AllocatedMB | Allocated memory size | MB |
| Yarn_ResourceManager_QueueMetrics_AvailableMB | Available memory size | MB |
| Yarn_ResourceManager_QueueMetrics_PendingMB | Pending memory size | MB |
| Yarn_ResourceManager_QueueMetrics_ReservedMB | Reserved memory size | MB |
| Yarn_ResourceManager_QueueMetrics_AllocatedVCores | Number of allocated vCores | count |
| Yarn_ResourceManager_QueueMetrics_AvailableVCores | Number of available vCores | count |
| Yarn_ResourceManager_QueueMetrics_PendingVCores | Number of pending vCores | count |
| Yarn_ResourceManager_QueueMetrics_ReservedVCores | Number of reserved vCores | count |
| Yarn_ResourceManager_QueueMetrics_AppsRunning | Number of running applications | count |
| Yarn_ResourceManager_QueueMetrics_AppsCompleted | Number of completed applications | count |
| Yarn_ResourceManager_QueueMetrics_AppsFailed | Number of failed applications | count |
| Yarn_ResourceManager_QueueMetrics_AppsKilled | Number of killed applications | count |
| Yarn_ResourceManager_QueueMetrics_AppsPending | Number of pending applications | count |
| Yarn_ResourceManager_QueueMetrics_AppsSubmitted | Number of submitted applications | count |
| Metric Name | Description | Unit |
|---|---|---|
| Kafka_Active_Brokers | Number of healthy (active) Brokers | count |
| Kafka_Total_Topics | Total number of Topics in operation | count |
Pub/Sub Metrics
These are the main metrics for monitoring the message publishing, subscription, and storage status of the Pub/Sub service, and they can be utilized in the following service areas:
- Monitoring: Custom Dashboard, Metric Explorer
- Alert Center: Metric-based Alert Policy Configuration
- Publish
- Subscription
- Storage & Export
| Metric Name | Description | Unit |
|---|---|---|
| pubsub_published_message_count_persec | Number of published messages per second | count/s |
| pubsub_published_message_bytes_persec | Size of published messages per second | bytes/s(IEC) |
| pubsub_publish_request_count_persec | Number of publish requests per second | count/s |
| pubsub_topic_storage_used_bytes | Topic retention data size | bytes(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| pubsub_ack_request_count_persec | Number of acknowledgment (ACK) requests per second | count/s |
| pubsub_acked_message_count_persec | Number of acknowledged messages per second | count/s |
| pubsub_unprocessed_messages | Number of unprocessed messages | count |
| pubsub_pulled_message_count_persec | Number of pulled messages per second | count/s |
| pubsub_streaming_pull_response_count_persec | Number of streaming pull responses per second | count/s |
| pubsub_push_count_persec | Number of push requests per second | count/s |
| pubsub_pushed_message_count_persec | Number of pushed messages per second | count/s |
| pubsub_subscription_storage_used_bytes | Subscription retention data size | bytes(IEC) |
| pubsub_seek_request_count_permin | Number of Seek requests per minute | count/m |
| Metric Name | Description | Unit |
|---|---|---|
| pubsub_exported_message_count_persec | Number of messages exported to Object Storage per second | count/s |
| pubsub_object_storage_api_call_count_permin | Number of Object Storage API calls per minute | count/m |
Direct Connect Metrics
These are the main metrics for monitoring the traffic and connection status of Direct Connect virtual interfaces, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- Network Traffic
- Connection Status
| Metric Name | Description | Unit |
|---|---|---|
| dx_virtual_interface_input_bits_persec | Bits received per second on the virtual interface | bits/s(IEC) |
| dx_virtual_interface_output_bits_persec | Bits sent per second on the virtual interface | bits/s(IEC) |
| dx_virtual_interface_input_packets_persec | Packets received per second on the virtual interface | packets/s |
| dx_virtual_interface_output_packets_persec | Packets sent per second on the virtual interface | packets/s |
| Metric Name | Description | Unit |
|---|---|---|
| dx_virtual_intrerface_state | Connection status of the Direct Connect virtual interface | count |
Gateway Load Balancer Metrics
These are the main metrics for monitoring the traffic, connection, and health status of Gateway Load Balancer and Endpoint Service, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- Network Traffic
- Connection Status
| Metric Name | Description | Unit |
|---|---|---|
| gwlb_bytes_in_persec | Total bytes received by the Gateway Load Balancer | bytes/s(IEC) |
| gwlb_bytes_out_persec | Total bytes sent by the Gateway Load Balancer | bytes/s(IEC) |
| eps_bytes_in_persec | Total bytes received by the Endpoint Service | bytes/s(IEC) |
| eps_bytes_out_persec | Total bytes sent by the Endpoint Service | bytes/s(IEC) |
| ep_bytes_in_persec | Total bytes received by the Endpoint | bytes/s(IEC) |
| ep_bytes_out_persec | Total bytes sent by the Endpoint | bytes/s(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| gwlb_current_connections | Number of active connections on the Gateway Load Balancer | count |
| gwlb_healthy_host_count | Number of healthy hosts on the Gateway Load Balancer | count |
| gwlb_unhealthy_host_count | Number of unhealthy hosts on the Gateway Load Balancer | count |
| eps_current_connections | Number of active connections on the Endpoint Service | count |
| eps_endpoint_count | Number of endpoints connected to the Endpoint Service | count |
| ep_current_connections | Number of active connections on the Endpoint | count |
Private Endpoint Metrics
These are the main metrics for monitoring the traffic and connection status of Private Endpoint, and they can be utilized in the following service areas:
- Monitoring: Metric Export
- Network Traffic
- Connection Status
| Metric Name | Description | Unit |
|---|---|---|
| ep_bytes_in_persec | Total bytes received by the Endpoint | bytes/s(IEC) |
| ep_bytes_out_persec | Total bytes sent by the Endpoint | bytes/s(IEC) |
| Metric Name | Description | Unit |
|---|---|---|
| ep_current_connections | Number of active connections on the Endpoint | count |