Monitoring metrics
This document describes the metrics provided by the Monitoring service in KakaoCloud.
These metrics are available in the Monitoring service's Custom Dashboard, Metric Explorer, and Metric Export features, and in metric-based alert policies in Alert Center.
Virtual Machine, GPU, Bare Metal Server Metrics
Monitoring - Custom Dashboard, Metric Explorer, Metric Export, Alert Center - Alert Policies (Metric-based)
The metrics mem_buffered, mem_cached, and disk_inodes_usage are collected and provided only on servers running a Linux OS.
The nvidia_smi metrics are collected only on servers with a GPU.
When updating the NVIDIA library on GPU instances, please ensure compatibility between the library version and CUDA version.
If version compatibility issues arise due to updates via apt upgrade or similar methods, the monitoring agent installed by the user may fail to collect NVIDIA-related metrics.
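After a library update, one way to confirm that GPU readings are still obtainable is to query nvidia-smi directly. The sketch below is a minimal illustration, not the monitoring agent's actual collection logic: it assumes the nvidia-smi CLI is on the PATH, and the helper names `parse_nvidia_smi_csv` and `query_gpus` are hypothetical. The queried fields are chosen to mirror the nvidia_smi metrics listed in the table below.

```python
import shutil
import subprocess
from typing import Dict, List

# nvidia-smi query fields chosen to mirror the nvidia_smi_* metrics (an assumption).
QUERY_FIELDS = ["memory.free", "memory.total", "memory.used", "power.draw", "utilization.gpu"]


def parse_nvidia_smi_csv(output: str) -> List[Dict[str, str]]:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in output.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            raise ValueError(f"unexpected nvidia-smi row: {line!r}")
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus() -> List[Dict[str, str]]:
    """Run nvidia-smi once; raises if the tool is missing or exits with an error."""
    if shutil.which("nvidia-smi") is None:
        raise RuntimeError("nvidia-smi not found on PATH")
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(QUERY_FIELDS)}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit (e.g. after an incompatible update) raises CalledProcessError
    )
    return parse_nvidia_smi_csv(result.stdout)
```

If `query_gpus()` raises after an update, the installed library and CUDA versions are worth checking before investigating the monitoring agent itself.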
If an alert policy is set for the network_rx_bytes_persec metric in Alert Center, it applies to all network interfaces of the instance. That is, if the instance has multiple network interfaces, an alarm is triggered when any one of the connected interfaces exceeds the set threshold.
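The "any interface" behavior described above can be sketched as follows. This is only an illustration of the evaluation semantics, not Alert Center's implementation; the function name `alert_fires` and the interface names are hypothetical.

```python
from typing import Mapping


def alert_fires(rx_bytes_persec: Mapping[str, float], threshold: float) -> bool:
    """Return True if the receive rate of at least one interface exceeds the threshold."""
    return any(rate > threshold for rate in rx_bytes_persec.values())


# Hypothetical per-interface samples of network_rx_bytes_persec.
samples = {"eth0": 1_200_000.0, "eth1": 9_500_000.0}
print(alert_fires(samples, threshold=5_000_000.0))
```

Because a single busy interface is enough to trigger the alarm, a threshold tuned for one interface may be too sensitive on instances with several.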
Metric Name | Description | Unit |
---|---|---|
cpu_usage | Measures the overall CPU usage | % |
cpu_usage_iowait | CPU usage, CPU state: iowait | % |
cpu_usage_system | CPU usage, CPU state: system | % |
cpu_usage_user | CPU usage, CPU state: user | % |
cpu_usage_per_core | Measures the CPU usage per core | % |
mem_buffered | Memory usage, memory state: buffered | bytes(IEC) |
mem_cached | Memory usage, memory state: cached | bytes(IEC) |
mem_used | Memory usage | bytes(IEC) |
mem_usage | Memory usage percentage | % |
disk_used | Disk usage | bytes(IEC) |
disk_used_percent | Disk usage percentage | % |
disk_inodes_usage | Disk inode usage percentage | % |
disk_read_bytes_persec | Bytes read per second from disk | bytes/s(IEC) |
disk_write_bytes_persec | Bytes written per second to disk | bytes/s(IEC) |
disk_read_iops | Disk I/O operations (reads) per second | count/s |
disk_write_iops | Disk I/O operations (writes) per second | count/s |
network_rx_bytes_persec | Bytes received per second by network interface | bytes/s(IEC) |
network_tx_bytes_persec | Bytes transmitted per second by network interface | bytes/s(IEC) |
network_rx_packets_persec | Packets received per second by network interface | packets/s |
network_tx_packets_persec | Packets transmitted per second by network interface | packets/s |
nvidia_smi_memory_free | Free memory per GPU core | MiB(IEC) |
nvidia_smi_memory_total | Total memory per GPU core | MiB(IEC) |
nvidia_smi_memory_used | Used memory per GPU core | MiB(IEC) |
nvidia_smi_power_draw | Power draw per GPU core | watt |
nvidia_smi_utilization_gpu | GPU usage percentage per core | % |
Libvirt metrics
Monitoring - Metric Export
Metric Name | Description | Unit |
---|---|---|
libvirt_domain_info_cpu_time_seconds_total | Total CPU time used | count |
libvirt_domain_info_virtual_cpus | Number of CPU cores | count |
libvirt_domain_block_stats_read_bytes_total | Total bytes read from disk | bytes(IEC) |
libvirt_domain_block_stats_write_bytes_total | Total bytes written to disk | bytes(IEC) |
libvirt_domain_interface_stats_receive_bytes_total | Total bytes received by network interface | bytes(IEC) |
libvirt_domain_interface_stats_transmit_bytes_total | Total bytes transmitted by network interface | bytes(IEC) |
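The `_total` suffix suggests that these Libvirt metrics are cumulative counters, so exported values usually need to be turned into per-second rates before they can be compared with metrics such as disk_read_bytes_persec. The sketch below assumes that interpretation. The function name `counter_rate` is hypothetical, and treating a decreasing value as a counter reset is also an assumption.

```python
from typing import Optional


def counter_rate(prev_value: float, prev_ts: float,
                 curr_value: float, curr_ts: float) -> Optional[float]:
    """Per-second rate between two samples of a cumulative counter.

    Returns None when no meaningful rate can be derived: the timestamps do not
    advance, or the counter went backwards (assumed to be a reset).
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        return None
    delta = curr_value - prev_value
    if delta < 0:
        return None  # assumed counter reset, e.g. after the domain restarts
    return delta / elapsed


# Two hypothetical samples of libvirt_domain_block_stats_read_bytes_total, 60 s apart.
rate = counter_rate(prev_value=1_000.0, prev_ts=0.0, curr_value=4_000.0, curr_ts=60.0)
print(rate)
```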
The metrics cpu_credit_usage and cpu_credit_balance are collected only on t1i servers with the Burstable option enabled (except t1i.medium.dns.default).
Metric Name | Description | Unit |
---|---|---|
cpu_credit_usage | CPU credit usage | count |
cpu_credit_balance | Remaining CPU credits | count |
Kubernetes Engine Metrics
Monitoring - Metric Export
Metric Name | Description | Unit |
---|---|---|
cluster_autoscaler_node_group_min_count | Minimum node count during node group autoscaling | count |
cluster_autoscaler_node_group_max_count | Maximum node count during node group autoscaling | count |
cluster_autoscaler_node_group_target_count | Target node count during node group autoscaling | count |
kube_node_status_allocatable | Amount of resources allocatable to pods on the node | none |
kube_node_status_capacity | Total available resources on the node | none |
node_cpu_seconds_total | Time spent in each CPU mode on node | count |
node_filesystem_avail_bytes | Available file system space for non-root users | bytes(IEC) |
node_filesystem_size_bytes | File system size on node | bytes(IEC) |
node_memory_Active_bytes | Active memory that is currently used or reusable | bytes(IEC) |
node_memory_Buffers_bytes | Buffer memory used by the kernel for disk block I/O | bytes(IEC) |
node_memory_Cached_bytes | Memory used for file system cache | bytes(IEC) |
node_memory_MemAvailable_bytes | Estimated available memory that can be used by new processes | bytes(IEC) |
node_memory_MemFree_bytes | Currently unused memory | bytes(IEC) |
node_memory_MemTotal_bytes | Total memory on the node | bytes(IEC) |
node_memory_SReclaimable_bytes | Reclaimable memory in slab cache | bytes(IEC) |
node_network_receive_bytes_total | Total bytes received by node's network devices | bytes(IEC) |
node_network_transmit_bytes_total | Total bytes transmitted by node's network devices | bytes(IEC) |
kube_pod_container_info | Information about containers within a pod | none |
kube_pod_container_resource_limits | Resource limits requested by the container | none |
kube_pod_container_resource_requests | Resource values requested by the container | none |
kube_pod_container_status_running | Whether the container is in Running state | count |
kube_pod_container_status_terminated | Whether the container is in Terminated state | count |
kube_pod_info | Information about the pod | none |
container_cpu_usage_seconds_total | Total CPU time consumed by the container | count |
container_memory_working_set_bytes | Memory used by the container that cannot be reclaimed by the OS | bytes(IEC) |
container_network_receive_bytes_total | Total network bytes received by the container | bytes(IEC) |
container_network_transmit_bytes_total | Total network bytes transmitted by the container | bytes(IEC) |
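Several of the node metrics above are raw byte values meant to be combined. As one common example, node memory utilization can be estimated from node_memory_MemTotal_bytes and node_memory_MemAvailable_bytes. The sketch below shows that derivation. The function name is hypothetical, and this formula is a widely used convention rather than something defined by the Monitoring service.

```python
def node_memory_used_percent(mem_total_bytes: float, mem_available_bytes: float) -> float:
    """Estimate memory utilization as the share of total memory that is not available."""
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    return (1.0 - mem_available_bytes / mem_total_bytes) * 100.0


# Hypothetical node with 8 GiB total and 2 GiB available memory.
GIB = 1024 ** 3
print(node_memory_used_percent(8 * GIB, 2 * GIB))
```

MemAvailable is used instead of MemFree because it also accounts for cache memory that the kernel can reclaim for new processes.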
Load Balancing Metrics
Monitoring - Custom Dashboard, Metric Explorer, Metric Export, Alert Center - Alert Policies (Metric-based)
Metric Name | Description | Unit |
---|---|---|
lb_bytes_in_persec | Inbound traffic | bytes/s(IEC) |
lb_bytes_out_persec | Outbound traffic | bytes/s(IEC) |
lb_connections_persec | Number of connections per second | count/s |
lb_current_connections | Number of currently open connections | count |
MySQL Metrics
Monitoring - Custom Dashboard, Metric Explorer, Metric Export, Alert Center - Alert Policies (Metric-based)
Metric Name | Description | Unit |
---|---|---|
mem_swap_total | Total swap memory | bytes(IEC) |
mem_swap_cached | Cached swap memory | bytes(IEC) |
mem_swap_free | Free swap memory | bytes(IEC) |
mysql_logstorage_disk_write_bytes_persec | Bytes written per second to the log storage disk | bytes/s(IEC) |
mysql_defaultstorage_disk_write_bytes_persec | Bytes written per second to the default storage disk | bytes/s(IEC) |
mysql_logstorage_disk_read_bytes_persec | Bytes read per second from the log storage disk | bytes/s(IEC) |
mysql_defaultstorage_disk_read_bytes_persec | Bytes read per second from the default storage disk | bytes/s(IEC) |
mysql_logstorage_disk_write_iops | Write operations completed per second on log storage disk | count/s |
mysql_defaultstorage_disk_write_iops | Write operations completed per second on default storage disk | count/s |
mysql_logstorage_disk_read_iops | Read operations completed per second on log storage disk | count/s |
mysql_defaultstorage_disk_read_iops | Read operations completed per second on default storage disk | count/s |
mysql_logstorage_disk_used | Log storage disk usage | bytes(IEC) |
mysql_defaultstorage_disk_used | Default storage disk usage | bytes(IEC) |
mysql_defaultstorage_disk_used_percent | Default storage disk usage percentage | % |
mysql_logstorage_disk_used_percent | Log storage disk usage percentage | % |
mysql_logstorage_disk_inodes_usage | Log storage inode usage percentage | % |
mysql_defaultstorage_disk_inodes_usage | Default storage inode usage percentage | % |
mysql_network_rx_bytes_persec | Bytes received per second by network interface | bytes/s(IEC) |
mysql_network_tx_bytes_persec | Bytes transmitted per second by network interface | bytes/s(IEC) |
mysql_network_rx_packets_persec | Packets received per second by network interface | packets/s |
mysql_network_tx_packets_persec | Packets transmitted per second by network interface | packets/s |
mysql_innodb_row_lock_current_waits | Number of current row lock waits | count |
mysql_binary_size_bytes | Size of binary logs | bytes(IEC) |
mysql_binary_files_count | Number of binary log files | count |
mysql_variables_max_binlog_size | Maximum binary log size | bytes(IEC) |
mysql_connections_count | Number of currently open connections | count |
mysql_slow_query_count | Number of slow queries in the last 5 minutes | count |
mysql_com_insert_count | Number of INSERT queries in the last 5 minutes | count |
mysql_com_select_count | Number of SELECT queries in the last 5 minutes | count |
mysql_com_delete_count | Number of DELETE queries in the last 5 minutes | count |
mysql_com_commit_count | Number of COMMIT queries in the last 5 minutes | count |
mysql_com_update_count | Number of UPDATE queries in the last 5 minutes | count |
mysql_query_persec | Queries per second (QPS) | count/s |
mysql_connection_usage_percent | Percentage of connections used compared to max connections | % |
mysql_innodb_buffer_pool_read_requests | Total buffer pool read requests | count |
mysql_innodb_row_lock_time | Row lock time | milliseconds |
mysql_innodb_buffer_pool_reads | Number of buffer pool reads | count |
mysql_innodb_buffer_cache_hit_ratio | MySQL InnoDB buffer pool cache hit ratio | % |
mysql_uptime | Uptime duration | duration |
mysql_instance_status | Instance status | count |
mysql_instance_group_status | Instance group status | count |
mysql_replication_lag | Binlog replication lag | seconds |
mysql_max_connections_count | Maximum number of connections | count |
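The buffer pool metrics in this table are related: mysql_innodb_buffer_pool_read_requests counts logical read requests, while mysql_innodb_buffer_pool_reads counts the reads that had to go to disk. A conventional way to derive a hit ratio from the two is sketched below. Whether mysql_innodb_buffer_cache_hit_ratio is computed exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def innodb_buffer_hit_ratio(read_requests: int, disk_reads: int) -> Optional[float]:
    """Percentage of read requests served from the buffer pool without a disk read.

    Returns None when there have been no read requests, since the ratio is undefined.
    """
    if read_requests <= 0:
        return None
    return 100.0 * (read_requests - disk_reads) / read_requests


# Hypothetical counter values: 1000 logical reads, 50 of which went to disk.
print(innodb_buffer_hit_ratio(read_requests=1000, disk_reads=50))
```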
PostgreSQL Metrics
Monitoring - Custom Dashboard, Metric Explorer, Metric Export, Alert Center - Alert Policies (Metric-based)
Metric Name | Description | Unit |
---|---|---|
pg_active_connections | Number of active PostgreSQL connections | count |
pg_active_transactions | Number of active PostgreSQL transactions | count |
pg_buffer_hit_ratio | PostgreSQL buffer hit ratio | % |
pg_defaultstorage_disk_inodes_usage | Default storage inode usage percentage | % |
pg_defaultstorage_disk_read_bytes_persec | Bytes read per second from default storage disk | bytes/s(IEC) |
pg_defaultstorage_disk_read_iops | Read operations per second on default storage disk | count/s |
pg_defaultstorage_disk_used | Default storage disk usage | bytes(IEC) |
pg_defaultstorage_disk_used_percent | Default storage disk usage percentage | % |
pg_defaultstorage_disk_write_bytes_persec | Bytes written per second to default storage disk | bytes/s(IEC) |
pg_defaultstorage_disk_write_iops | Write operations per second on default storage disk | count/s |
pg_lock_sessions | Number of lock sessions in PostgreSQL | count |
pg_logstorage_disk_inodes_usage | Log storage inode usage percentage | % |
pg_logstorage_disk_read_bytes_persec | Bytes read per second from log storage disk | bytes/s(IEC) |
pg_logstorage_disk_read_iops | Read operations per second from log storage disk | count/s |
pg_logstorage_disk_used | Log storage disk usage | bytes(IEC) |
pg_logstorage_disk_used_percent | Log storage disk usage percentage | % |
pg_logstorage_disk_write_bytes_persec | Bytes written per second to log storage disk | bytes/s(IEC) |
pg_logstorage_disk_write_iops | Write operations per second to log storage disk | count/s |
pg_network_rx_bytes_persec | Bytes received per second by network interface | bytes/s(IEC) |
pg_network_rx_packets_persec | Packets received per second by network interface | packets/s |
pg_network_tx_bytes_persec | Bytes transmitted per second by network interface | bytes/s(IEC) |
pg_network_tx_packets_persec | Packets transmitted per second by network interface | packets/s |
pg_replication_lag | PostgreSQL replication lag | seconds |
pg_temp_file_ratio_per_group | PostgreSQL temporary file usage per instance group | % |
pg_total_connections | Total number of PostgreSQL connections | count |
pg_total_deadlocks | Total number of deadlocks in PostgreSQL | count |
pg_xid_age_per_group | PostgreSQL vacuum xid per instance group | count |
MemStore Metrics
Monitoring - Custom Dashboard, Metric Explorer, Metric Export, Alert Center - Alert Policies (Metric-based)
Metric Name | Description | Unit |
---|---|---|
memstore_allocator_rss_bytes | RSS memory size | bytes(IEC) |
memstore_clients | Number of connected clients | count |
memstore_connected_slaves | Number of connected replicas | count |
memstore_evicted_keys | Number of evicted keys due to maxmemory limit | count |
memstore_expired_keys | Number of expired keys | count |
memstore_instantaneous_ops_per_sec | Commands processed per second | count |
memstore_client_ratio | Ratio of current clients to max clients | % |
memstore_memory_usage | Memory usage percentage in MemStore instance | % |
memstore_keyspace_hits | Number of keyspace hits | count |
memstore_keyspace_misses | Number of keyspace misses | count |
memstore_maxclients | Maximum number of clients allowed | count |
memstore_maxmemory | Maximum available memory | bytes(IEC) |
memstore_replication_lag | Replication lag time | s |
memstore_uptime | Uptime duration | s |
memstore_used_memory | Used memory in MemStore | bytes(IEC) |
memstore_cmdstat_calls_persec | Number of command calls per second | count/s |
memstore_keyspace_hitrate_percent | Keyspace hit rate percentage | % |
memstore_lru_clock | Clock value used by the LRU (Least Recently Used) algorithm | count |
memstore_blocked_clients | Number of clients waiting due to BLPOP, BRPOP, etc. | count |
memstore_cluster_connections | Number of sockets used by the cluster bus | count |
memstore_allocator_active | Active memory in allocator, including external fragmentation | bytes(IEC) |
memstore_allocator_allocated | Allocated memory in allocator, including internal fragmentation | bytes(IEC) |
memstore_allocator_resident | Resident memory managed by allocator, including OS-returnable memory | bytes(IEC) |
memstore_allocator_frag_bytes | Difference between active memory and allocated memory in the allocator | bytes(IEC) |
memstore_allocator_frag_ratio | Ratio between active memory and allocated memory in the allocator | % |
memstore_allocator_rss_ratio | Ratio between resident memory and active memory in the allocator | % |
memstore_lazyfree_pending_objects | Number of objects waiting to be freed due to UNLINK calls or ASYNC option | count |
memstore_lazyfreed_objects | Number of objects freed via Lazy Free process | count |
memstore_mem_fragmentation_bytes | Difference between used resident memory and allocated memory | bytes(IEC) |
memstore_mem_fragmentation_ratio | Ratio between used resident memory and allocated memory | % |
memstore_mem_not_counted_for_evict | Memory not counted for eviction, mainly transient replica buffers and AOF buffers | bytes(IEC) |
memstore_rss_overhead_bytes | Difference between MemStore process resident memory and allocator's resident memory | bytes(IEC) |
memstore_rss_overhead_ratio | Ratio between MemStore process resident memory and allocator's resident memory | % |
memstore_total_system_memory | Total system memory where MemStore is running | bytes(IEC) |
memstore_used_memory_dataset | Memory used for actual data storage, considering overhead memory | bytes(IEC) |
memstore_used_memory_dataset_perc | Percentage of memory used for actual data storage | % |
memstore_used_memory_lua | Memory used by Lua engine for script execution | bytes(IEC) |
memstore_used_memory_overhead | Memory needed for managing internal data structures | bytes(IEC) |
memstore_used_memory_peak | Maximum memory used by MemStore | bytes(IEC) |
memstore_used_memory_peak_perc | Percentage of maximum memory used relative to total memory | % |
memstore_used_memory_rss | Resident set size memory assigned by the OS | bytes(IEC) |
memstore_instantaneous_input_kbps | Data read from network per second | KiB/s(IEC) |
memstore_instantaneous_output_kbps | Data written to network per second | KiB/s(IEC) |
memstore_io_threaded_reads_processed | Total number of read events processed by main and I/O threads | count |
memstore_io_threaded_writes_processed | Total number of write events processed by main and I/O threads | count |
memstore_pubsub_channels | Number of pub/sub channels subscribed by clients | count |
memstore_pubsub_patterns | Number of pub/sub patterns subscribed by clients | count |
memstore_total_commands_processed | Total number of commands processed by the server | count |
memstore_total_connections_received | Total number of connections accepted by the server | count |
memstore_total_error_replies | Total number of error replies (sum of rejected and failed commands) | count |
memstore_total_net_input_bytes | Total network input bytes | bytes(IEC) |
memstore_total_net_output_bytes | Total network output bytes | bytes(IEC) |
memstore_total_reads_processed | Total number of read events processed | count |
memstore_total_writes_processed | Total number of write events processed | count |
memstore_used_cpu_sys | System CPU used by all threads (main and background) of the server | count |
memstore_used_cpu_sys_main_thread | System CPU used by the main thread of the server | count |
memstore_used_cpu_user | User CPU used by all threads (main and background) of the server | count |
memstore_used_cpu_user_main_thread | User CPU used by the main thread of the server | count |
memstore_cluster_enabled | Whether the cluster is enabled | count |
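Some of the percentage metrics in this table can be cross-checked against the raw counts they summarize. The sketch below derives a keyspace hit rate from memstore_keyspace_hits and memstore_keyspace_misses, and a client ratio from memstore_clients and memstore_maxclients. The function names are hypothetical, and the formulas are the conventional ones rather than a confirmed description of how the service computes memstore_keyspace_hitrate_percent or memstore_client_ratio.

```python
from typing import Optional


def keyspace_hitrate_percent(hits: int, misses: int) -> Optional[float]:
    """Share of key lookups that found a key; None if there were no lookups."""
    total = hits + misses
    if total <= 0:
        return None
    return 100.0 * hits / total


def client_ratio_percent(clients: int, maxclients: int) -> Optional[float]:
    """Connected clients as a percentage of the allowed maximum; None if no limit is known."""
    if maxclients <= 0:
        return None
    return 100.0 * clients / maxclients


# Hypothetical samples.
print(keyspace_hitrate_percent(hits=900, misses=100))
print(client_ratio_percent(clients=250, maxclients=10_000))
```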
The metrics cpu_credit_usage and cpu_credit_balance are collected only on clusters where the flavor is set to t1i.
Metric Name | Description | Unit |
---|---|---|
cpu_credit_usage | CPU credit usage | count |
cpu_credit_balance | Remaining CPU credits | count |
Pub/Sub Metrics
Monitoring - Custom Dashboard, Metric Explorer, Alert Center - Alert Policies (Metric-based)
Metric Name | Description | Unit |
---|---|---|
pubsub_published_message_count_persec | Messages published per second | count/s |
pubsub_published_message_bytes_persec | Bytes of messages published per second | bytes/s(IEC) |
pubsub_publish_request_count_persec | Publish requests per second | count/s |
pubsub_topic_storage_used_bytes | Topic storage used | bytes(IEC) |
pubsub_seek_request_count_permin | Seek requests per 5 minutes | count |
pubsub_ack_request_count_persec | Acknowledgment requests per second | count/s |
pubsub_acked_message_count_persec | Acknowledged messages per second | count/s |
pubsub_unprocessed_messages | Number of unprocessed messages | count |
pubsub_pulled_message_count_persec | Messages pulled per second | count/s |
pubsub_streaming_pull_response_count_persec | Streaming pull responses per second | count/s |
pubsub_push_count_persec | Push requests per second | count/s |
pubsub_pushed_message_count_persec | Pushed messages per second | count/s |
pubsub_subscription_storage_used_bytes | Subscription storage used | bytes(IEC) |
pubsub_exported_message_count_persec | Messages exported to Object Storage per second | count/s |
pubsub_object_storage_api_call_count_permin | Object Storage API calls per minute | count/m |
Direct Connect Metrics
Monitoring - Metric Export
Metric Name | Description | Unit |
---|---|---|
dx_virtual_intrerface_state | Direct Connect virtual interface state | count |
dx_virtual_interface_output_packets_persec | Packets transmitted per second by Direct Connect virtual interface | packets/s |
dx_virtual_interface_output_bits_persec | Bits transmitted per second by Direct Connect virtual interface | bits/s(IEC) |
dx_virtual_interface_input_packets_persec | Packets received per second by Direct Connect virtual interface | packets/s |
dx_virtual_interface_input_bits_persec | Bits received per second by Direct Connect virtual interface | bits/s(IEC) |