Key Concepts

KakaoCloud’s Monitoring service supports rapid issue detection and response by monitoring the status of computing resources and providing notifications when events occur. Users can monitor key resources in real time from the dashboard in a web environment and configure a systematic monitoring system by setting metric and log policies. The Monitoring service allows for flexible and efficient resource management, minimizing the resources needed for administration.

info

On August 27, 2024, the service name, service type, and metric names for Redis were changed to MemStore in the Monitoring console.
Monitoring, Explorer, and MetricExport functions for Redis will be available only until September 27. For details, refer to the announcement.

Monitoring service system architecture

The Monitoring service is designed for users to configure multiple policies necessary for resource operations and management, enabling the collection of specific data as needed. In case of failure, the monitoring history can be checked through notifications, allowing for quick issue identification.

Figure: Monitoring service architecture

Key Concepts

Dashboard

The Monitoring service dashboard provides real-time monitoring of key resources. The types of dashboards available are as follows:

| Type | Description |
| --- | --- |
| Default dashboard | The default dashboard provided by KakaoCloud lets users view metrics for in-use resources without additional configuration. Users cannot modify the default dashboard and can only view the provided metrics. |
| Custom dashboard | A user-created dashboard to which charts for the desired service metrics can be added for management. See Monitoring Metrics for supported metrics. |

info

The KakaoCloud monitoring agent must be installed to view metrics.
For installation instructions, refer to Install Agent.

Monitoring supported services

| Category | Service details |
| --- | --- |
| Monitoring supported services | Beyond Compute Service (Virtual Machine, Bare Metal Server, GPU); Kubernetes Engine (default); MySQL; MemStore; Load Balancing |

Monitoring metrics

Key Beyond Compute Service (BCS) metrics

info

Metrics such as mem_buffered, mem_cached, and disk_inodes_usage are only collected and available on servers with Linux OS installed.
The nvidia_smi metric is only collected on servers with a GPU installed.

info

When updating NVIDIA libraries on a GPU instance, check that the library version is compatible with the CUDA version.
If an update (for example, through apt upgrade) leaves the versions incompatible, the monitoring agent installed by the user may not collect NVIDIA-related metrics.

| Metric name | Description | Unit |
| --- | --- | --- |
| cpu_usage | Total CPU usage | % |
| cpu_usage_iowait | CPU usage percentage in iowait state | % |
| cpu_usage_system | CPU usage percentage in system state | % |
| cpu_usage_user | CPU usage percentage in user state | % |
| cpu_usage_per_core | CPU usage per core | % |
| mem_buffered | Memory usage in buffered state | bytes(IEC) |
| mem_cached | Memory usage in cached state | bytes(IEC) |
| mem_used | Memory usage | bytes(IEC) |
| mem_usage | Memory usage percentage | % |
| disk_used | Disk usage | bytes(IEC) |
| disk_used_percent | Disk usage percentage | % |
| disk_inodes_usage | Disk inode usage percentage | % |
| disk_read_bytes_persec | Bytes read per second from disk | bytes/s(IEC) |
| disk_write_bytes_persec | Bytes written per second to disk | bytes/s(IEC) |
| disk_read_iops | Completed read operations per second on disk | count/s |
| disk_write_iops | Completed write operations per second on disk | count/s |
| network_rx_bytes_persec | Bytes received per second on network interface | bytes/s(IEC) |
| network_tx_bytes_persec | Bytes sent per second on network interface | bytes/s(IEC) |
| network_rx_packets_persec | Packets received per second on network interface | packets/s |
| network_tx_packets_persec | Packets sent per second on network interface | packets/s |
| nvidia_smi_memory_free | Free memory per GPU core | MiB(IEC) |
| nvidia_smi_memory_total | Total memory per GPU core | MiB(IEC) |
| nvidia_smi_memory_used | Used memory per GPU core | MiB(IEC) |
| nvidia_smi_power_draw | Power consumption per GPU core | watt |
| nvidia_smi_utilization_gpu | GPU core utilization rate | % |
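
Several of the metrics above are reported in IEC (binary) units, where 1 KiB is 1024 bytes. When reading raw values such as mem_used or disk_used outside the console, a small helper can make them easier to interpret. The following is a minimal sketch; the function name `format_iec` and the sample value are hypothetical and not part of the Monitoring service.

```python
def format_iec(num_bytes: float) -> str:
    """Format a byte count using IEC binary prefixes (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024.0 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024.0
    return f"{value:.2f} {units[-1]}"  # not reached; keeps the return type explicit


if __name__ == "__main__":
    # Hypothetical raw value for a metric such as mem_used.
    print(format_iec(3 * 1024**3))
```

Because the nvidia_smi memory metrics are already expressed in MiB, they would need to be multiplied by 1024 * 1024 before being passed to a byte-based helper like this one.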

Key MemStore metrics

| Metric name | Description | Unit |
| --- | --- | --- |
| memstore_allocator_rss_bytes | RSS memory size | bytes(IEC) |
| memstore_clients | Number of client connections | count |
| memstore_connected_slaves | Number of connected replicas | count |
| memstore_evicted_keys | Number of keys removed due to maxmemory limit | count |
| memstore_expired_keys | Number of expired keys | count |
| memstore_instantaneous_ops_per_sec | Commands processed per second | count |
| memstore_client_ratio | Ratio of current clients to max clients | % |
| memstore_memory_usage | Memory usage by MemStore instance | % |
| memstore_keyspace_hits | Number of key hits | count |
| memstore_keyspace_misses | Number of key misses | count |
| memstore_maxclients | Maximum number of connections allowed | count |
| memstore_maxmemory | Maximum memory available | bytes(IEC) |
| memstore_replication_lag | Replication lag time | s |
| memstore_uptime | Uptime | s |
| memstore_used_memory | Memory used by MemStore | bytes(IEC) |
| memstore_cmdstat_calls_persec | Command calls per second | count/s |
| memstore_keyspace_hitrate_percent | Key hit rate | % |
| memstore_lru_clock | LRU (Least Recently Used) clock for tracking elapsed time | count |
| memstore_blocked_clients | Number of clients waiting on BLPOP, BRPOP, BRPOPLPUSH, BLMOVE, BZPOPMIN, BZPOPMAX commands | count |
| memstore_cluster_connections | Estimated number of sockets used by cluster bus | count |
| memstore_allocator_active | Active memory managed by allocator, including external fragmentation | bytes(IEC) |
| memstore_allocator_allocated | Allocated memory in allocator, including internal fragmentation | bytes(IEC) |
| memstore_allocator_resident | Resident memory managed by allocator, including reclaimable memory | bytes(IEC) |
| memstore_allocator_frag_bytes | Difference between active and allocated memory in allocator | bytes(IEC) |
| memstore_allocator_frag_ratio | Ratio of active to allocated memory in allocator | % |
| memstore_allocator_rss_ratio | Ratio of resident to active memory in allocator | % |
| memstore_lazyfree_pending_objects | Number of objects waiting to be freed by UNLINK or ASYNC options | count |
| memstore_lazyfreed_objects | Number of objects freed by lazy free process | count |
| memstore_mem_fragmentation_bytes | Difference between resident and allocated memory in MemStore | bytes(IEC) |
| memstore_mem_fragmentation_ratio | Ratio of resident to allocated memory in MemStore | % |
| memstore_mem_not_counted_for_evict | Memory excluded from eviction calculations | bytes(IEC) |
| memstore_rss_overhead_bytes | Difference between resident memory of MemStore and allocator | bytes(IEC) |
| memstore_rss_overhead_ratio | Ratio of resident memory of MemStore to allocator | % |
| memstore_total_system_memory | System memory available for MemStore | bytes(IEC) |
| memstore_used_memory_dataset | Memory used for actual data storage, excluding overhead | bytes(IEC) |
| memstore_used_memory_dataset_perc | Percentage of memory used for data storage, excluding overhead | % |
| memstore_used_memory_lua | Memory used by Lua engine for script execution | bytes(IEC) |
| memstore_used_memory_overhead | All overhead memory required to manage internal data structures | bytes(IEC) |
| memstore_used_memory_peak | Peak memory used by MemStore | bytes(IEC) |
| memstore_used_memory_peak_perc | Peak memory usage percentage | % |
| memstore_used_memory_rss | Resident memory allocated by OS | bytes(IEC) |
| memstore_instantaneous_input_kbps | Network input rate in KiB/s | KiB/s(IEC) |
| memstore_instantaneous_output_kbps | Network output rate in KiB/s | KiB/s(IEC) |
| memstore_io_threaded_reads_processed | Total read events processed by main and I/O threads | count |
| memstore_io_threaded_writes_processed | Total write events processed by main and I/O threads | count |
| memstore_pubsub_channels | Number of pub/sub channels with subscriptions | count |
| memstore_pubsub_patterns | Number of pub/sub patterns with subscriptions | count |
| memstore_total_commands_processed | Total number of commands processed by server | count |
| memstore_total_connections_received | Total number of connections accepted by server | count |
| memstore_total_error_replies | Total number of error responses | count |
| memstore_total_net_input_bytes | Total network input bytes | bytes(IEC) |
| memstore_total_net_output_bytes | Total network output bytes | bytes(IEC) |
| memstore_total_reads_processed | Total number of read events processed | count |
| memstore_total_writes_processed | Total number of write events processed | count |
| memstore_used_cpu_sys | System CPU used by all threads in server process | count |
| memstore_used_cpu_sys_main_thread | System CPU used by main thread | count |
| memstore_used_cpu_user | User CPU used by all threads in server process | count |
| memstore_used_cpu_user_main_thread | User CPU used by main thread | count |
| memstore_cluster_enabled | Cluster enabled status | count |
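
Some of the percentage metrics above can be related to the raw counters in the same table. The sketch below shows how a key hit rate and a client ratio are conventionally derived from hits, misses, clients, and max clients. This page does not state the exact formulas the Monitoring service uses, so treat these as assumptions; the function names are hypothetical.

```python
def keyspace_hit_rate_percent(hits: int, misses: int) -> float:
    """Hit rate as hits / (hits + misses), in percent (assumed formula)."""
    total = hits + misses
    if total == 0:
        # No lookups yet; report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * hits / total


def client_ratio_percent(clients: int, max_clients: int) -> float:
    """Current clients as a percentage of the allowed maximum (assumed formula)."""
    if max_clients <= 0:
        raise ValueError("max_clients must be positive")
    return 100.0 * clients / max_clients


if __name__ == "__main__":
    # Hypothetical sample values for memstore_keyspace_hits / memstore_keyspace_misses.
    print(keyspace_hit_rate_percent(hits=90, misses=10))
    # Hypothetical sample values for memstore_clients / memstore_maxclients.
    print(client_ratio_percent(clients=50, max_clients=10000))
```

A falling hit rate combined with a rising memstore_evicted_keys count often suggests that the instance is short on memory, which makes these two values worth charting together.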

Key MySQL metrics

| Metric name | Description | Unit |
| --- | --- | --- |
| mem_swap_total | Total swap memory | bytes(IEC) |
| mem_swap_cached | Cached swap memory | bytes(IEC) |
| mem_swap_free | Free swap memory | bytes(IEC) |
| mysql_logstorage_disk_write_bytes_persec | Bytes written per second to log storage disk | bytes/s(IEC) |
| mysql_defaultstorage_disk_write_bytes_persec | Bytes written per second to default storage disk | bytes/s(IEC) |
| mysql_logstorage_disk_read_bytes_persec | Bytes read per second from log storage disk | bytes/s(IEC) |
| mysql_defaultstorage_disk_read_bytes_persec | Bytes read per second from default storage disk | bytes/s(IEC) |
| mysql_logstorage_disk_write_iops | Write operations per second on log storage disk | count/s |
| mysql_defaultstorage_disk_write_iops | Write operations per second on default storage disk | count/s |
| mysql_logstorage_disk_read_iops | Read operations per second on log storage disk | count/s |
| mysql_defaultstorage_disk_read_iops | Read operations per second on default storage disk | count/s |
| mysql_logstorage_disk_used | Log storage disk usage | bytes(IEC) |
| mysql_defaultstorage_disk_used | Default storage disk usage | bytes(IEC) |
| mysql_defaultstorage_disk_used_percent | Default storage disk usage percentage | % |
| mysql_logstorage_disk_used_percent | Log storage disk usage percentage | % |
| mysql_logstorage_disk_inodes_usage | Log storage inode usage percentage | % |
| mysql_defaultstorage_disk_inodes_usage | Default storage inode usage percentage | % |
| mysql_network_rx_bytes_persec | Bytes received per second on network interface | bytes/s(IEC) |
| mysql_network_tx_bytes_persec | Bytes sent per second on network interface | bytes/s(IEC) |
| mysql_innodb_row_lock_current_waits | Number of row locks currently being waited for | count |
| mysql_binary_size_bytes | Binary log size | bytes(IEC) |
| mysql_binary_files_count | Number of binary log files | count |
| mysql_variables_max_binlog_size | Maximum binary log size | bytes(IEC) |
| mysql_connections_count | Number of connections | count |
| mysql_slow_query_count | Number of slow queries in last 5 minutes | count |
| mysql_com_insert_count | Number of INSERT queries in last 5 minutes | count |
| mysql_com_select_count | Number of SELECT queries in last 5 minutes | count |
| mysql_com_delete_count | Number of DELETE queries in last 5 minutes | count |
| mysql_com_commit_count | Number of COMMIT queries in last 5 minutes | count |
| mysql_com_update_count | Number of UPDATE queries in last 5 minutes | count |
| mysql_query_persec | Queries per second (QPS) | count/s |
| mysql_connection_usage_percent | Ratio of current connections to maximum connections | % |
| mysql_innodb_buffer_pool_read_requests | Total buffer pool read requests (logical reads) | count |
| mysql_innodb_row_lock_time | Row lock time | milliseconds |
| mysql_innodb_buffer_pool_reads | Buffer pool read requests that could not be served from the buffer pool and were read from disk | count |
| mysql_innodb_buffer_cache_hit_ratio | InnoDB buffer pool cache hit ratio | % |
| mysql_uptime | Uptime | duration |
| mysql_instance_status | Instance status | count |
| mysql_instance_group_status | Instance group status | count |
| mysql_replication_lag | Binlog replication lag | seconds |
| mysql_max_connections_count | Maximum number of connections allowed | count |
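
The two ratio metrics in this table can be related to their underlying counters. In MySQL, Innodb_buffer_pool_read_requests counts logical reads and Innodb_buffer_pool_reads counts the reads that had to go to disk, so a hit ratio is conventionally computed as the share of requests served from memory. The sketch below illustrates that convention and a simple connection usage calculation. Whether the Monitoring service computes its metrics exactly this way is an assumption, and the function names are hypothetical.

```python
def buffer_cache_hit_ratio_percent(read_requests: int, disk_reads: int) -> float:
    """Share of buffer pool read requests served without a disk read (assumed formula)."""
    if read_requests == 0:
        # No read requests yet; nothing to measure.
        return 0.0
    return 100.0 * (read_requests - disk_reads) / read_requests


def connection_usage_percent(connections: int, max_connections: int) -> float:
    """Current connections as a percentage of the allowed maximum (assumed formula)."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 100.0 * connections / max_connections


if __name__ == "__main__":
    # Hypothetical values for mysql_innodb_buffer_pool_read_requests / _reads.
    print(buffer_cache_hit_ratio_percent(read_requests=1000, disk_reads=50))
    # Hypothetical values for mysql_connections_count / mysql_max_connections_count.
    print(connection_usage_percent(connections=30, max_connections=150))
```

Both counters are cumulative in MySQL, so a ratio computed over the difference between two samples usually reflects current behavior better than one computed from the totals since startup.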

Key Load Balancing metrics

| Metric name | Description | Unit |
| --- | --- | --- |
| lb_bytes_in_persec | Inbound traffic | bytes/s(IEC) |
| lb_bytes_out_persec | Outbound traffic | bytes/s(IEC) |
| lb_connections_persec | Connections per second | count/s |
| lb_current_connections | Current connections | count |