Skip to main content

One post tagged with "pitr"

View All Tags

Latest service updates for stronger operational reliability - Iceberg, PITR, SMS

· 4 min read
Mia (정혜원)
Technical Contents Manager
update

One of the most important values in cloud operations is stability. System stability is not only about preventing problems. Its reliability is determined by how quickly and flexibly problems can be recovered and resolved when they occur, and how well they can be prevented and prepared for in advance.

Through recent updates across several services, KakaoCloud has further strengthened this important value of Operational Reliability. We focused on improving users' operating experience around safe data recovery, efficient system maintenance, and fast failure notification systems.

In this post, we take a closer look at three notable improvements that can substantially improve operational reliability.


🧊 1. Iceberg format support for data integrity

One notable change in the recent update is that Data Catalog has officially started supporting the Apache Iceberg format. Apache Iceberg, developed by Netflix, is a powerful open-source table format designed for tracking change history in large-scale data (Time Travel) and restoring to specific points in time.

You can now select the Iceberg catalog type in KakaoCloud Data Catalog. With Iceberg added alongside the existing Hive Metastore-based Standard type, version management and point-in-time recovery have become much simpler even in large-scale data environments. Even if data loss or errors occur, you can easily restore to a previous state, and integration with major analytics engines such as Spark and Trino can be used immediately.

Through this update, KakaoCloud Data Catalog fully supports the integrity and resilience of large-scale data at a practical operational level, and is expected to further improve data reliability across analytics environments.

📝 Learn more about Apache Iceberg catalogs

⏪ 2. Stronger recovery reliability with point-in-time recovery (PITR)

Databases are one of the most important elements of cloud operational stability. Improving the reliability of recovery features in these database systems is truly important. In this MySQL update, the long-awaited Point-in-Time Recovery (PITR) feature has been added.

Based on automatic backups and Binary Logs, you can specify a desired point in time and restore a new instance group to the state at that time. Because you can now specify the recovery point down to the second, you can respond very flexibly to data loss caused by mistakes or errors.

💡 Please note! For service stability, point-in-time recovery currently supports a single availability configuration. If high availability (HA) configuration is required, we recommend adding instances after recovery is complete.

In addition, security groups can now be modified while instances are running, improving flexibility in network control. Account management procedures have also been improved so that password policies are applied in the same way when procedures are used. These detailed improvements to security and recovery features are important changes that substantially increase stability in real operating environments.

📝 Learn more about MySQL point-in-time recovery

📩 3. Notification speed is response speed

The outcome of responding to issues depends on how quickly operators recognize the system status. In this update, Maintenance introduced a new SMS notification feature in addition to existing email. When a maintenance task fails or an important event occurs, a notification is immediately sent to the registered mobile phone number. Now, even if you do not check email, you can recognize and respond to problem situations in real time.

💡 Please note! SMS notifications are sent only for events that require quick action, and project administrators must register valid contact information in advance.

📝 Learn more about Maintenance


These three updates were made in different services, but they all point in the same direction. Data can be restored safely without loss, security settings have become more flexible, and failures can be detected faster. This is the operational resilience KakaoCloud aims for. Stability improvements covering the entire operational process, from data to notifications, will continue.

KakaoCloud will continue improving technical completeness so that customers' operating environments become more stable and predictable. We appreciate your continued interest and support.

👉 Start KakaoCloud now