5 posts tagged with "instance"

Troubleshooting instances that cannot be accessed by SSH using OpenAPI

March 9, 2026 · 5 min read

Cloud Engineer

An SSH configuration that breaks while you change the port on a running server, a password forgotten after a long period without access, or a sudden file system error that prevents booting... These are alarming situations that any cloud operator has probably faced at least once.

When the newly configured port does not work and even the existing port 22 is closed, leaving only repeated Connection refused or Connection timeout messages, the instance becomes isolated: alive, but uncontrollable.

When an instance is in the Active state but cannot be reached from the inside, recovery has to happen from the outside. This post introduces two methods for recovering such an instance through OpenAPI while minimizing the risk of data loss, based on the troubleshooting guides in the KakaoCloud technical documentation.

💡 Method 1. Automatic recovery with a user script (user_data)

This method is especially useful for software configuration issues, such as an SSH port configuration error, an unregistered SELinux policy, or a forgotten password. Instead of trying to fix the problem in place inside the affected instance, it follows an immutable-infrastructure approach: replace the instance with a new one built from a script that contains a known-good configuration.

📍 Recovery flow
Create an image of the existing instance → Write a recovery user script → Provision a new instance with the script injected

🩺 Detailed checks and recovery procedure

Step 1. Create an image: Check the existing specifications with Get instance, then create an image of the current root volume state with Create image.
- Tip: We recommend stopping the instance before proceeding so that data still held in memory is written to disk and captured consistently.

Step 2. Write a recovery script: Write a user script (user_data) that restores the port to 22 or configures a new password/key pair. This script runs when the instance first boots, and must be Base64 encoded for the API request.

Step 3. Provision the instance: Call Create instance with the recovery script attached to the image created earlier. As soon as the instance is created, the injected script runs, correcting the blocked port configuration or immediately restoring account access.
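As a rough illustration of Steps 2 and 3, the following Python sketch builds a recovery script, Base64-encodes it, and places it in a request body. The script content is only an example and should be adapted to the operating system of the image. The field names in `build_create_instance_body` (`image_id`, `flavor_id`, `user_data`) are hypothetical placeholders, not the actual Create instance schema, so check the OpenAPI reference for the real names.

```python
import base64


def build_recovery_user_data(ssh_port: int = 22) -> str:
    """Return a shell script that rewrites the sshd Port directive."""
    return "\n".join([
        "#!/bin/bash",
        "# Rewrite every Port line (commented or not) to the desired port.",
        f"sed -i -E 's/^#?Port .*/Port {ssh_port}/' /etc/ssh/sshd_config",
        "# The service is called sshd on some distributions and ssh on others.",
        "systemctl restart sshd || systemctl restart ssh",
        "",
    ])


def encode_user_data(script: str) -> str:
    """Base64-encode the script so it can be sent in an API request."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def build_create_instance_body(image_id: str, flavor_id: str, user_data_b64: str) -> dict:
    """Assemble a request body. All field names here are hypothetical."""
    return {
        "name": "recovered-instance",
        "image_id": image_id,        # image created in Step 1
        "flavor_id": flavor_id,      # same specification as the failed instance
        "user_data": user_data_b64,  # Base64-encoded recovery script
    }


script = build_recovery_user_data(22)
body = build_create_instance_body("IMAGE_ID", "FLAVOR_ID", encode_user_data(script))
```

Keeping the script as plain text until the last moment makes it easy to review before it is encoded and sent.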

The biggest advantage of this method is that settings can be corrected automatically from the outside, even when an operator cannot log in to the instance. Replacing the failed instance with a verified environment instead of repairing it directly can significantly shorten recovery time, which makes the recovery time objective (RTO) easier to meet.

▶︎ Troubleshooting guide for restoring access after changing the SSH port

💡 Method 2. Directly inspect the root volume

File system corruption or network configuration file errors that cannot be resolved with a user script require a more direct approach. This is a kind of rescue-mode strategy: the affected volume is temporarily attached to another instance as a secondary disk so that an engineer can modify its contents directly.

📍 Recovery flow
Create a root volume snapshot → Attach to an inspection instance → Repair data and detach → Recover with a new instance

🩺 Detailed checks and recovery procedure

Step 1. Snapshot and restore the volume: To prevent damage to the original data, create a snapshot of the affected root volume and restore a new volume based on it. This secures a safe working environment.

Step 2. Attach the inspection volume: Designate another normally operating instance as the "rescue" instance, and attach the restored volume to that instance.

Step 3. Mount and repair data: Mount the volume on the inspection instance and directly fix the problem area. Key checks and actions include the following.

Network: Immediately fix typos or configuration errors in files under /etc/netplan or /etc/sysconfig/network-scripts.
File system: After unmounting, check and repair disk errors with commands such as xfs_repair or fsck. There may be various other causes depending on system logs and configuration environments, so detailed diagnosis is required.
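The file system check above can be sketched in Python as follows. This is a minimal illustration, not a complete diagnostic tool: the device name `/dev/vdb1` is an assumed example, and `e2fsck` is used here as the ext2/3/4 checker that `fsck` typically dispatches to. The file system type can be read beforehand with a tool such as `blkid`.

```python
def is_mounted(device: str, proc_mounts_text: str) -> bool:
    """Return True if the device appears in /proc/mounts-style text."""
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def repair_command(fs_type: str, device: str) -> list[str]:
    """Choose a repair command for the file system type of the volume."""
    fs = fs_type.lower()
    if fs == "xfs":
        return ["xfs_repair", device]
    if fs in {"ext2", "ext3", "ext4"}:
        # -f forces a check, -y answers yes to repair prompts
        return ["e2fsck", "-f", "-y", device]
    raise ValueError(f"no repair command known for file system type: {fs_type}")


# Example /proc/mounts content from a hypothetical rescue instance.
sample_mounts = "/dev/vda1 / ext4 rw 0 0\n/dev/vdb1 /mnt/rescue xfs rw 0 0\n"

if is_mounted("/dev/vdb1", sample_mounts):
    next_step = ["umount", "/dev/vdb1"]  # repair tools need an unmounted volume
else:
    next_step = repair_command("xfs", "/dev/vdb1")
```

On a real rescue instance, the text passed to `is_mounted` would come from reading `/proc/mounts`, and the returned command would be run with root privileges.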

Step 4. Create an image and provision: After solving the problem, detach the volume, then create a new image based on that volume. Finally, deploy a normalized new instance using this image to complete recovery.

The core of this method is to use the environment of a normal instance to directly fix the problematic parts, such as the file system and network settings, instead of forcibly recovering the failed instance. After all fixes are complete, the volume is converted back into an image and redeployed as a new instance with the defects resolved.

▶︎ Troubleshooting guide for instance recovery through root volume inspection

📝 Recovery golden rules operators should remember

Effective recovery is less about knowing individual features and more about preparing a recovery framework at the system level. A cloud-based flow that connects image creation, configuration correction, and redeployment gives you a recovery path even when direct access is blocked.

In this process, data protection is the basic premise. Making it a habit to stop the instance and create a snapshot before recovery work minimizes the risk of data loss. After recovery is complete, it is also advisable to clean up temporary snapshots, restored volumes, and the failed instance to avoid unnecessary costs.

Failures occur without warning, but recovery procedures can be prepared in advance. By using KakaoCloud troubleshooting guides together with OpenAPI, you can secure reproducible recovery paths for most access failure situations. Refer to the technical documentation now and review automated recovery scenarios suitable for your infrastructure environment.

👉 Start KakaoCloud now

New BCS p1i and m3az instances released

December 4, 2023 · 2 min read

Mia (정혜원)

Technical Contents Manager

Notice

The following announcement for p1i and m3az instances was written based on information available in December 2023. For the latest information about KakaoCloud Beyond Compute Service, see Specifications by instance type.

Hello, we are sharing updates about the BCS (Beyond Compute Service) instance families.

KakaoCloud continues to introduce BCS instance types that better match users' diverse workload specifications. In this post, we introduce the recently released p1i family, which supports accelerated computing, and the general-purpose m3az family.

1. p1i instances for high-performance computing

p1i is an instance family optimized for high-performance workloads such as machine learning and HPC. It is equipped with Intel Xeon Gold 5120 (Skylake) Scalable processors and supports up to 56 vCPUs and 512 GiB of memory. p1i instances are currently provided as bare metal servers with up to four NVIDIA V100 Tensor Core GPUs.

Selecting a p1i instance in the console

2. m3az instances optimized for single-threaded CPU environments

The m3az instance family is a general-purpose instance family equipped with the latest 4th Gen AMD EPYC 9004 series processors. It provides a single-threaded CPU environment and is optimized for specific workloads such as games and healthcare. m3az instances provide memory and vCPU options in various sizes. They also provide network bandwidth up to 12.5 Gbps.

Selecting an m3az instance in the console

The two new instance families, p1i and m3az, are available in the kr-central-2 region. For more information, see the BCS instance types documentation.

We hope you experience more efficient and powerful cloud computing with KakaoCloud's diverse BCS instance services.
Thank you.

BCS instance selection guide as of September 2023

September 22, 2023 · 3 min read

Romy (이새롬)

Technical Contents Manager

Notice

The following instance overview was written based on information available in September 2023. For the latest instance types and attributes of KakaoCloud Beyond Compute Service, see Specifications by instance type.

Hello, in this post we introduce the instances provided by KakaoCloud Beyond Compute Service (BCS).

An instance is composed of a combination of CPU, memory, storage, and networking capacity, and cloud service providers (CSPs) provide a wide range of instance choices based on users' business requirements, budgets, and constraints.

BCS is KakaoCloud's compute service and provides instance types optimized for a variety of usage environments. KakaoCloud BCS instances are broadly classified into five types.

General-purpose instances are suitable for diverse workloads such as general web servers, databases, and application servers.
Compute-optimized instances are suitable for compute-intensive workloads.
Memory-optimized instances are recommended for workloads that process large amounts of data in memory.
Accelerated computing instances are high-performance computing services that use hardware such as GPUs and NPUs.
Video transcoding instances use FPGA-based hardware for video transcoding workloads.

The following table compares BCS instance attributes at a glance as of September 2023.

Item	`m2a`	`t1i`	`c2a`	`r2a`	`gn1i`	`p2a`	`p2an`	`p2i`	`gf1i`	`vt1a`
Workload type	General purpose	General purpose	Compute optimized	Memory optimized	Accelerated computing	Accelerated computing	Accelerated computing	Accelerated computing	Accelerated computing	Video transcoding
CPU vendor	AMD	Intel	AMD	AMD	Intel	AMD	AMD	Intel	Intel	AMD
CPU name	EPYC 7643	Xeon Gold 5120 (Skylake), 5220 (Cascade Lake)	EPYC 7643	EPYC 7643	Xeon Gold 5220R (Cascade Lake)	EPYC 7513	EPYC 7763	Xeon Gold 6338 (Ice Lake)	Xeon Gold 6430 (Sapphire Rapids)	EPYC 7643
Architecture	x86_64	x86_64	x86_64	x86_64	x86_64	x86_64	x86_64	x86_64	x86_64	x86_64
vCPU	2~96	2~8	2~96	2~96	4~64	128	256	128	18~72	16~128
Memory	8~384GiB	0.5~32GiB	4~192GiB	16~768GiB	16~256GiB	1536GiB	2048GiB	1024GiB	128~512GiB	48~384GiB
Bare metal option	X	X	X	O	X	O	O	O	X	X
Hardware-based encryption support	O	X	O	O	X	O	O	X	X	O
Storage NVMe support	O	X	X	O	X	O	X	O	X	X
Disk interface type	PCIe	PCIe	PCIe	PCIe, NVMe	PCIe	PCIe, NVMe	PCIe	PCIe, NVMe	PCIe	PCIe
Local SSD support (bare metal)	X	X	X	O	X	O	O	O	X	X
Maximum local SSD (bare metal)	—	—	—	1600GiB	—	3200GiB	2080GiB	3200GiB	—	—
Network performance	~25Gbps	~5Gbps	~25Gbps	~25Gbps	~50Gbps	~50Gbps	~100Gbps	~50Gbps	~50Gbps	~50Gbps
HW type	—	—	—	—	GPU	GPU	GPU	GPU	NPU	FPGA
Maximum HW count	—	—	—	—	4	8	8	4	4	8
HW vendor	—	—	—	—	NVIDIA	NVIDIA	NVIDIA	NVIDIA	FuriosaAI	AMD-Xilinx
HW name	—	—	—	—	T4 Tensor Core	A100 Tensor Core	A100 Tensor Core	A100 Tensor Core	Warboy	Alveo U30
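To show how the table can be used for a first-pass selection, here is a small Python sketch that filters instance families by required vCPUs, memory, and accelerator type. The maximum values are copied from the September 2023 table above. The `Family` class and `candidates` function are illustrative names, not part of any KakaoCloud tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Family:
    name: str
    max_vcpu: int
    max_memory_gib: int
    hw_type: Optional[str]  # None means no accelerator hardware


# Upper bounds taken from the comparison table (as of September 2023).
FAMILIES = [
    Family("m2a", 96, 384, None),
    Family("t1i", 8, 32, None),
    Family("c2a", 96, 192, None),
    Family("r2a", 96, 768, None),
    Family("gn1i", 64, 256, "GPU"),
    Family("p2a", 128, 1536, "GPU"),
    Family("p2an", 256, 2048, "GPU"),
    Family("p2i", 128, 1024, "GPU"),
    Family("gf1i", 72, 512, "NPU"),
    Family("vt1a", 128, 384, "FPGA"),
]


def candidates(min_vcpu: int, min_memory_gib: int, hw_type: Optional[str] = None) -> list[str]:
    """Return families whose largest size can satisfy the requirement."""
    return [
        f.name
        for f in FAMILIES
        if f.max_vcpu >= min_vcpu
        and f.max_memory_gib >= min_memory_gib
        and f.hw_type == hw_type
    ]


large_memory = candidates(64, 512)
npu_families = candidates(4, 16, hw_type="NPU")
```

A filter like this only narrows the list. Price, bare metal support, and network performance still need to be compared in the table or the current documentation.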

After identifying the suitable BCS instance type, learn how to create and manage instances.
You can find more usage examples in hands-on tutorials.
For more information about BCS instances, see Instance overview.

We will continue working to provide safer and more convenient cloud services.

Thank you.

BCS general-purpose t1i instances released

September 15, 2023 · 3 min read

Mia (정혜원)

Technical Contents Manager

Notice

The following announcement for t1i instances was written based on information available in September 2023. For the latest instance types and attributes of KakaoCloud Beyond Compute Service, see General-purpose instances.

Hello, KakaoCloud's general-purpose t1i instances are now available in the kr-central-2 region.

t1i instances run on 2nd Gen Intel Xeon Scalable processors (Cascade Lake 5220 or Skylake 5120) with frequencies up to 3.9 GHz, and provide the most cost-effective option among KakaoCloud instance families.

t1i instances provide a balanced mix of compute, memory, and network resources, and are designed for general-purpose workloads that maintain low average CPU usage but experience temporary spikes. They provide up to 8 vCPUs and 32 GiB of memory, and support network bandwidth of up to 5 Gbps.

t1i instances start at KRW 5.5 per hour for t1i.nano. Until the burst feature is provided, the CPU-credit-based utilization limit is relaxed by an additional 15 to 20%.

Detailed specifications and pricing for t1i instances are as follows.

2nd Gen Intel Xeon Scalable processors (Cascade Lake 5220 or Skylake 5120) up to 3.9 GHz
Network bandwidth up to 5 Gbps
Instance sizes supporting up to 8 vCPUs and 32 GiB of memory
Support for Intel instruction sets (AVX, AVX2, AVX-512)
Support for Intel Turbo Boost Technology 2.0
Burstable CPU controlled by CPU credits and consistent baseline performance (planned for future availability)

Instance type	vCPU	Memory (GiB)	Network bandwidth (Gbps)	Hourly price	Monthly price (based on 30 days)
`t1i.nano`	2	0.5	Up to 5	KRW 5.5	KRW 3,960
`t1i.micro`	2	1	Up to 5	KRW 11.1	KRW 7,992
`t1i.small`	2	2	Up to 5	KRW 22.1	KRW 15,912
`t1i.medium`	2	4	Up to 5	KRW 44.2	KRW 31,824
`t1i.large`	2	8	Up to 5	KRW 88.4	KRW 63,648
`t1i.xlarge`	4	16	Up to 5	KRW 176.8	KRW 127,296
`t1i.2xlarge`	8	32	Up to 5	KRW 353.6	KRW 254,592
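The monthly prices in the table follow from the hourly price multiplied by 24 hours and 30 days. The Python sketch below reproduces that arithmetic with `Decimal` to avoid floating-point rounding. The function name `monthly_price` is only an illustrative choice.

```python
from decimal import Decimal

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # the table assumes a 30-day month


def monthly_price(hourly_krw: Decimal) -> Decimal:
    """Monthly price in KRW for an instance that runs the whole month."""
    return hourly_krw * HOURS_PER_DAY * DAYS_PER_MONTH


hourly_prices = {
    "t1i.nano": Decimal("5.5"),
    "t1i.medium": Decimal("44.2"),
    "t1i.2xlarge": Decimal("353.6"),
}

for name, hourly in hourly_prices.items():
    print(f"{name}: KRW {monthly_price(hourly):,.0f} per month")
```

The same calculation applies to any instance billed at a fixed hourly rate, as long as the instance runs for the full 720 hours.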

For more information, see the General-purpose instances page.

We will continue working to provide safer and more convenient cloud services.

Thank you.

BCS compute-optimized c2a instances released

September 12, 2023 · 2 min read

Sandy (차신영)

Technical Contents Manager

Notice

The following announcement for c2a instances was written based on information available in September 2023. For the latest instance types and attributes of KakaoCloud Beyond Compute Service, see Compute-optimized instances.

Hello, we are announcing the release of KakaoCloud's compute-optimized instances.

The newly released compute-optimized instance family from KakaoCloud is the c2a family, equipped with 3rd Gen AMD EPYC 7003 series 7643 processors that run at frequencies up to 3.6 GHz.

c2a instances are designed for compute-intensive workloads that require high-performance compute specifications, with a 1:2 vCPU-to-memory ratio (for example, 2 vCPUs with 4 GiB of memory). They provide up to 96 vCPUs and up to 192 GiB of memory, and support network bandwidth of up to 25 Gbps.

Compared with existing general-purpose instances, c2a instances can be a cost-effective option for compute-intensive workloads. They can be used broadly for latency-sensitive workloads, batch processing workloads, media transcoding, high-performance web servers, high-performance computing (HPC), scientific modeling, dedicated game server and ad server engines, and many other applications that require high-performance compute capabilities.

Detailed specifications and pricing for c2a instances are as follows.

Processor: 3rd Gen AMD EPYC 7003 series 7643
Frequency: Up to 3.6 GHz
vCPU count: Up to 96
Memory: Up to 192 GiB
Network bandwidth: Up to 25 Gbps

Instance type	vCPU	Memory (GiB)	Network bandwidth (Gbps)	Hourly price	Monthly price (based on 30 days)
`c2a.large`	2	4	Up to 10	KRW 82	KRW 59,040
`c2a.xlarge`	4	8	Up to 10	KRW 164	KRW 118,080
`c2a.2xlarge`	8	16	Up to 10	KRW 327	KRW 235,440
`c2a.4xlarge`	16	32	Up to 10	KRW 655	KRW 471,600
`c2a.8xlarge`	32	64	Up to 10	KRW 1,309	KRW 942,480
`c2a.12xlarge`	48	96	12.5	KRW 1,964	KRW 1,414,080
`c2a.16xlarge`	64	128	12.5	KRW 2,618	KRW 1,884,960
`c2a.24xlarge`	96	192	25	KRW 3,928	KRW 2,828,160

For more information, see the Compute-optimized instances page.

We will continue working to provide safer and more convenient cloud services.

Thank you.
