2. Build a Kafka streaming environment with Pub/Sub messages
Receive Pub/Sub messages in Kafka and practice the first step of a streaming-based data pipeline.
- Estimated time: 30 minutes
- Recommended OS: macOS, Ubuntu
- Prerequisites
About this scenario
In this tutorial, you configure a real-time streaming environment that delivers messages generated in KakaoCloud Pub/Sub to Kafka. Kafka is a distributed message queue system suited to large-scale message processing, and it serves as the starting point for the storage and analysis flows built in later steps.
You will cover the following:
- Configure a Pub/Sub message producer
- Connect a Kafka topic and configure a consumer
- Receive messages and check logs
Before you start
This tutorial assumes that a Kafka cluster and topic have already been configured. If the Kafka environment is not ready, complete Message processing through Kafka first.
Step 1. Create a Pub/Sub topic and configure permissions
Create a Pub/Sub topic to integrate with Kafka and configure it to deliver messages to a Kafka topic. This step acts as the entry point that first receives messages in the overall streaming pipeline.
- Go to KakaoCloud console > Data Streaming > Pub/Sub > Topic.
- Click Create Topic, then enter the following.

  | Item | Value |
  | --- | --- |
  | Name | tutorial-pub-topic |
  | Description | Topic for Kafka integration testing |

- On the created topic details page, click the Subscription settings > Kafka integration tab.
- Click Add Kafka subscription, then enter the values below.

  | Item | Value |
  | --- | --- |
  | Kafka cluster | tutorial-amk-cluster |
  | Kafka Topic | tutorial-kafka-topic |

- Save the settings and verify that the status changes to Active.
Step 2. Check the Kafka consumer
Run a Kafka consumer to verify that messages delivered from Pub/Sub arrive in Kafka correctly.
$ kafka-console-consumer.sh \
    --bootstrap-server {KAFKA_BROKER_ENDPOINT} \
    --topic tutorial-kafka-topic \
    --from-beginning
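The command above keeps running until you stop it. If you prefer a check that ends on its own, the consumer can be wrapped in a small shell function. The sketch below is one possible approach and not part of the official procedure: `consume_until_idle` and the `KAFKA_CONSUMER_BIN` variable are hypothetical names introduced here, and the 10-second idle timeout is an arbitrary choice. It relies on the console consumer's `--timeout-ms` option, which makes the consumer exit when no message arrives within the given interval.

```shell
# Hypothetical helper (not part of the tutorial): print every message in a
# topic, then exit once no new message has arrived for 10 seconds.
# KAFKA_CONSUMER_BIN is an assumed override for the consumer script path.
consume_until_idle() {
  broker="$1"   # the value you use for {KAFKA_BROKER_ENDPOINT}
  topic="$2"    # for this tutorial, tutorial-kafka-topic
  "${KAFKA_CONSUMER_BIN:-kafka-console-consumer.sh}" \
    --bootstrap-server "$broker" \
    --topic "$topic" \
    --from-beginning \
    --timeout-ms 10000
}
```

Call it as `consume_until_idle "{KAFKA_BROKER_ENDPOINT}" tutorial-kafka-topic`. Depending on the Kafka version, the consumer may log a timeout error when it stops, which is expected in this usage.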
Step 3. Publish a test message
Publish a test message to the Pub/Sub topic and verify that it is delivered to Kafka correctly.
- Go to KakaoCloud console > Pub/Sub > Topic > tutorial-pub-topic.
- Click Publish message and enter the JSON below.
  {
    "event": "log",
    "timestamp": "2025-04-21T10:00:00Z",
    "message": "Test message from Lab01"
  }
- Verify that the message is received by the Kafka consumer.
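When the topic already holds many messages, scanning the consumer output by eye can be tedious. As a rough sketch, the check can be scripted by filtering the consumer output for the test message text. `has_test_message` is a hypothetical helper name, and the sketch assumes the payload arrives in Kafka as the same JSON text that was published; if the integration wraps or encodes the payload, the match string would need adjusting.

```shell
# Hypothetical helper: exit 0 if the text on stdin contains the body of the
# tutorial's test message, non-zero otherwise.
# -F matches the string literally; -q suppresses output.
has_test_message() {
  grep -q -F 'Test message from Lab01'
}

# Example pipeline (commented out because it needs a reachable broker):
# kafka-console-consumer.sh \
#   --bootstrap-server {KAFKA_BROKER_ENDPOINT} \
#   --topic tutorial-kafka-topic \
#   --from-beginning \
#   --timeout-ms 10000 | has_test_message && echo "test message delivered"
```

Matching only the message text, rather than the full JSON, keeps the check independent of whitespace or key order in the delivered payload.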
Wrap-up and next steps
You have completed the basic structure for streaming Pub/Sub messages to Kafka (cluster: tutorial-amk-cluster, topic: tutorial-kafka-topic).
This structure can serve as the foundation for the full data pipeline, including data storage (Object Storage), metadata management (Data Catalog), and analysis (Data Query).
For the next tutorial, see Load Kafka data into Object Storage.
This series explains the entire process of building a real-time data pipeline centered on Kafka step by step. Message ingestion, storage, metadata registration, and analysis are connected into a single flow, and each step is written for a real operational environment.
Overall flow: Pub/Sub -> Kafka -> Object Storage -> Data Catalog -> Data Query
① Message processing through Kafka
② Configure Kafka streaming based on Pub/Sub messages
③ Load Kafka data into Object Storage
④ Register a Kafka message table using Data Catalog
⑤ Real-time data analysis based on Data Query