2. Build a Kafka streaming environment with Pub/Sub messages

Receive Pub/Sub messages in Kafka and practice the first step of a streaming-based data pipeline.

Basic information

About this scenario

In this tutorial, you configure a real-time streaming environment that delivers messages published to KakaoCloud Pub/Sub into Kafka. Kafka is a distributed message queue suited to large-scale message processing, and it serves as the starting point for the storage and analysis flows built in later steps.

You will cover the following:

  • Configure a Pub/Sub message producer
  • Connect a Kafka topic and configure a consumer
  • Receive messages and check logs

Before you start

This tutorial assumes that a Kafka cluster and topic have already been configured. If the Kafka environment is not ready, complete Message processing through Kafka first.

Step 1. Create a Pub/Sub topic and configure permissions

Create a Pub/Sub topic and configure it to deliver its messages to a Kafka topic. The Pub/Sub topic is the entry point of the streaming pipeline: it is the first component to receive messages.

  1. Go to KakaoCloud console > Data Streaming > Pub/Sub > Topic.

  2. Click Create Topic, then enter the following.

    Item         Value
    Name         tutorial-pub-topic
    Description  Topic for Kafka integration testing
  3. On the created topic details page, click the Subscription settings > Kafka integration tab.

  4. Click Add Kafka subscription, then enter the values below.

    Item           Value
    Kafka cluster  tutorial-amk-cluster
    Kafka Topic    tutorial-kafka-topic
  5. Save the settings and verify that the status changes to Active.

Step 2. Check the Kafka consumer

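Run a Kafka console consumer to verify that messages delivered from Pub/Sub arrive in the Kafka topic. Replace {KAFKA_BROKER_ENDPOINT} with the bootstrap server address of your Kafka cluster, and leave the consumer running so that you can observe the message published in Step 3.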

$ kafka-console-consumer.sh \
    --bootstrap-server {KAFKA_BROKER_ENDPOINT} \
    --topic tutorial-kafka-topic \
    --from-beginning
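
When many messages arrive, checking the consumer output by eye becomes tedious. The following is a minimal Python sketch of one way to separate JSON records from other output lines. It assumes that each Kafka record value is a single-line JSON object and that Pub/Sub delivers the message body unchanged, neither of which is confirmed by this tutorial. The function name parse_records and the sample lines are hypothetical and used only for illustration.

```python
import json


def parse_records(lines):
    """Split consumer output lines into parsed JSON objects and other lines."""
    parsed, other = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            other.append(text)  # not JSON at all
            continue
        if isinstance(record, dict):
            parsed.append(record)
        else:
            other.append(text)  # valid JSON, but not an object
    return parsed, other


# Hypothetical sample standing in for lines printed by the console consumer.
sample_output = [
    '{"event": "log", "timestamp": "2025-04-21T10:00:00Z", "message": "Test message from Lab01"}',
    "",
    "some non-JSON line",
]

records, other = parse_records(sample_output)
for record in records:
    print("received:", record.get("event"), "-", record.get("message"))
print(len(records), "JSON record(s),", len(other), "other line(s)")
```

In practice, the lines would come from the consumer's standard output, for example by piping it into a script that passes sys.stdin to parse_records. A message value that spans several lines would not be parsed by this line-based approach.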

Step 3. Publish a test message

Publish a test message to the Pub/Sub topic and verify that it is delivered to Kafka correctly.

  1. Go to KakaoCloud console > Data Streaming > Pub/Sub > Topic > tutorial-pub-topic.

  2. Click Publish message and enter the JSON below.

    {
      "event": "log",
      "timestamp": "2025-04-21T10:00:00Z",
      "message": "Test message from Lab01"
    }
  3. Verify that the message is received by the Kafka consumer.
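
If you want to repeat the test with fresh timestamps, the message body can be generated instead of typed by hand. The sketch below builds a JSON string with the same three fields as the test message above and pastes cleanly into the Publish message dialog. The function name build_test_message is hypothetical, and the fixed "log" event value simply mirrors the example; this tutorial does not define a required schema.

```python
import json
from datetime import datetime, timezone


def build_test_message(text, now=None):
    """Return a single-line JSON string shaped like the tutorial's test message."""
    # Use the current UTC time unless a timezone-aware datetime is supplied.
    now = now or datetime.now(timezone.utc)
    payload = {
        "event": "log",
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "message": text,
    }
    return json.dumps(payload)


# Reproduce the tutorial's example by pinning the timestamp.
fixed = datetime(2025, 4, 21, 10, 0, 0, tzinfo=timezone.utc)
print(build_test_message("Test message from Lab01", fixed))
# prints {"event": "log", "timestamp": "2025-04-21T10:00:00Z", "message": "Test message from Lab01"}
```

Calling build_test_message("Test message from Lab01") without the second argument stamps the message with the current UTC time, which makes successive test messages easy to tell apart in the consumer output.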

Wrap-up and next steps

You have completed the basic structure for streaming Pub/Sub messages to Kafka, using the tutorial-amk-cluster cluster and the tutorial-kafka-topic topic. This structure can serve as the foundation for the full data pipeline, including data storage (Object Storage), metadata management (Data Catalog), and analysis (Data Query).

For the next tutorial, see Load Kafka data into Object Storage.

Build a real-time data pipeline series

This series walks step by step through building a real-time data pipeline centered on Kafka. Message ingestion, storage, metadata registration, and analysis are connected into a single flow, and each step is written with a real operational environment in mind.

Overall flow: Pub/Sub -> Kafka -> Object Storage -> Data Catalog -> Data Query

① Message processing through Kafka
② Configure Kafka streaming based on Pub/Sub messages
③ Load Kafka data into Object Storage
④ Register a Kafka message table using Data Catalog
⑤ Real-time data analysis based on Data Query