Query DNS resolver query logs with Data Query
Learn how to query DNS resolver query logs stored in Object Storage using Data Catalog and Data Query with SQL.
- Estimated time: 30 minutes
- Prerequisites
- Create an Object Storage bucket to store DNS resolver query logs
About this scenario
This tutorial walks you through storing DNS resolver query logs from a VPC in Object Storage, then querying those logs with SQL using Data Catalog and Data Query.
Key topics covered in this tutorial:
- Configure DNS resolver query logs to be stored in an Object Storage bucket
- Create a database in Data Catalog connected to the log bucket
- Create a table, synchronize partitions, and query log data in Data Query
- Appendix: Example queries
Before you start
To query DNS resolver query logs with Data Query, you must first create an Object Storage bucket. Prepare a bucket by referring to Create an Object Storage bucket, then proceed with the steps below.
Getting started
Step 1. Enable DNS resolver query log storage
Configure query logging for the DNS resolver in your VPC by referring to Configure DNS resolver query logging.
Step 2. Configure Data Catalog
In Data Catalog, create a catalog and a database connected to the bucket where logs are stored.
-
Go to KakaoCloud console > Analytics > Data Catalog.
-
In the Catalog menu, click Create catalog.
-
In the Create catalog pop-up, enter the following information, then click Create.
Category Value Type Standard catalog Name Catalog name, for example catalogVPC settings Select the network for the catalog. Only one catalog can be created per VPC-subnet. -
Click the catalog you created, such as
catalog, then click Create database in the database list. -
In the Create database pop-up, enter the following information, then click Create.
Category Value Type Standard database Catalog Select the catalogcreated aboveName Database name, for example dns_queryPath S3 connection: selected / Bucket name: enter the Object Storage bucket name / Directory: KCLogs/kr-central-2Properties (optional) - Description (optional) - NoteDNS resolver query logs are stored in Object Storage in the specified bucket using the following format.
{bucket-name}/KCLogs/{region-name}/{year=yyyy/month=mm/day=dd}/{az-name}_{project-id}_{vpc-id}_{logger-id}_{start-time}_{end-time}.log.gz
Step 3. Query data with Data Query
Create a table through the database created in Data Catalog, synchronize partitions, and query log data.
-
Go to KakaoCloud console > Analytics > Data Query > Query Editor.
-
Select the data source and database as follows.
Category Value Data source Select the Data Catalog catalog catalogDatabase Select the Data Catalog database dns_query -
Create a table in the query editor on the right.
Create tableCREATE TABLE ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME} (
availability_zone varchar,
project_id varchar,
query_class varchar,
query_name varchar,
record_type varchar,
region varchar,
response_code varchar,
response_count varchar,
response_protocol varchar,
response_size varchar,
response_time varchar,
timestamp varchar,
version varchar,
vpc_id varchar,
year varchar,
month varchar,
day varchar
)
WITH (
format = 'JSON',
external_location = 's3a://${BUCKET_NAME}/KCLogs/kr-central-2', -- Only the kr-central-2 region is currently supported.
partitioned_by = ARRAY['year', 'month', 'day']
);환경변수 설명 CATALOG_NAME🖌︎ Catalog name created in Data Catalog DATABASE_NAME🖌︎ Database name created in Data Catalog TABLE_NAME🖌︎ Table name BUCKET_NAME🖌︎ Bucket name -
Open a new query tab and run the partition synchronization query. If you run it in the same tab as the table creation query, partitions may not be recognized correctly.
Synchronize partitionsCALL system.sync_partition_metadata(
schema_name => '${DATABASE_NAME}',
table_name => '${TABLE_NAME}',
mode => 'FULL',
case_sensitive => true
);환경변수 설명 DATABASE_NAME🖌︎ Database name created in Data Catalog TABLE_NAME🖌︎ Table name -
After partition synchronization succeeds, query the partitions.
Query partitionsSELECT *
FROM ${CATALOG_NAME}.${DATABASE_NAME}."${TABLE_NAME}$partitions";환경변수 설명 CATALOG_NAME🖌︎ Catalog name created in Data Catalog DATABASE_NAME🖌︎ Database name created in Data Catalog TABLE_NAME🖌︎ Table name If the query returns partitions by year, month, and day, the configuration is working correctly.
Appendix
Appendix 1. Example queries
SELECT *
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}' AND month = '${MONTH}' AND day = '${DAY}';
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT query_name, COUNT(*) AS query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}' AND month = '${MONTH}' AND day = '${DAY}'
GROUP BY query_name
ORDER BY query_count DESC
LIMIT 10;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
SUBSTR("timestamp", 12, 2) AS hour,
COUNT(*) AS failed_query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
AND response_code != 'NOERROR'
GROUP BY SUBSTR("timestamp", 12, 2)
ORDER BY failed_query_count DESC
LIMIT 10;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
query_name,
COUNT(*) AS failed_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
AND response_code != 'NOERROR'
GROUP BY query_name
ORDER BY failed_count DESC
LIMIT 5;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
query_name,
response_time,
response_code,
response_size,
response_protocol
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
AND regexp_like(response_time, '^[0-9]+(\.[0-9]+)?$')
ORDER BY CAST(response_time AS DOUBLE) DESC
LIMIT 10;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
-- HAVING COUNT(*) > 1000 is a reference threshold. Adjust it for your environment.
SELECT
vpc_id,
query_name,
COUNT(*) AS query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
GROUP BY vpc_id, query_name
HAVING COUNT(*) > 1000
ORDER BY query_count DESC
LIMIT 10;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
SUBSTR("timestamp", 12, 2) AS hour,
COUNT(*) AS query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
GROUP BY SUBSTR("timestamp", 12, 2)
ORDER BY hour;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
day,
COUNT(*) AS query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
GROUP BY day
ORDER BY day;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
SELECT
response_code,
COUNT(*) AS request_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
GROUP BY response_code
ORDER BY request_count DESC;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
-- Aggregate suspicious queries that include DROP, DELETE, or UNION in query_name.
SELECT
query_name,
COUNT(*) AS occurrence
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
AND (query_name LIKE '%DROP%' OR query_name LIKE '%DELETE%' OR query_name LIKE '%UNION%')
GROUP BY query_name
ORDER BY occurrence DESC;
-- Check TXT records with unusually long query_name values. The 100-character threshold is a reference value. Adjust it for your environment.
SELECT
query_name,
response_size,
response_code,
LENGTH(query_name) AS query_length
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
AND record_type = 'TXT'
AND LENGTH(query_name) > 100
ORDER BY query_length DESC
LIMIT 20;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |
SELECT
record_type,
COUNT(*) AS query_count
FROM ${CATALOG_NAME}.${DATABASE_NAME}.${TABLE_NAME}
WHERE year = '${YEAR}'
AND month = '${MONTH}'
AND day = '${DAY}'
GROUP BY record_type
ORDER BY query_count DESC;
| 환경변수 | 설명 |
|---|---|
| CATALOG_NAME🖌︎ | Catalog name created in Data Catalog |
| DATABASE_NAME🖌︎ | Database name created in Data Catalog |
| TABLE_NAME🖌︎ | Table name |
| YEAR🖌︎ | Year, for example 2026 |
| MONTH🖌︎ | Month, for example 04 |
| DAY🖌︎ | Day, for example 30 |