Create and Manage Crawlers

Create crawler

In Data Catalog, a crawler scans MySQL data, extracts metadata, and automatically updates the Data Catalog to simplify data discovery. Before creating a crawler, you must have a database and a MySQL instance.

info

For instructions on creating a database and a MySQL instance, see:

To create a crawler in the Data Catalog service:

  1. Go to the KakaoCloud console > Analytics > Data Catalog.
  2. In Crawlers, click Create crawler.
  3. In Create crawler, enter the required information and click Create.
    | Field | Description |
    | --- | --- |
    | Database | The name of the database to which tables will be added.<br>- Only databases in the ACTIVE state are listed.<br>- After selecting a database, you can check its network/subnet information.<br>- Crawlers are not supported for Iceberg-type catalogs. |
    | Crawler name | Name of the crawler |
    | MySQL full path | Select the MySQL instance to connect to and enter the database name on that instance.<br>- Only MySQL instances in the AVAILABLE state are listed. |
    | MySQL account | Enter the username and password configured when creating the MySQL instance.<br>- Connection test: after entering the MySQL full path and account information, click Test to verify the connection.<br>* If the connection test does not complete successfully, you cannot create the crawler. |
    | Description (optional) | Additional description for the crawler |
    | Table prefix (optional) | A prefix added to created table names. Tables are named Prefix + MySQL database name_table name.<br>- Allowed characters: lowercase letters, digits, and underscore (_) only (1–64 chars) |
    | Schedule | Manage when the crawler runs.<br>- On-demand crawlers run only when triggered manually and have no schedule. |
info

Only resources whose state is normal (ACTIVE / AVAILABLE) are listed for databases and MySQL.
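The table-prefix rule and the naming convention above (Prefix + MySQL database name_table name) can be sketched as follows. This is a hypothetical helper for illustration, not a KakaoCloud API:

```python
import re

# Allowed prefix: lowercase letters, digits, underscore; 1-64 characters
PREFIX_RE = re.compile(r"^[a-z0-9_]{1,64}$")

def catalog_table_name(mysql_db: str, table: str, prefix: str = "") -> str:
    """Derive the catalog table name: Prefix + MySQL database name_table name."""
    if prefix and not PREFIX_RE.fullmatch(prefix):
        raise ValueError("prefix must be 1-64 chars of lowercase letters, digits, or '_'")
    return f"{prefix}{mysql_db}_{table}"

print(catalog_table_name("sales", "orders", prefix="crawl_"))  # crawl_sales_orders
```

For example, with the prefix `crawl_`, a table `orders` in the MySQL database `sales` is registered in the catalog as `crawl_sales_orders`.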

Manage crawlers

This section explains how to manage crawlers in the Data Catalog service.

View crawler list

View the list of crawlers currently in use.

  1. Go to the KakaoCloud console > Analytics > Data Catalog.

  2. Click Crawlers to see the list.

    | Column | Description |
    | --- | --- |
    | Name | The crawler name entered at creation.<br>- Click the crawler name to open its Details tab. |
    | Description | Description entered at creation |
    | Status | Crawler status |
    | Schedule | The schedule on which the crawler runs |
    | Last run status | Status of the most recent run |
    | Last run time | Timestamp of the most recent run |
    | [More] icon | - Edit: modify the crawler description and schedule<br>- Run: run the crawler manually<br>- Delete: delete the crawler<br>* While the crawler state is CREATING / ALTERING / DELETING / RUNNING, you cannot edit, run, or delete it. |
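The availability rule above (Edit/Run/Delete are unavailable while the crawler is in a transitional state) can be sketched as a small check. The state names come from this document; the helper itself is illustrative:

```python
# States during which a crawler cannot be edited, run, or deleted
BUSY_STATES = {"CREATING", "ALTERING", "DELETING", "RUNNING"}

def actions_allowed(state: str) -> bool:
    """Return True if Edit/Run/Delete are available for the given crawler state."""
    return state.upper() not in BUSY_STATES

print(actions_allowed("ACTIVE"))   # True
print(actions_allowed("RUNNING"))  # False
```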

View crawler details

Check detailed information about a crawler.

  1. Go to the KakaoCloud console > Analytics > Data Catalog.
  2. Click Crawlers, then select the crawler whose details you want to view.
  3. Review the crawler’s details.

View crawler run history

Check the run history of a crawler.

  1. Go to the KakaoCloud console > Analytics > Data Catalog.

  2. Click Crawlers, then select the crawler whose history you want to view.

  3. In the details page, click the Run history tab to review past runs.

    info

    Crawler run history is retained for up to 90 days. Records older than 90 days are deleted automatically.

    | Column | Description |
    | --- | --- |
    | Start time | When the crawl started |
    | End time | When the crawl finished |
    | Duration | How long the crawler ran |
    | Status | Status of the run<br>- Succeeded: the crawl finished successfully<br>- Running: the crawl is in progress<br>- Failed: the crawl failed |

Delete a crawler

Delete crawlers you no longer need.

caution

Deleted crawlers and their run histories cannot be restored. If a catalog is deleted, its crawlers are deleted automatically.

  1. Go to the KakaoCloud console > Analytics > Data Catalog.
  2. Click Crawlers, then in the list click the [More] icon for the crawler you want to remove and select Delete.
  3. In the Delete dialog, enter the crawler name exactly and click Delete.