
Training a predictive model using Kubeflow Pipelines

This guide introduces how to automate machine learning model training processes using pipelines in the KakaoCloud Kubeflow environment.

Basic information

Before starting

This tutorial introduces the core concepts and features of Kubeflow Pipelines through a sample dataset and hands-on examples. It walks through creating and combining pipeline components, processing data, and training a model. By following this guide, you can learn how machine learning workflows operate and how to automate them.

About this scenario

This tutorial explains how to create and use Kubeflow Pipelines to automate the predictive model training process step-by-step. The key topics include:

  • Understanding the basics of Kubeflow Pipelines and components
  • Creating and running pipelines
  • Managing model training processes using 'Experiment' and 'Run'

Supported tools

| Tool | Version | Description |
| --- | --- | --- |
| KF Pipelines | 2.0.5 | - A core component of Kubeflow that builds, deploys, and manages machine learning workflows.<br/>- Provides a simplified interface for rapid experimentation and repeatable workflows.<br/>- Offers features such as parameter tuning, experiment management, and model versioning. |
info

For more details, refer to the Kubeflow > KF Pipelines official documentation.

Key Concepts

  • Component: A reusable task unit that supports various languages and libraries. Multiple components can be combined into an experiment.
  • Experiment: A collection of components that forms a complete workflow. You can test various parameter and data combinations.
  • Run: A single execution of an experiment that tracks the results of each stage. Failed tasks can be re-executed, and previous results can be reused.

Pipeline management

  • Pipeline components are visualized as a Directed Acyclic Graph (DAG).
  • All pipelines are managed as code through the SDK or by manually uploading compressed files.

Pipeline images

Kubeflow pipeline images supported by KakaoCloud include frameworks such as TensorFlow and PyTorch. You can also use custom Docker images as needed.

info

The image registry endpoint is bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/(image_name).
For example, to pull the kmlp-tensorflow:1.0.0.py36.cpu image, use bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/kmlp-tensorflow:1.0.0.py36.cpu.
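
In other words, the full pull reference is simply the registry endpoint joined to the image name. The small helper below is illustrative only, showing the composition:

```python
# KakaoCloud Kubeflow image registry endpoint (from the note above).
REGISTRY = "bigdata-150.kr-central-2.kcr.dev/kc-kubeflow"

def image_ref(image_name: str) -> str:
    """Compose the full pull reference for a KakaoCloud-hosted pipeline image."""
    return f"{REGISTRY}/{image_name}"

print(image_ref("kmlp-tensorflow:1.0.0.py36.cpu"))
# bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/kmlp-tensorflow:1.0.0.py36.cpu
```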

Supported pipeline information
| Image name | Framework | Version | GPU support |
| --- | --- | --- | --- |
| kmlp-tensorflow:1.8.0.py38.cpu.1a | tensorflow | 2.13.1 | X |
| kmlp-tensorflow:1.8.0.py38.cuda.1a | tensorflow | 2.13.1 | O |
| kmlp-tensorflow:1.8.0.py311.cpu.1a | tensorflow | 2.15.1 | X |
| kmlp-tensorflow:1.8.0.py311.cuda.1a | tensorflow | 2.15.1 | O |
| kmlp-pytorch:1.8.0.py38.cpu.1a | pytorch | 2.3.0 | X |
| kmlp-pytorch:1.8.0.py38.cuda.1a | pytorch | 2.3.0 | O |
| kmlp-pytorch:1.8.0.py311.cpu.1a | pytorch | 2.3.0 | X |
| kmlp-pytorch:1.8.0.py311.cuda.1a | pytorch | 2.3.0 | O |
| kmlp-pyspark-tensorflow:1.8.0.py38.cpu.1a | tensorflow | 2.13.1 | X |
| kmlp-pyspark-tensorflow:1.8.0.py38.cuda.1a | tensorflow | 2.13.1 | O |
| kmlp-pyspark-tensorflow:1.8.0.py311.cpu.1a | tensorflow | 2.15.1 | X |
| kmlp-pyspark-tensorflow:1.8.0.py311.cuda.1a | tensorflow | 2.15.1 | O |
| kmlp-pyspark-pytorch:1.8.0.py38.cpu.1a | pytorch | 2.3.0 | X |
| kmlp-pyspark-pytorch:1.8.0.py38.cuda.1a | pytorch | 2.3.0 | O |
| kmlp-pyspark-pytorch:1.8.0.py311.cpu.1a | pytorch | 2.3.0 | X |
| kmlp-pyspark-pytorch:1.8.0.py311.cuda.1a | pytorch | 2.3.0 | O |

Prework

1. Prepare training dataset

This tutorial uses the publicly available New York City TLC Trip Record Data, together with example pipeline manifest files, to practice a simple preprocessing and training pipeline.

| Item | Description |
| --- | --- |
| Goal | Implement a taxi fare prediction model |
| Data information | NYC Taxi and Limousine Commission's data from 2009 to 2015<br/>- Contains pickup/dropoff times and locations, trip distance, fare amount, payment type, passenger count, etc. |

Original dataset information

2. Prepare the Kubeflow environment

This tutorial uses a GPU node pool environment.

If a Kubeflow service or suitable environment is not yet set up, refer to the Create and manage Kubeflow document to create one.


Step-by-step process

The detailed steps for creating an 'Experiment' and 'Run' and building a training pipeline in the Kubeflow environment are as follows.

Step 1. Create pipeline

Learn how to create a pipeline using the provided manifest file.

  1. After accessing the Kubeflow dashboard, click the Pipelines tab and then the [Upload pipeline] button.

    Image. Access the Pipelines tab in the Kubeflow dashboard

  2. Select the Upload a file option and upload the practice manifest file (zip) prepared in the Prework section.

    Image. Upload a pipeline

info

For more detailed instructions on creating a pipeline, refer to the Kubeflow > Kubeflow Pipeline > Quick Start document.
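
The upload step can also be scripted with the KFP SDK instead of the dashboard. Below is a minimal sketch built on `kfp.Client.upload_pipeline`; the wrapper function, endpoint URL, file path, and pipeline name are all placeholders, not values from this tutorial.

```python
def upload_pipeline(client, package_path: str, name: str):
    """Upload a compiled pipeline package (.zip or .yaml) under the given name.

    `client` is expected to be a kfp.Client connected to your Kubeflow
    Pipelines endpoint, e.g. kfp.Client(host="http://<endpoint>/pipeline").
    """
    return client.upload_pipeline(
        pipeline_package_path=package_path,
        pipeline_name=name,
    )
```

Keeping the upload behind a small function like this makes it easy to call from CI, so pipeline definitions stay in sync with the code that produced them.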

Step 2. Create Experiment

Create an Experiment in the Experiments (KFP) tab or the pipeline detail page within the Pipelines tab of the Kubeflow dashboard.

  1. After accessing the Kubeflow dashboard, select the pipeline from the Pipelines tab where you want to create the experiment.

  2. Click the [Create experiment] button on the pipeline detail page.

    Image. Pipeline detail page

  3. Enter the Experiment name and click [Next].

Step 3. Create and manage Run

After creating an Experiment, proceed to the Run creation phase.

info

If you want to create a Run later, use one of the following methods:

  • Click the [Create run] button in the Runs tab.
  • Access the pipeline detail page in the Pipelines tab and click [Create run].
  • Access the experiment detail page in the Experiments (KFP) tab and click [Create run].

  1. In the Start a run screen, fill in the necessary information and click [Start].

    • In this practice, all fields are filled in automatically because the manifest file was uploaded in Step 1.

    Image. Start a run

  2. Navigate to the Runs tab and select the created Run to access its details. You can check detailed information about the Run on this page.

    Image. Run details page
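
Runs can also be created programmatically rather than through the Start a run screen. The sketch below wraps the KFP SDK's `create_run_from_pipeline_package`; the wrapper function and all argument values are placeholders:

```python
def start_run(client, package_path: str, run_name: str,
              experiment_name: str, params: dict):
    """Create a Run for a pipeline package under the given Experiment.

    `client` is expected to be a kfp.Client connected to your Kubeflow
    Pipelines endpoint; `params` maps pipeline parameter names to values.
    """
    return client.create_run_from_pipeline_package(
        pipeline_file=package_path,
        arguments=params,
        run_name=run_name,
        experiment_name=experiment_name,
    )
```

This is convenient for kicking off repeated Runs with different parameter combinations, which is exactly the Experiment/Run workflow described above.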

Step 4. Manage Run results

  1. In the Runs tab of the Kubeflow dashboard, select the Run you want to archive from the list, and click the [Archive] button.

    Image. Archive a run

  2. Archived runs can be viewed in the Archived section of the Runs tab. You can restore a run by selecting it and clicking the [Restore] button.

    Image. Restore a run

Step 5. Delete Run

To free up resources when a Run is complete or no longer needed, delete it as follows.

  1. In the Runs tab of the Kubeflow dashboard, select the Run you want to delete, and click the [Archive] button.

    Image. Archive a run

  2. Archived runs can be viewed in the Archived section of the Runs tab. You can delete the Run by selecting it and clicking the [Delete] button.

    Image. Delete a run

  3. Upon deleting the run, verify that the pods have been deleted.

    Image. Confirm run deletion

Step 6. Archive Experiment

  1. In the Experiments (KFP) tab of the Kubeflow dashboard, select the experiment you want to archive from the list.

  2. On the experiment detail page, click the [Archive] button at the top right.

    Image. Archive an experiment

  3. Archived experiments can be viewed in the Archived section of the Experiments tab. You can restore an experiment by selecting it and clicking the [Restore] button.

    Image. Restore an experiment

Step 7. Delete pipeline

To remove unused pipelines, follow the steps below.

  1. In the Pipelines tab of the Kubeflow dashboard, select the pipeline you want to delete.

  2. Click the [Delete] button at the top right to delete it.

    Image. Delete a pipeline

info

For more detailed instructions, refer to the Kubeflow Pipelines documentation.