Predictive model training in Kubeflow Pipelines
This tutorial introduces how to automate machine learning model training using the Kubeflow service on KakaoCloud.
- Estimated time: 10 minutes
- Recommended OS: macOS, Ubuntu
- Notes:
  - In a private network environment, file downloads may not work properly.
About this scenario
This scenario explains the core concepts and features of Kubeflow Pipelines through a hands-on example with real training data. You will learn how to create and combine pipeline components and how to automate workflows for data processing and model training, with a particular focus on automating the model training process step by step so you can build and operate efficient workflows.
Key topics include:
- Understanding the basics of Kubeflow Pipelines and its components
- Creating and running pipelines
- Managing the model training process with Experiments and Runs
Supported tools
Tool | Version | Description |
---|---|---|
KF Pipelines | 2.0.5 | - A core component of Kubeflow that helps build, deploy, and manage machine learning workflows. - Supports fast experimentation and repeatable ML workflows through a simplified interface. - Offers parameter tuning, experiment tracking, and model versioning features. |
For more details on KF Pipelines, refer to the Kubeflow > KF Pipelines official documentation.
Key concepts
- Component: A reusable task unit that supports various languages and libraries. You can combine multiple components to create an experiment.
- Experiment: A full workflow composed of connected components. You can test combinations of parameters and data.
- Run: Executes an Experiment and tracks results of each step. You can retry failed tasks or reuse results from previous runs.
Pipeline management
- Pipeline components are visualized in the form of a Directed Acyclic Graph (DAG).
- All pipelines can be managed as code, either through the SDK or by manually uploading compressed pipeline files; a minimal SDK sketch follows below.
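As a reference for the code-based approach, the sketch below shows what a minimal two-component DAG might look like with the kfp 2.x Python SDK. The component bodies, pipeline name, and file name are placeholder assumptions for illustration, not part of this tutorial's sample manifest.

```python
# Minimal sketch: a two-step DAG (preprocess -> train) defined and compiled with the KFP v2 SDK.
from kfp import dsl, compiler

@dsl.component
def preprocess(raw_path: str) -> str:
    # Placeholder preprocessing step; returns the path of the cleaned data.
    return raw_path

@dsl.component
def train(data_path: str):
    # Placeholder training step.
    print(f"training on {data_path}")

@dsl.pipeline(name="taxi-fare-demo")
def taxi_fare_pipeline(raw_path: str = "data/yellow_taxi.csv"):
    cleaned = preprocess(raw_path=raw_path)
    train(data_path=cleaned.output)

# Compile to a .yaml manifest that can be uploaded from the Pipelines tab.
compiler.Compiler().compile(taxi_fare_pipeline, package_path="taxi_fare_pipeline.yaml")
```

Once uploaded, the compiled YAML is what the dashboard visualizes as a DAG.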
Pipeline images
KakaoCloud provides Kubeflow pipeline images that include various ML frameworks such as TensorFlow and PyTorch. You can also use your own custom Docker images.
The image registry endpoint is bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/(image-name).
For example, to pull the kmlp-tensorflow:v1.0.0.py36.cpu image, use:
bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/kmlp-tensorflow:v1.0.0.py36.cpu
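If you build a custom component on one of these images, the registry path above is what you reference as the container image. A minimal sketch with the kfp 2.x SDK is shown below; the component name and body are hypothetical.

```python
from kfp import dsl

# Hypothetical component that runs on the KakaoCloud-provided TensorFlow image.
# Swap the tag for any image listed in the table below.
@dsl.component(
    base_image="bigdata-150.kr-central-2.kcr.dev/kc-kubeflow/kmlp-tensorflow:v1.0.0.py36.cpu"
)
def check_tf_version() -> str:
    import tensorflow as tf  # provided by the base image
    return tf.__version__
```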
Supported pipeline images
Image name | Framework | Version | GPU Supported |
---|---|---|---|
kmlp-tensorflow:v1.8.0.py38.cpu.1a | tensorflow | 2.13.1 | X |
kmlp-tensorflow:v1.8.0.py38.cuda.1a | tensorflow | 2.13.1 | O |
kmlp-tensorflow:v1.8.0.py311.cpu.1a | tensorflow | 2.15.1 | X |
kmlp-tensorflow:v1.8.0.py311.cuda.1a | tensorflow | 2.15.1 | O |
kmlp-pytorch:v1.8.0.py38.cpu.1a | pytorch | 2.3.0 | X |
kmlp-pytorch:v1.8.0.py38.cuda.1a | pytorch | 2.3.0 | O |
kmlp-pytorch:v1.8.0.py311.cpu.1a | pytorch | 2.3.0 | X |
kmlp-pytorch:v1.8.0.py311.cuda.1a | pytorch | 2.3.0 | O |
kmlp-pyspark-tensorflow:v1.8.0.py38.cpu.1a | tensorflow | 2.13.1 | X |
kmlp-pyspark-tensorflow:v1.8.0.py38.cuda.1a | tensorflow | 2.13.1 | O |
kmlp-pyspark-tensorflow:v1.8.0.py311.cpu.1a | tensorflow | 2.15.1 | X |
kmlp-pyspark-tensorflow:v1.8.0.py311.cuda.1a | tensorflow | 2.15.1 | O |
kmlp-pyspark-pytorch:v1.8.0.py38.cpu.1a | pytorch | 2.3.0 | X |
kmlp-pyspark-pytorch:v1.8.0.py38.cuda.1a | pytorch | 2.3.0 | O |
kmlp-pyspark-pytorch:v1.8.0.py311.cpu.1a | pytorch | 2.3.0 | X |
kmlp-pyspark-pytorch:v1.8.0.py311.cuda.1a | pytorch | 2.3.0 | O |
Before you start
1. Prepare training dataset
This tutorial uses TLC Trip Record Data from New York City and a sample pipeline manifest file for a simple preprocessing and training pipeline exercise.
Item | Description |
---|---|
Goal | Build a taxi fare prediction model |
Data | NYC Yellow Taxi fare data (2009–2015), including pickup/drop-off time and location, trip distance, fare, payment type, passenger count, etc. |
2. Prepare Kubeflow environment
This tutorial uses a GPU node pool environment.
If you haven't set up Kubeflow yet, follow the Kubeflow setup guide to create the environment.
Getting started
Here's how to create an Experiment and Run in Kubeflow and build the training pipeline:
Step 1. Create pipeline
Instructions for creating a pipeline using the sample manifest file:
1. Access the Kubeflow dashboard and click the Pipelines tab. Then, click [Upload pipeline].
2. Select Upload a file and upload the .yaml manifest file downloaded in the Before you start section.
For more details, see Kubeflow > Kubeflow Pipeline > Quick Start.
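The upload can also be scripted with the kfp SDK instead of the dashboard. The sketch below assumes a kfp 2.x client; the endpoint URL, file name, and pipeline name are placeholders, and authentication details depend on your environment.

```python
import kfp

# Placeholder endpoint; use your Kubeflow dashboard's pipeline API address.
client = kfp.Client(host="https://<your-kubeflow-domain>/pipeline")

pipeline = client.upload_pipeline(
    pipeline_package_path="taxi_fare_pipeline.yaml",  # the sample manifest file
    pipeline_name="taxi-fare-prediction",
)
```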
Step 2. Create Experiment
Create an Experiment from either the Experiments (KFP) tab or the details page of a specific pipeline in the Pipelines tab.
1. Access the Kubeflow dashboard and select the pipeline where you want to create an Experiment from the Pipelines tab.
2. In the pipeline detail view, click [Create experiment].
3. Enter the Experiment name and click [Next].
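Experiments can also be created programmatically. A minimal sketch, reusing the client object from the upload example and a placeholder experiment name:

```python
# Create an experiment to group the runs of this tutorial.
experiment = client.create_experiment(name="taxi-fare-experiment")
```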
Step 3. Create and manage Run
After creating the Experiment, you will proceed to the Run creation step.
If you want to create a Run later, you can do so using one of the following methods:
- Click [Create run] from the Runs tab.
- In the Pipelines tab, go to the pipeline detail page and click [Create run].
- In the Experiments (KFP) tab, go to the experiment detail page and click [Create run].
1. On the Start a run screen, enter the required information and click [Start].
   - Since the manifest file was uploaded in Step 1, the values on this screen are auto-filled.
2. Move to the Runs tab and select the created Run to access the detailed view. You can check all Run-related information on this screen.
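A Run can likewise be started from code. The sketch below assumes the pipeline and experiment objects from the earlier snippets; if the pipeline requires parameters, pass them through the params argument.

```python
# Start a run of the uploaded pipeline inside the experiment created above.
run = client.run_pipeline(
    experiment_id=experiment.experiment_id,
    job_name="taxi-fare-run-1",            # display name of the run (placeholder)
    pipeline_id=pipeline.pipeline_id,
)

# Optionally block until the run finishes (timeout in seconds).
client.wait_for_run_completion(run_id=run.run_id, timeout=3600)
```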
Step 4. Manage run results
1. In the Kubeflow dashboard, go to the Runs tab, select the Run to archive, and click [Archive].
2. Archived runs can be found under the “Archived” filter in the Runs tab. To restore a Run, select it and click [Restore].
Step 5. Delete a Run
Once the experiment is complete or no longer needed, it's good practice to delete unused resources:
1. In the Runs tab, select the Run to delete and click [Archive].
2. From the Archived tab, select the Run and click [Delete].
3. You can verify that the corresponding pod has also been deleted.
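If you are cleaning up from code instead of the UI, the kfp client can delete a run directly. A minimal sketch, assuming the run object from the earlier snippet:

```python
# Permanently delete the run.
client.delete_run(run_id=run.run_id)
```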
Step 6. Archive an Experiment
1. Go to the Experiments (KFP) tab in the Kubeflow dashboard and select the Experiment to archive.
2. In the detail view, click [Archive] in the upper-right corner.
3. You can view archived experiments in the “Archived” section of the Experiments tab. To restore one, click [Restore].
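Experiment archiving is also exposed by the kfp client. A minimal sketch, assuming the experiment object created earlier:

```python
# Archive the experiment; it moves to the "Archived" section of the Experiments tab.
client.archive_experiment(experiment_id=experiment.experiment_id)

# Restore it later if needed.
client.unarchive_experiment(experiment_id=experiment.experiment_id)
```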
Step 7. Delete a pipeline
After completing the tutorial or if a pipeline is no longer used, delete it as follows:
1. Access the Pipelines tab in the Kubeflow dashboard.
2. From the list view, select the pipeline to delete and click the [Delete] button in the top-right corner.
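Pipeline deletion is also available from code. A minimal sketch, reusing the pipeline object from the upload step:

```python
# Remove the pipeline definition from Kubeflow Pipelines.
client.delete_pipeline(pipeline_id=pipeline.pipeline_id)
```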
For more details on Kubeflow Pipelines, see the official Kubeflow Pipelines documentation.