Key concepts
Kubeflow on KakaoCloud is an open-source platform that makes it easy to build and run machine learning (ML) workflows in a cloud-native environment. Because it is built on Kubernetes, it uses Kubernetes' cluster management capabilities to simplify and streamline ML workflow management.
It provides key components for ML workflow stages such as data preprocessing, model training, and model serving, so developers can build ML models quickly through a consistent interface and high-level abstractions.
Resource structure
Item | Description |
---|---|
⓵ Cloud | Runs in the KakaoCloud environment. Utilizes cloud infrastructure to simplify ML tasks. |
⓶ Kubernetes engine and cluster | Kubeflow is based on the Kubernetes engine and its clusters. Enables easier ML model development, deployment, and management. |
⓷ Kubeflow application | Provides essential functionality for developing, tuning, deploying, and managing ML models |
⓸ Kubeflow scaffolding | Provides functions necessary for ML model deployment and management |
⓹ Machine learning tools | Supports various ML tools. Users can choose preferred tools for ML tasks. |
ML workflow
Phase | Step | Description |
---|---|---|
Experiment phase | Data collection and preprocessing | Collect and preprocess data for ML model training |
Experiment phase | Data transformation | Transform data into a format that models can understand, reduce size, extract features |
Experiment phase | Model coding | Write code to develop models based on selected algorithms |
Experiment phase | Model training | Train models on data, try different hyperparameters and compare models |
Experiment phase | Hyperparameter tuning | Adjust hyperparameters to find optimal model |
Production phase | Model training and evaluation | Train the final model and evaluate performance |
Production phase | Model deployment | Deploy trained models to provide prediction services |
Production phase | Monitoring and management | Monitor performance and stability of models, respond to issues |
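As a rough illustration of the "Model training" and "Hyperparameter tuning" steps in the experiment phase, the sketch below trains a toy model once per candidate hyperparameter and keeps the best one. It is plain Python and does not use Kubeflow or Katib. The dataset, the `train` and `tune` functions, and the candidate learning rates are all invented for this example.

```python
# Toy illustration of "train, try different hyperparameters, compare models".
# Plain Python only; nothing here is a Kubeflow or Katib API.

def train(data, learning_rate, epochs=200):
    """Fit y = w * x by gradient descent and return (w, mean squared error)."""
    w = 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / n
        w -= learning_rate * grad
    mse = sum((w * x - y) ** 2 for x, y in data) / n
    return w, mse


def tune(data, learning_rates):
    """Train once per candidate and pick the one with the lowest error."""
    results = {lr: train(data, lr) for lr in learning_rates}
    best_lr = min(results, key=lambda lr: results[lr][1])
    return best_lr, results


# Hypothetical dataset that follows y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
best_lr, results = tune(data, [0.0001, 0.01, 0.1])
best_w, best_mse = results[best_lr]
print(f"best learning rate: {best_lr}, w = {best_w:.3f}")
```

In a real workflow, a tool such as Katib would run each trial as a separate job instead of a loop in one process. The compare-and-select logic stays the same.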
Kubeflow lifecycle and status
Kubeflow lifecycle
Status | Description | Label |
---|---|---|
Creating | Kubeflow resources are being created | Yellow |
Active | Kubeflow resources have been successfully created and are active | Green |
Failed | Creation failed or unexpected error occurred | Red |
Expired | Resources expired or associated cluster deleted | Red |
Terminating | Resources are being terminated | Yellow |
Terminated | Resources have been terminated | Gray |
For information about the status of clusters, node pools, and nodes used with Kubeflow, refer to Kubernetes Engine > Resource status information.
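Client code that reacts to these statuses can be sketched with an enumeration that mirrors the table above. The class name, the `parse_status` helper, and the `is_transitional` grouping are assumptions made for illustration. They are not part of any KakaoCloud API.

```python
from enum import Enum


class KubeflowStatus(Enum):
    """Lifecycle statuses and console label colors, copied from the table."""

    CREATING = ("Creating", "Yellow")
    ACTIVE = ("Active", "Green")
    FAILED = ("Failed", "Red")
    EXPIRED = ("Expired", "Red")
    TERMINATING = ("Terminating", "Yellow")
    TERMINATED = ("Terminated", "Gray")

    def __init__(self, display_name, label_color):
        self.display_name = display_name
        self.label_color = label_color

    @property
    def is_transitional(self):
        # Assumption: "being created" and "being terminated" are in-progress states.
        return self in (KubeflowStatus.CREATING, KubeflowStatus.TERMINATING)


def parse_status(name):
    """Look up a status by its display name, ignoring case and whitespace."""
    wanted = name.strip().lower()
    for status in KubeflowStatus:
        if status.display_name.lower() == wanted:
            return status
    raise ValueError(f"unknown Kubeflow status: {name!r}")


status = parse_status("active")
print(status.display_name, status.label_color, status.is_transitional)
```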
Kubeflow user and group status
Status | Description | Label |
---|---|---|
Pending | User or group is being created/modified | Yellow |
Active | Creation/modification completed successfully | Green |
Failed | Failed due to error during user/group operations | Red |
Deleted | Deletion completed successfully | Not displayed |
Kubeflow component architecture
Components in Kubeflow are small modular units that manage the ML workflow. They offer flexibility, scalability, and automation for tasks such as model development and deployment.
Component type | Description |
---|---|
Dashboard | Web console for accessing Kubeflow components |
JupyterLab | Web-based ML development tool integrated with Kubeflow SDK |
Kubeflow Pipelines | Visual console for managing ML workflow steps such as preprocessing, training, and serving |
Katib | Hyperparameter optimization tool for model training in distributed environments |
KServe | Serves models trained in Kubeflow via REST API |
Training Operator | Supports distributed training for frameworks such as TensorFlow and PyTorch |
Model Registry | Stores metadata for model registration and versioning, integrates with other components |
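To make "serves models via REST API" more concrete, the sketch below builds a prediction request in the shape of the KServe V1 inference protocol (`POST /v1/models/<name>:predict` with an `instances` payload). It only constructs the request and does not send it. The endpoint host, the model name, and the input values are hypothetical. Whether a given KakaoCloud deployment exposes the V1 protocol, and what authentication it requires, is not stated here and should be checked against the actual service.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances):
    """Build (but do not send) a V1-protocol predict request."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    "http://sklearn-iris.example.com",  # hypothetical endpoint
    "sklearn-iris",                     # hypothetical model name
    [[6.8, 2.8, 4.8, 1.4]],             # hypothetical feature vector
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, plus whatever authentication headers or cookies the deployment requires.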
Supported components by version and service type
Version | Service type | Components and versions |
---|---|---|
1.6.0 | Essential + Hyper Param Tuning (HPT) + Serving API | JupyterLab 3.2.9, KF Pipelines 2.0.0-alpha.5, Katib 0.15.0, Tensorboard 2.1.0, KServe 0.8.0 |
1.8.0 | Essential + Hyper Param Tuning (HPT) + Serving API | JupyterLab 4.2.1, KF Pipelines 2.0.5, Katib 0.16.0, Tensorboard 2.5.1, KServe 0.11.2 |
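When scripts need to pin SDK or client versions to match an instance, the version table can be kept as a small lookup structure. The sketch below copies the versions listed above. The dictionary name and the two helper functions are invented for this example.

```python
# Component versions per Kubeflow version, copied from the table above.
SUPPORTED_COMPONENTS = {
    "1.6.0": {
        "JupyterLab": "3.2.9",
        "KF Pipelines": "2.0.0-alpha.5",
        "Katib": "0.15.0",
        "Tensorboard": "2.1.0",
        "KServe": "0.8.0",
    },
    "1.8.0": {
        "JupyterLab": "4.2.1",
        "KF Pipelines": "2.0.5",
        "Katib": "0.16.0",
        "Tensorboard": "2.5.1",
        "KServe": "0.11.2",
    },
}


def component_version(kubeflow_version, component):
    """Return the bundled version of a component, or raise LookupError."""
    try:
        return SUPPORTED_COMPONENTS[kubeflow_version][component]
    except KeyError:
        raise LookupError(
            f"no entry for {component!r} in Kubeflow {kubeflow_version!r}"
        ) from None


def changed_components(old, new):
    """Map each component whose version differs to an (old, new) pair."""
    before, after = SUPPORTED_COMPONENTS[old], SUPPORTED_COMPONENTS[new]
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


print(component_version("1.8.0", "KServe"))
print(changed_components("1.6.0", "1.8.0"))
```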
Manage Kubeflow roles
Kubeflow roles define the level of access and control over the console, dashboard, and namespaces.
Roles include Owner, User, and Group User. A single user may have multiple roles.
Role types at the Kubeflow level
To become a Kubeflow Owner, IAM project member permissions are required.
Removing IAM permissions after becoming an Owner may prevent console and dashboard access.
Role | Description |
---|---|
Kubeflow owner | Automatically assigned to the user who creates the Kubeflow instance. Can manage users, namespaces, and groups. At least one owner is required per instance (max 5 allowed). Requires IAM project manager/member privileges. |
Kubeflow user | Regular user who can manage their namespace or participate in groups via the dashboard. Must be registered by a Kubeflow Owner or IAM admin. Cannot access the console without IAM permissions. |
Kubeflow group user | User registered to a group, able to access the dashboard based on group permissions. Can be registered via console or dashboard by an Admin. Can belong to multiple groups with separate roles per group. Roles: Admin / Edit / View. |
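The owner-count rule (at least one owner, at most five per instance) can be sketched as a pair of guard functions. This is only an illustration of the constraint. The function names and user IDs are hypothetical and do not correspond to a KakaoCloud API.

```python
# Sketch of the owner-count rule: at least 1 and at most 5 owners per instance.
MIN_OWNERS = 1
MAX_OWNERS = 5


def add_owner(owners, user):
    """Return a new owner list with `user` added, enforcing the maximum."""
    if user in owners:
        return list(owners)
    if len(owners) >= MAX_OWNERS:
        raise ValueError(f"an instance can have at most {MAX_OWNERS} owners")
    return [*owners, user]


def remove_owner(owners, user):
    """Return a new owner list without `user`, enforcing the minimum."""
    remaining = [o for o in owners if o != user]
    if len(remaining) < MIN_OWNERS:
        raise ValueError(f"an instance needs at least {MIN_OWNERS} owner")
    return remaining


owners = ["alice"]                      # hypothetical creator of the instance
owners = add_owner(owners, "bob")       # hand-over: add the new owner first
owners = remove_owner(owners, "alice")  # then remove the previous one
print(owners)
```

Adding the replacement before removing the original owner keeps the instance from ever dropping below the minimum.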
Console permissions by role
Only the Kubeflow owner role has console access, and that access also requires IAM project permissions. See IAM for more details.
Console permissions | Kubeflow owner |
---|---|
View Kubeflow details | ✓ |
Request deletion | ✓ |
Manage owners | ✓ |
Manage users | ✓ |
Manage groups | ✓ |
Manage group users | ✓ |
Dashboard permissions by role
A user’s role is valid only within their assigned namespace.
Users may have multiple roles across different namespaces. See below for permissions by role.
Dashboard permission | Owner | User | Group Admin | Group Edit | Group View |
---|---|---|---|---|---|
View other namespaces | ✓ | | | | |
View own namespace | ✓ | ✓ | ✓ | ✓ | ✓ |
Manage group users | ✓ | | | | |
View notebooks | ✓ | ✓ | ✓ | ✓ | ✓ |
Create/delete/edit notebooks | ✓ | ✓ | ✓ | ✓ | |
View Tensorboards | ✓ | ✓ | ✓ | ✓ | ✓ |
Create/delete/edit Tensorboards | ✓ | ✓ | ✓ | ✓ | |
View pipelines | ✓ | ✓ | ✓ | ✓ | ✓ |
Create/delete/edit pipelines | ✓ | ✓ | ✓ | ✓ | |
View AutoML (Katib) | ✓ | ✓ | ✓ | ✓ | ✓ |
Create/delete/edit AutoML | ✓ | ✓ | ✓ | ✓ | |
View model serving (KServe) | ✓ | ✓ | ✓ | ✓ | ✓ |
Create/delete/edit model serving | ✓ | ✓ | ✓ | ✓ | |
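Because roles are scoped to a namespace, a permission check needs both the user's role in that namespace and the permission being requested. The sketch below encodes the table as printed above and adds a per-namespace lookup. The data layout, the shortened permission names, the `can` function, and the namespace names are assumptions for illustration only.

```python
ROLES = ("Owner", "User", "Group Admin", "Group Edit", "Group View")
ALL_ROLES = frozenset(ROLES)
EDITORS = ALL_ROLES - {"Group View"}  # every role except Group View

# Resource names are shortened here (for example "AutoML" for "AutoML (Katib)").
RESOURCES = ("notebooks", "Tensorboards", "pipelines", "AutoML", "model serving")

PERMISSIONS = {
    "View other namespaces": frozenset({"Owner"}),
    "View own namespace": ALL_ROLES,
    "Manage group users": frozenset({"Owner"}),
}
for resource in RESOURCES:
    PERMISSIONS[f"View {resource}"] = ALL_ROLES
    PERMISSIONS[f"Create/delete/edit {resource}"] = EDITORS


def can(assignments, namespace, permission):
    """Check a permission using the role the user holds in `namespace`."""
    role = assignments.get(namespace)
    return role is not None and role in PERMISSIONS[permission]


# Hypothetical user with different roles in two namespaces.
assignments = {"team-a": "Group Edit", "team-b": "Group View"}
print(can(assignments, "team-a", "Create/delete/edit pipelines"))
print(can(assignments, "team-b", "Create/delete/edit pipelines"))
```

A user with no role in a namespace gets no permissions there, which matches the rule that a role is valid only within its assigned namespace.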