Key Concepts

Kubeflow on KakaoCloud is an open-source platform for building and running machine learning (ML) workflows in a cloud-native environment. Because it is built on Kubernetes, it uses Kubernetes' cluster management capabilities to simplify the management of ML workflows.

It provides key components for ML workflows such as data preprocessing, model training, and model serving, allowing developers to quickly and easily build ML models using a consistent interface and high-level abstractions.

Resource structure

Figure: KakaoCloud Kubeflow resource structure

Item | Description
⓵ Cloud | Runs in the KakaoCloud environment. Enables easier execution of machine learning tasks by leveraging cloud resources.
⓶ Kubernetes engine and cluster | Kubeflow is based on Kubernetes engine and cluster. Facilitates development, deployment, and management of ML models.
⓷ Kubeflow application | Provides essential features for developing, tuning, deploying, and managing ML models.
⓸ Kubeflow scaffolding | Supports deployment and management of ML models.
⓹ Machine learning tools | Kubeflow supports various ML tools. Users can choose their preferred tools for ML tasks.

ML workflow

Figure: Kubeflow workflow

Phase | Step | Description
Experiment phase | Data collection and preprocessing | Collect and preprocess data for ML model training
Experiment phase | Data transformation | Convert data into model-readable formats, reduce its size, and extract features for ML training
Experiment phase | Write model code | Write code to develop a model based on the selected ML algorithm
Experiment phase | Model training | Train the model on training data; modify hyperparameters to generate and compare model versions
Experiment phase | Hyperparameter tuning | Tune model hyperparameters to find the optimal configuration
Production phase | Train model and evaluate performance | Train the selected model and evaluate its performance
Production phase | Deploy model | Deploy the trained model to provide prediction services
Production phase | Monitor and manage model | Monitor model performance and stability; respond to issues when needed
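
The "Model training" and "Hyperparameter tuning" steps above amount to producing several model versions with different hyperparameters and comparing them. The following is a minimal sketch of that idea as a plain grid search. It is not Katib's API. The `validation_loss` function, its formula, and the search space values are hypothetical stand-ins chosen so the example runs without any ML framework.

```python
from itertools import product

# Hypothetical stand-in for "train a model and return its validation loss".
# The formula is invented for illustration only.
def validation_loss(learning_rate: float, batch_size: int) -> float:
    return (learning_rate - 0.01) ** 2 + abs(batch_size - 64) / 1000

# Hypothetical search space: each combination is one model version (trial).
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

def grid_search(space):
    """Evaluate every combination and return (best_loss, best_params)."""
    names = list(space)
    trials = []
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((validation_loss(**params), params))
    # Compare the model versions: the lowest validation loss wins.
    return min(trials, key=lambda trial: trial[0])

best_loss, best_params = grid_search(search_space)
print(best_params)
```

In practice a tuning component would run each trial as a real training job, but the compare-and-select loop has the same shape.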

Kubeflow lifecycle and status

Figure: Kubeflow lifecycle

Status | Description | Category
Creating | Creating Kubeflow resource | Yellow
Active | Kubeflow resource is active | Green
Failed | Resource creation failed or unexpected error occurred | Red
Expired | Resource expired or associated cluster deleted | Red
Terminating | Resource is being terminated | Yellow
Terminated | Resource has been terminated (deleted) | Gray
Info: For the status of connected clusters, node pools, and nodes, refer to Kubernetes engine > Resource status information.
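
The lifecycle status table can be mirrored in code when a script needs to group or filter Kubeflow resources by status. The sketch below only encodes the statuses and color categories listed above. The names `KubeflowStatus`, `STATUS_COLOR`, and `statuses_with_color` are hypothetical, and no KakaoCloud API is called.

```python
from enum import Enum

class KubeflowStatus(Enum):
    CREATING = "Creating"
    ACTIVE = "Active"
    FAILED = "Failed"
    EXPIRED = "Expired"
    TERMINATING = "Terminating"
    TERMINATED = "Terminated"

# Color category for each status, copied from the status table.
STATUS_COLOR = {
    KubeflowStatus.CREATING: "Yellow",
    KubeflowStatus.ACTIVE: "Green",
    KubeflowStatus.FAILED: "Red",
    KubeflowStatus.EXPIRED: "Red",
    KubeflowStatus.TERMINATING: "Yellow",
    KubeflowStatus.TERMINATED: "Gray",
}

def statuses_with_color(color: str) -> list:
    """Return the status names in a color category, in table order."""
    return [s.value for s in KubeflowStatus if STATUS_COLOR[s] == color]

print(statuses_with_color("Red"))     # ['Failed', 'Expired']
print(statuses_with_color("Yellow"))  # ['Creating', 'Terminating']
```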

Kubeflow user and group status

Figure: Kubeflow user/group lifecycle

Status | Description | Category
Pending | Creating/updating user or group | Yellow
Active | Creation/update completed successfully | Green
Failed | Creation/update failed or unexpected error occurred | Red
Deleted | Deletion completed successfully | Not displayed

Kubeflow component architecture

Kubeflow components are modular and essential for managing ML workflows. They enhance flexibility and scalability and automate tasks for ML model development and deployment.

Component type | Description
Dashboard | Web console to access Kubeflow components
JupyterLab | Web-based ML development tool integrated with the Kubeflow SDK
Kubeflow Pipelines | Visual console to manage ML workflows such as preprocessing, training, and serving
Katib | Hyperparameter tuning for model training. Supports distributed training and optimal model discovery
KServe | Model deployment and inference component. Supports model serving and REST API-based inference
Trainer | Model training component supporting distributed learning with frameworks like TensorFlow and PyTorch
Model registry | Model registration and versioning. Stores metadata and integrates with other components
Spark operator | Declarative execution of Apache Spark applications. Automates spark-submit and supports scheduling, retries, and monitoring
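
To make "REST API-based inference" concrete, here is a minimal sketch of building a prediction request in the shape of the KServe V1 inference protocol (`POST /v1/models/<model_name>:predict` with an `instances` body). It assumes the deployed model exposes the V1 protocol. The host `kserve.example.com`, the model name `my-model`, and the input values are hypothetical. The code only constructs the request and does not send it.

```python
import json
from urllib.request import Request

def build_predict_request(base_url: str, model_name: str, instances: list) -> Request:
    """Build (but do not send) a KServe V1-style predict request."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint, model name, and input row.
req = build_predict_request("http://kserve.example.com", "my-model", [[1.0, 2.0, 3.0]])
print(req.get_method(), req.full_url)
```

Sending the request, for example with `urllib.request.urlopen(req)`, would additionally require whatever authentication and ingress host your environment uses, which this page does not specify.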

Supported components by version and service type

Version | Service type | Components and versions
1.8.0 | Essential + Hyper param tuning (HPT) + Serving API | JupyterLab 4.2.1, KF Pipelines 2.0.5, Trainer v1-855e096, Katib 0.16.0, Tensorboard 2.5.1, KServe 0.11.2, Model registry 0.2.5-alpha
1.10.0 | Essential + Hyper param tuning (HPT) + Serving API | JupyterLab 4.3.5, KF Pipelines 2.4.1, Trainer v1-3f15cb, Katib 0.18.0, Tensorboard 2.5.1, KServe 0.15.0, Model registry 0.2.19, Spark operator 2.1.0 (supports Spark 2.3 and above)
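
When planning a move between Kubeflow versions, it can help to see which component versions differ. The sketch below copies the version table into a dictionary and compares two entries. The names `SUPPORTED_COMPONENTS` and `changed_components` are hypothetical, and the data is only as current as the table above.

```python
# Component versions per Kubeflow version, copied from the table above.
SUPPORTED_COMPONENTS = {
    "1.8.0": {
        "JupyterLab": "4.2.1",
        "KF Pipelines": "2.0.5",
        "Trainer": "v1-855e096",
        "Katib": "0.16.0",
        "Tensorboard": "2.5.1",
        "KServe": "0.11.2",
        "Model registry": "0.2.5-alpha",
    },
    "1.10.0": {
        "JupyterLab": "4.3.5",
        "KF Pipelines": "2.4.1",
        "Trainer": "v1-3f15cb",
        "Katib": "0.18.0",
        "Tensorboard": "2.5.1",
        "KServe": "0.15.0",
        "Model registry": "0.2.19",
        "Spark operator": "2.1.0",
    },
}

def changed_components(old: str, new: str) -> dict:
    """Map each component that differs to (old_version, new_version).

    A component missing from the old version is reported with None.
    """
    before, after = SUPPORTED_COMPONENTS[old], SUPPORTED_COMPONENTS[new]
    return {
        name: (before.get(name), version)
        for name, version in after.items()
        if before.get(name) != version
    }

diff = changed_components("1.8.0", "1.10.0")
for name, (old_version, new_version) in diff.items():
    print(f"{name}: {old_version} -> {new_version}")
```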

Manage Kubeflow roles

Kubeflow roles grant different levels of access to the console, the dashboard, and namespaces.
The roles are Owner, User, and Group user. A single user may hold multiple roles.

Role types at the Kubeflow level

Info: To obtain the Owner role, you must have at least IAM project member permission.
If IAM project access is revoked after a user is assigned as Owner, that user may lose console and dashboard access.

Role | Description
Kubeflow owner | Highest-level role, automatically assigned to the user who creates Kubeflow. Manages users, namespaces, and groups via the console. Each Kubeflow must have at least one owner (up to 5 owners can be assigned). Requires project admin/member IAM permission.
Kubeflow user | Standard user who can manage owned namespaces and participate in groups. Must be registered by an Owner or Org Admin via the console. Has no console access; can use the dashboard only. Can be promoted to Owner if they own a namespace and have IAM permissions.
Kubeflow group user | User added to a group; uses the dashboard based on group permissions. Registered by an Owner, Org Admin, or group Admin via the dashboard. Can belong to multiple groups. Role types: Admin / Edit / View.
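
The owner rules above (at least one owner, at most five, each with project admin or member IAM permission) can be expressed as a small check. This is a sketch under stated assumptions: `validate_owners`, the constant names, and the IAM role strings are hypothetical and do not correspond to any KakaoCloud API.

```python
MIN_OWNERS = 1
MAX_OWNERS = 5
# Hypothetical labels for the IAM permissions the Owner role requires.
OWNER_IAM_ROLES = {"project admin", "project member"}

def validate_owners(owners: dict) -> list:
    """Check an owner list against the documented rules.

    `owners` maps a user name to that user's IAM role label.
    Returns a list of problems; an empty list means the set is valid.
    """
    problems = []
    if len(owners) < MIN_OWNERS:
        problems.append("at least one owner is required")
    if len(owners) > MAX_OWNERS:
        problems.append(f"at most {MAX_OWNERS} owners can be assigned")
    for user, iam_role in owners.items():
        if iam_role not in OWNER_IAM_ROLES:
            problems.append(f"{user} lacks project admin/member IAM permission")
    return problems

print(validate_owners({"alice": "project admin"}))  # []
print(validate_owners({}))                          # ['at least one owner is required']
```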

Console permissions by role

Info: Only Owners have console access; all other roles have none. Console access also requires IAM project permissions.
Refer to IAM for more information.

Console permission | Kubeflow owner
View Kubeflow details
Request Kubeflow deletion
Add/edit/delete owner
Add/edit/delete user
Create/edit/delete group
Add/edit/delete group user

Dashboard permissions by role

Info: User roles are valid within assigned namespaces.
Users may hold multiple roles across namespaces. See the table below for role-based dashboard permissions.

Dashboard permission | Owner | User | Group admin | Group edit | Group view
View other namespaces
View own namespace
Manage group users
View notebooks
Create/delete/edit notebooks
View tensorboard
Create/delete/edit tensorboard
View pipelines
Create/delete/edit pipelines
View AutoML (Katib)
Create/delete/edit AutoML (Katib)
View model serving (KServe)
Create/delete/edit model serving (KServe)
View model registry
Create/delete/edit model registry