Key concepts

Kubeflow on KakaoCloud is an open-source platform for building and running machine learning (ML) workflows in a cloud-native environment. Built on Kubernetes, it uses Kubernetes' cluster management capabilities to simplify the management of ML workflows.

It provides components for the main stages of an ML workflow, such as data preprocessing, model training, and model serving, so developers can build ML models through a consistent interface and high-level abstractions.

Resource structure

Image. KakaoCloud Kubeflow resource structure

| Item | Description |
| --- | --- |
| ⓵ Cloud | Runs in the KakaoCloud environment. Utilizes cloud infrastructure to simplify ML tasks. |
| ⓶ Kubernetes engine and cluster | Kubeflow is based on Kubernetes engine and cluster. Enables easier ML model development, deployment, and management. |
| ⓷ Kubeflow application | Provides essential functionality for developing, tuning, deploying, and managing ML models |
| ⓸ Kubeflow scaffolding | Provides functions necessary for ML model deployment and management |
| ⓹ Machine learning tools | Supports various ML tools. Users can choose preferred tools for ML tasks. |

ML workflow

Image. Kubeflow workflow

| Phase | Step | Description |
| --- | --- | --- |
| Experiment phase | Data collection and preprocessing | Collect and preprocess data for ML model training |
| Experiment phase | Data transformation | Transform data into a format that models can understand, reduce size, extract features |
| Experiment phase | Model coding | Write code to develop models based on selected algorithms |
| Experiment phase | Model training | Train models on data, try different hyperparameters and compare models |
| Experiment phase | Hyperparameter tuning | Adjust hyperparameters to find optimal model |
| Production phase | Model training and evaluation | Train the final model and evaluate performance |
| Production phase | Model deployment | Deploy trained models to provide prediction services |
| Production phase | Monitoring and management | Monitor performance and stability of models, respond to issues |
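
The "Model training" and "Hyperparameter tuning" steps of the experiment phase can be illustrated with a small, framework-free sketch. It trains one toy model per hyperparameter combination and keeps the one with the lowest validation error. This is only an illustration of the idea, not Katib's API: the function names, the toy data, and the grid values are all assumptions made for this example.

```python
from itertools import product


def train(xs, ys, lr, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on (xs, ys)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train_set, val_set, grid):
    """Train one model per hyperparameter combination and return the best trial."""
    trials = []
    for lr, epochs in product(grid["lr"], grid["epochs"]):
        w = train(*train_set, lr=lr, epochs=epochs)
        trials.append({"lr": lr, "epochs": epochs, "w": w,
                       "val_mse": mse(w, *val_set)})
    # The "optimal model" here is simply the trial with the lowest validation error.
    return min(trials, key=lambda t: t["val_mse"])


# Toy data generated from y = 3x (hypothetical, for illustration only).
train_set = ([1.0, 2.0, 3.0], [3.0, 6.0, 9.0])
val_set = ([4.0, 5.0], [12.0, 15.0])

best = grid_search(train_set, val_set,
                   {"lr": [0.001, 0.01, 0.1], "epochs": [10, 100]})
print(best["lr"], best["epochs"], round(best["w"], 4))
```

In Kubeflow, a tool such as Katib runs comparable trials in parallel on the cluster instead of in a local loop.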

Kubeflow lifecycle and status

Image. Kubeflow lifecycle

| Status | Description | Label |
| --- | --- | --- |
| Creating | Kubeflow resources are being created | Yellow |
| Active | Kubeflow resources have been successfully created and are active | Green |
| Failed | Creation failed or unexpected error occurred | Red |
| Expired | Resources expired or associated cluster deleted | Red |
| Terminating | Resources are being terminated | Yellow |
| Terminated | Resources have been terminated | Gray |
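
For scripts that need to reason about these statuses, the table can be mirrored as a small enumeration. The sketch below is one possible encoding, not part of any KakaoCloud SDK: the class name, attribute names, and the `in_progress` grouping are assumptions, while the status names and label colors come from the table above.

```python
from enum import Enum


class KubeflowStatus(Enum):
    """Lifecycle statuses and console label colors, as listed in the table."""

    CREATING = ("Creating", "Yellow")
    ACTIVE = ("Active", "Green")
    FAILED = ("Failed", "Red")
    EXPIRED = ("Expired", "Red")
    TERMINATING = ("Terminating", "Yellow")
    TERMINATED = ("Terminated", "Gray")

    def __init__(self, display_name, label_color):
        self.display_name = display_name
        self.label_color = label_color

    @property
    def in_progress(self):
        # Assumption: the "being created" and "being terminated" statuses
        # are transitional and will change without user action.
        return self in (KubeflowStatus.CREATING, KubeflowStatus.TERMINATING)


def statuses_with_label(color):
    """Return the display names of all statuses shown with the given label color."""
    return [s.display_name for s in KubeflowStatus if s.label_color == color]


print(statuses_with_label("Red"))  # ['Failed', 'Expired']
```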
Info: For information about the status of clusters, node pools, and nodes used with Kubeflow, refer to Kubernetes Engine > Resource status information.

Kubeflow user and group status

Image. Kubeflow user and group lifecycle

| Status | Description | Label |
| --- | --- | --- |
| Pending | User or group is being created/modified | Yellow |
| Active | Creation/modification completed successfully | Green |
| Failed | Failed due to error during user/group operations | Red |
| Deleted | Deletion completed successfully | Not displayed |

Kubeflow component architecture

Components in Kubeflow are small modular units that manage the ML workflow. They offer flexibility, scalability, and automation for tasks such as model development and deployment.

| Component type | Description |
| --- | --- |
| Dashboard | Web console for accessing Kubeflow components |
| JupyterLab | Web-based ML development tool integrated with Kubeflow SDK |
| Kubeflow Pipelines | Visual console for managing ML workflows like preprocessing, training, serving |
| Katib | Hyperparameter optimization tool for model training in distributed environments |
| KServe | Serves models trained in Kubeflow via REST API |
| Training Operator | Supports distributed training for frameworks like TensorFlow, PyTorch |
| Model Registry | Stores metadata for model registration and versioning, integrates with other components |
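
To make "serves models via REST API" concrete, the following sketch builds (but does not send) a prediction request in the shape of KServe's V1 inference protocol, which uses `POST /v1/models/<model_name>:predict` with an `instances` payload. The host name, model name, and input values are hypothetical, and the sketch assumes the model is exposed over the V1 protocol. Authentication and the actual endpoint address depend on the deployment and are not covered here.

```python
import json
from urllib.request import Request


def build_predict_request(base_url, model_name, instances):
    """Build a KServe V1-style predict request without sending it."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and model name, for illustration only.
req = build_predict_request(
    "http://kserve.example.com", "sklearn-iris", [[6.8, 2.8, 4.8, 1.4]]
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is left out because it needs a reachable inference service.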

Supported components by version and service type

Image. Components by service type

| Version | Service type | Components and versions |
| --- | --- | --- |
| 1.6.0 | Essential + Hyper Param Tuning (HPT) + Serving API | JupyterLab 3.2.9, KF Pipelines 2.0.0-alpha.5, Katib 0.15.0, Tensorboard 2.1.0, KServe 0.8.0 |
| 1.8.0 | Essential + Hyper Param Tuning (HPT) + Serving API | JupyterLab 4.2.1, KF Pipelines 2.0.5, Katib 0.16.0, Tensorboard 2.5.1, KServe 0.11.2 |
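
When pinning client libraries or notebook images, it can help to keep this version matrix in code. The sketch below copies the versions from the table above into a dictionary and adds a lookup helper. The dictionary and function names are assumptions made for this example, not part of any KakaoCloud tooling.

```python
# Component versions per Kubeflow version, copied from the table above.
COMPONENT_VERSIONS = {
    "1.6.0": {
        "JupyterLab": "3.2.9",
        "KF Pipelines": "2.0.0-alpha.5",
        "Katib": "0.15.0",
        "Tensorboard": "2.1.0",
        "KServe": "0.8.0",
    },
    "1.8.0": {
        "JupyterLab": "4.2.1",
        "KF Pipelines": "2.0.5",
        "Katib": "0.16.0",
        "Tensorboard": "2.5.1",
        "KServe": "0.11.2",
    },
}


def component_version(kubeflow_version, component):
    """Return the bundled version of a component, or raise ValueError if unknown."""
    try:
        return COMPONENT_VERSIONS[kubeflow_version][component]
    except KeyError:
        raise ValueError(
            f"no entry for {component!r} in Kubeflow {kubeflow_version!r}"
        ) from None


print(component_version("1.8.0", "KServe"))  # 0.11.2
```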

Manage Kubeflow roles

Kubeflow roles define the level of access and control over the console, dashboard, and namespaces.
Roles include Owner, User, and Group User. A single user may have multiple roles.

Role types at the Kubeflow level

Info: To become a Kubeflow Owner, IAM project member permissions are required.
Removing IAM permissions after becoming an Owner may prevent console and dashboard access.

| Role | Description |
| --- | --- |
| Kubeflow owner | Automatically assigned to the user who creates the Kubeflow instance. Can manage users, namespaces, and groups. At least one owner is required per instance (max 5 allowed). Requires IAM project manager/member privileges. |
| Kubeflow user | Regular user who can manage their namespace or participate in groups via the dashboard. Must be registered by a Kubeflow Owner or IAM admin. Cannot access the console without IAM permissions. |
| Kubeflow group user | User registered to a group, able to access the dashboard based on group permissions. Can be registered via console or dashboard by an Admin. Can belong to multiple groups with separate roles per group. Roles: Admin / Edit / View. |
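
Two of the rules above lend themselves to a small model: an instance needs at least one and at most five owners, and a group user holds a separate Admin, Edit, or View role in each group. The sketch below encodes just those rules. The class, method, user, and group names are hypothetical, and the sketch does not reflect how KakaoCloud enforces the rules internally.

```python
from dataclasses import dataclass, field

MAX_OWNERS = 5                            # "max 5 allowed" per instance
GROUP_ROLES = ("Admin", "Edit", "View")   # roles available to group users


@dataclass
class KubeflowAccess:
    """Toy model of owner and group-role assignments for one Kubeflow instance."""

    owners: set = field(default_factory=set)
    group_roles: dict = field(default_factory=dict)  # (user, group) -> role

    def add_owner(self, user):
        if user not in self.owners and len(self.owners) >= MAX_OWNERS:
            raise ValueError(f"an instance can have at most {MAX_OWNERS} owners")
        self.owners.add(user)

    def remove_owner(self, user):
        if self.owners == {user}:
            raise ValueError("at least one owner is required per instance")
        self.owners.discard(user)

    def set_group_role(self, user, group, role):
        if role not in GROUP_ROLES:
            raise ValueError(f"role must be one of {GROUP_ROLES}")
        self.group_roles[(user, group)] = role

    def roles_of(self, user):
        """Return a {group: role} mapping for the given user."""
        return {g: r for (u, g), r in self.group_roles.items() if u == user}


access = KubeflowAccess(owners={"alice"})
access.set_group_role("bob", "vision-team", "Admin")
access.set_group_role("bob", "nlp-team", "View")
print(access.roles_of("bob"))
```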

Console permissions by role

Info: Roles other than Owner do not have console access. IAM project permissions are required. See IAM for more details.

Console permissions | Kubeflow owner
View Kubeflow details
Request deletion
Manage owners
Manage users
Manage groups
Manage group users

Dashboard permissions by role

Info: A user's role is valid only within their assigned namespace.
Users may have multiple roles across different namespaces. See below for permissions by role.

Dashboard permission | Owner | User | Group Admin | Group Edit | Group View
View other namespaces
View own namespace
Manage group users
View notebooks
Create/delete/edit notebooks
View Tensorboards
Create/delete/edit Tensorboards
View pipelines
Create/delete/edit pipelines
View AutoML (Katib)
Create/delete/edit AutoML
View model serving (KServe)
Create/delete/edit model serving