
Kubeflow overview

Beta

Kubeflow on KakaoCloud is an open-source platform for building and running machine learning (ML) workflows in a cloud-native environment on Kubernetes Engine clusters. It supports a range of ML libraries and frameworks that accelerate training, and provides ML features that make workloads on Kubernetes simple to run, portable, and scalable. It also relies on Kubernetes cluster management and scaling to keep ML workflows stable as they grow.

Terminology
  • Kubeflow: A portmanteau of Kubernetes + ML flow, it is an open-source platform for building and deploying machine learning pipelines. See the Kubeflow official documentation for more information.
  • Pipeline: In machine learning, a pipeline is a sequence of connected stages, such as preprocessing and model training, that organizes the ML process step by step so that the output of each stage feeds the next.


Purpose and use cases

Training machine learning models involves multiple stages. When datasets exceed hundreds of GB, the process becomes more complex and time-consuming. Without automation, users must set up Kubernetes clusters and manually install and manage ML frameworks.

KakaoCloud's Kubeflow service simplifies these processes and enables easier and more efficient management of ML workloads. It enhances ML development productivity and allows for faster model development and deployment.
Kubeflow is well suited to domains where machine learning is essential, such as natural language processing, image processing, and recommendation systems.

Target users

Kubeflow can be used by a wide range of users including ML engineers, data scientists, developers, infrastructure engineers, and IT administrators.

| Target user | Description |
| --- | --- |
| Data scientists | Supports quick and easy ML modeling. Users with experience in data analysis and modeling can build models more conveniently. |
| ML engineers | Helps easily set up and manage ML frameworks and infrastructure. |
| Data engineers | Facilitates big data processing and storage. Users with experience in these areas can simplify their data workflows using Kubeflow. |
| Cloud infrastructure engineers | Helps build and manage ML workflows on Kubernetes clusters. |

Features

Kubeflow is a free, Kubernetes-based open-source ML platform that provides tools for distributed ML tasks and supports setting up ML workflows in various environments. It allows easy scaling and management of ML tasks, supports various ML libraries and frameworks, and includes essential components for building workflows.

Kubernetes-based

  • Built on Kubernetes Engine
  • Supports resource management and features provided by Kubernetes

Support for various ML libraries and frameworks

  • Supports TensorFlow, PyTorch, XGBoost, and more
  • Provides a platform for easy deployment and management of frameworks
  • Lets you configure and run workflows with Kubeflow Pipelines

Automated ML workflows

  • Automates the ML workflow for faster development and deployment
  • Provides automated steps for data preprocessing, model training, and deployment
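The automated steps listed above (data preprocessing, model training, deployment) can be pictured as functions chained so that each stage consumes the previous stage's output. The following is a minimal plain-Python sketch of that idea only. It does not use the Kubeflow Pipelines SDK, and the stage functions, the `run_pipeline` helper, and the toy data are all hypothetical names introduced for illustration. In an actual Kubeflow pipeline, each stage would typically run as its own containerized step.

```python
from typing import Any, Callable, List


def preprocess(raw: List[float]) -> List[float]:
    """Hypothetical preprocessing stage: min-max scale values into [0, 1]."""
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [(x - lo) / span for x in raw]


def train(features: List[float]) -> dict:
    """Hypothetical training stage: the 'model' is just the feature mean."""
    return {"mean": sum(features) / len(features)}


def deploy(model: dict) -> str:
    """Hypothetical deployment stage: report what would be served."""
    return f"deployed model with mean={model['mean']:.2f}"


def run_pipeline(data: Any, stages: List[Callable[[Any], Any]]) -> Any:
    """Run the stages in order, passing each output to the next stage."""
    result = data
    for stage in stages:
        result = stage(result)
    return result


# The pipeline definition is just the ordered list of stages.
STAGES = [preprocess, train, deploy]
message = run_pipeline([2.0, 4.0, 6.0], STAGES)
print(message)
```

Because the pipeline is defined as an ordered list of stages, swapping or adding a stage changes only that list, which is the property that makes automated reruns practical.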

Efficiency and scalability

  • Supports rapid and stable deployment and scaling in cloud environments
  • Uses standard methods to define each workflow stage, automates the ML pipeline, and improves development and operational efficiency
  • All components are open-source, allowing customization to optimize workflows

Resource security and role-based access control

  • Assigns namespaces and resource quotas based on role and responsibility
  • Uses groups such as kbm-g to manage permissions collectively by role and member
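To make the idea of per-namespace resource quotas concrete, the sketch below builds a standard Kubernetes `ResourceQuota` manifest as a Python dictionary and prints it as JSON. It is an assumption about what such a quota could look like, not a description of how KakaoCloud provisions quotas internally. The namespace `team-nlp`, the helper `build_resource_quota`, and the limit values are hypothetical.

```python
import json


def build_resource_quota(namespace: str, cpu: str, memory: str, gpus: int) -> dict:
    """Build a Kubernetes ResourceQuota manifest for one namespace.

    The limit values are illustrative; real limits depend on the role
    and responsibility assigned to the namespace's owners.
    """
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,        # total CPU requests allowed
                "requests.memory": memory,  # total memory requests allowed
                # Extended resources such as GPUs are limited via the
                # "requests." prefix.
                "requests.nvidia.com/gpu": str(gpus),
            }
        },
    }


# Hypothetical namespace and limits, for illustration only.
quota = build_resource_quota("team-nlp", cpu="16", memory="64Gi", gpus=2)
print(json.dumps(quota, indent=2))
```

The printed JSON is a valid manifest format for `kubectl apply -f`, although in a managed service quotas may instead be set through the console.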

Getting started

For detailed usage instructions, refer to the How-to guides.
If you're new to KakaoCloud, visit Getting started with KakaoCloud.