Kubeflow overview
Kubeflow on KakaoCloud is an open-source platform, built on Kubernetes Engine clusters, for building and running machine learning workflows in a cloud-native environment. It supports a range of ML libraries and frameworks that accelerate training, and it provides ML features for running workloads on Kubernetes in a simple, portable, and scalable way. It also relies on Kubernetes cluster management and scaling to keep ML workflows stable as they grow.
- Kubeflow: A portmanteau of Kubernetes + ML flow, it is an open-source platform for building and deploying machine learning pipelines. See the Kubeflow official documentation for more information.
- Pipeline: In machine learning, a pipeline breaks the ML process into ordered stages, such as preprocessing and model training, and connects them so that each stage hands its result to the next.
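To make the pipeline idea concrete, here is a minimal sketch in plain Python. It does not use the Kubeflow Pipelines SDK; the stage functions (`preprocess`, `train`, `evaluate`) and the `run_pipeline` helper are hypothetical names chosen for illustration, and the "model" is deliberately trivial. It only shows how stages are connected so that each one consumes the previous stage's output.

```python
from typing import Any, Callable

# A stage is any function that takes the previous stage's output.
Stage = Callable[[Any], Any]


def preprocess(rows: list[float]) -> list[float]:
    """Scale raw values into the 0..1 range (hypothetical preprocessing step)."""
    lo, hi = min(rows), max(rows)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return [(r - lo) / span for r in rows]


def train(features: list[float]) -> dict[str, float]:
    """'Train' a toy model that simply remembers the mean of the features."""
    return {"mean": sum(features) / len(features)}


def evaluate(model: dict[str, float]) -> dict[str, float]:
    """Attach a toy quality score to the model."""
    return {**model, "score": 1.0 - abs(model["mean"] - 0.5)}


def run_pipeline(stages: list[Stage], data: Any) -> Any:
    """Run the stages in order, feeding each output into the next stage."""
    for stage in stages:
        data = stage(data)
    return data


result = run_pipeline([preprocess, train, evaluate], [2.0, 4.0, 6.0])
print(result)
```

In Kubeflow, each stage would typically run as its own containerized step on the cluster rather than as an in-process function call, but the ordering and data hand-off follow the same shape.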
Purpose and use cases
Training machine learning models involves multiple stages. When datasets exceed hundreds of GB, the process becomes more complex and time-consuming. Without automation, users must set up Kubernetes clusters and manually install and manage ML frameworks.
KakaoCloud's Kubeflow service simplifies these processes and enables easier and more efficient management of ML workloads. It enhances ML development productivity and allows for faster model development and deployment.
Kubeflow is suitable for use in domains where machine learning is essential, such as natural language processing, image processing, and recommendation systems.
Target users
Kubeflow can be used by a wide range of users including ML engineers, data scientists, developers, infrastructure engineers, and IT administrators.
| Target user | Description |
|---|---|
| Data scientists | Supports quick and easy ML modeling. Users with experience in data analysis and modeling can build models more conveniently. |
| ML engineers | Helps easily set up and manage ML frameworks and infrastructure. |
| Data engineers | Facilitates big data processing and storage. Users with experience in these areas can simplify their data workflows using Kubeflow. |
| Cloud infrastructure engineers | Helps build and manage ML workflows on Kubernetes clusters. |
Features
Kubeflow is a free, Kubernetes-based open-source ML platform that provides tools for distributed ML tasks and supports setting up ML workflows in various environments. It allows easy scaling and management of ML tasks, supports various ML libraries and frameworks, and includes essential components for building workflows.
Kubernetes-based
- Built on Kubernetes Engine
- Supports resource management and features provided by Kubernetes
Support for various ML libraries and frameworks
- Supports TensorFlow, PyTorch, XGBoost, and more
- Provides a platform for easy deployment and management of frameworks
- Lets you configure and run workflows with Kubeflow Pipelines
Automated ML workflows
- Automates the ML workflow for faster development and deployment
- Provides automated steps for data preprocessing, model training, and deployment
Efficiency and scalability
- Supports rapid and stable deployment and scaling in cloud environments
- Uses standard methods to define each workflow stage, automates the ML pipeline, and enhances development/operational efficiency
- All components are open-source, allowing customization to optimize workflows
Resource security and role-based access control
- Assigns namespaces and resource quotas based on role and responsibility
- Groups such as kbm-g can be used to collectively manage permissions by role and member
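As a rough illustration of what a per-namespace resource quota looks like at the Kubernetes level, the sketch below builds a standard `ResourceQuota` manifest as a Python dictionary and prints it as JSON. The `resource_quota` helper, the namespace name `team-a`, and the quota values are all assumptions made for this example. How KakaoCloud creates and applies quotas on your behalf is not described here, so treat this only as a picture of the underlying Kubernetes object.

```python
import json


def resource_quota(namespace: str, cpu: str, memory: str, gpus: int) -> dict:
    """Build a Kubernetes ResourceQuota manifest for one namespace.

    The helper name and the chosen limits are hypothetical; the manifest
    fields themselves follow the core v1 ResourceQuota schema.
    """
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,        # total CPU the namespace may request
                "requests.memory": memory,  # total memory the namespace may request
                "requests.nvidia.com/gpu": str(gpus),  # extended-resource quota
            }
        },
    }


# Hypothetical namespace and limits, for illustration only.
manifest = resource_quota("team-a", cpu="16", memory="64Gi", gpus=2)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output could in principle be saved to a file and applied to a cluster where you have permission to manage quotas.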
Getting started
For detailed usage instructions, refer to the How-to guides.
If you're new to KakaoCloud, visit Getting started with KakaoCloud.