Setting up Kubernetes Engine cluster monitoring
This tutorial provides an example of setting up a monitoring environment for a Kubernetes cluster created in Kubernetes Engine. It explains how to receive cluster alerts via Slack using Alertmanager and monitor the cluster with Grafana dashboards.
- Estimated time: 30 minutes
- Recommended environment
  - Operating system: macOS, Ubuntu
  - Region: kr-central-1
- Prerequisites
  - Access key
  - A Slack account and workspace
Prework
Install kubectl
Install kubectl using the following command:
brew install kubectl
Install helm
Run the following command in the terminal to install Helm. Helm is a Kubernetes package manager that allows you to search for, share, and use software built for Kubernetes.
brew install helm
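Before moving on, it can help to confirm that both tools are actually on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name introduced here for illustration, not part of kubectl or Helm.

```shell
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools kubectl helm)
if [ -n "$missing" ]; then
  echo "Not installed:" $missing
else
  echo "kubectl and helm are available"
fi
```

If a tool is reported as missing, repeat the corresponding install step before continuing.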
Create Kubernetes cluster
For detailed instructions on setting up a Kubernetes environment using Kubernetes Engine, refer to the Setting up Kubernetes cluster with Kubernetes Engine document.
- Follow Step 1. Create Kubernetes cluster to create a cluster using Kubernetes Engine.
- Proceed with Step 2. Call Kubernetes API with kubectl.
Procedure
Step 1. Set up Slack alert
- Go to the Slack API site and click Create an app to create an app from scratch. Enter the app name and select the workspace during creation.
- On the app's details page, open the Incoming Webhooks tab on the left, enable Activate Incoming Webhooks, and click Add New Webhook to Workspace to add the channel where Slack will receive alerts.
- After adding the channel, click the [Copy] button to copy the generated Webhook URL.
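To confirm that the webhook works before wiring it into Alertmanager, you can post a test message to it with curl. The sketch below assumes the copied URL has been exported in a hypothetical environment variable named `SLACK_WEBHOOK_URL`; the helpers `is_slack_webhook_url` and `slack_payload` are illustrative names, and the payload builder does not escape quotes or backslashes in the message text.

```shell
# Return success only if the argument looks like a Slack incoming webhook URL.
is_slack_webhook_url() {
  case "$1" in
    https://hooks.slack.com/services/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Build the JSON body for a plain-text Slack message (no escaping is done).
slack_payload() {
  printf '{"text":"%s"}' "$1"
}

# SLACK_WEBHOOK_URL is a hypothetical variable holding the URL copied above.
if is_slack_webhook_url "${SLACK_WEBHOOK_URL:-}"; then
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "$(slack_payload 'Test message from the monitoring tutorial')" \
    "$SLACK_WEBHOOK_URL"
else
  echo "SLACK_WEBHOOK_URL is unset or does not look like a Slack webhook URL"
fi
```

If the request succeeds, the test message should appear in the channel you selected when creating the webhook.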
Step 2. Set up monitoring environment
Helm makes it easy to create and manage a Prometheus stack in a Kubernetes cluster. Modify the kube-prometheus-stack configuration file to deploy the Prometheus stack, Grafana dashboard, and Alertmanager in the Kubernetes cluster.
Customize configuration file
- Create a directory in your local environment for the example and move into it.
  mkdir ~/k8se-monitor
  cd ~/k8se-monitor
- Download the pre-configured custom-values.yaml file. This file includes configurations for both Alertmanager and Grafana.
  curl -O https://raw.githubusercontent.com/kakaoenterprise/kc-handson-config/k8s-monitor/custom-values.yaml
- You can modify the Slack alert settings, such as the alert frequency, message format, and notification channel, in the alertmanager field of the downloaded custom-values.yaml file. Enter the channel name and Webhook URL from Step 1 into the channel and api_url fields of the configuration file.
    # Line 157, 158
    - channel: '#channel-name'
      api_url: https://hooks.slack.com/services/...s/...
  - If additional Slack alert configurations are required, refer to the slack_config reference and modify the alertmanager field accordingly.
- In the additionalPrometheusRulesMap field of the configuration file, you can add alert rules. The provided configuration file is set to trigger an alert when the cluster's memory usage exceeds 10%.
  - To also trigger an alert when CPU usage exceeds 5%, add the rule marked "Add Rule" in the example below.
  # Line 97~
  additionalPrometheusRulesMap:
    rule-name:
      groups:
        - name: Kubernetes Cluster monitoring
          rules:
            - alert: Cluster Memory Usage Over 10%
              expr: sum (container_memory_working_set_bytes{kubernetes_io_hostname=~"^.*$"}) / sum (machine_memory_bytes{kubernetes_io_hostname=~"^.*$"}) * 100 > 10
              for: 1m
              annotations:
                title: "Cluster Memory Usage Over 10%"
                message: "Memory Usage: {{ $value }}"
              labels:
                severity: 'warning'
            #### Add Rule ####
            - alert: Cluster CPU Usage Over 5%
              expr: sum (rate (container_cpu_usage_seconds_total{kubernetes_io_hostname=~"^.*$"}[2m])) / sum (machine_cpu_cores{kubernetes_io_hostname=~"^.*$"}) * 100 > 5
              for: 1m
              annotations:
                title: "Cluster CPU Usage Over 5%"
                message: "CPU Usage: {{ $value }}"
              labels:
                severity: 'warning'
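Both rule expressions divide a summed usage metric by the summed capacity and multiply by 100, so each alert compares a percentage against its threshold. The arithmetic can be sketched with made-up numbers; `usage_percent` and `exceeds_threshold` are hypothetical helpers, and the byte counts are illustrative values, not measurements from a real cluster.

```shell
# Print used/total as a percentage with one decimal place.
usage_percent() {
  awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f", used / total * 100 }'
}

# Succeed (exit status 0) when the usage percentage is above the limit.
exceeds_threshold() {
  awk -v used="$1" -v total="$2" -v limit="$3" \
    'BEGIN { exit !(used / total * 100 > limit) }'
}

# Illustrative values: 2 GiB in use on a cluster with 16 GiB of memory.
used_bytes=2147483648
total_bytes=17179869184

echo "Memory usage: $(usage_percent "$used_bytes" "$total_bytes")%"
if exceeds_threshold "$used_bytes" "$total_bytes" 10; then
  echo "Above the 10% threshold: the memory rule condition holds"
fi
```

Because each rule sets `for: 1m`, the condition has to stay true for one minute before the alert fires, so a brief spike above the threshold does not send a notification.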
Install kube-prometheus-stack
Apply the customized configuration file to install kube-prometheus-stack.
helm repo add prometheus-community \
https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack -f custom-values.yaml -n kube-system
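After the install command returns, the stack's pods may still need some time to start. One way to check is sketched below, assuming the release name `prometheus` and the `kube-system` namespace used in the command above; `not_ready_pods` is a hypothetical helper that reads `kubectl get pods --no-headers` output and prints pods whose STATUS column is neither Running nor Completed. Note that it lists every pod in the namespace, not only those belonging to the chart.

```shell
# Print the names of pods whose STATUS (third column) is not Running or Completed.
# Expects `kubectl get pods --no-headers` output on stdin.
not_ready_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

if command -v kubectl >/dev/null 2>&1 && command -v helm >/dev/null 2>&1; then
  # Show the Helm release status, then list pods that are not up yet.
  helm status prometheus -n kube-system || echo "Release status unavailable"
  kubectl get pods -n kube-system --no-headers | not_ready_pods
else
  echo "kubectl and helm are required for this check"
fi
```

An empty pod list from the second command suggests that everything in the namespace has started.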
Step 3. Verify Slack alert reception
Open the Slack channel and check that alerts arrive according to the configured rules.
Step 4. Verify Grafana dashboard
You can visualize various cluster data in the Grafana dashboard.
Access Grafana dashboard
- Forward a port to access the Grafana dashboard. Use the kubectl command below to forward local port 30080 to the Grafana dashboard endpoint.
  kubectl port-forward svc/prometheus-grafana -n kube-system 30080:80 &
- Open a browser in your local environment and go to http://localhost:30080. If the stack was installed correctly, you will be directed to Grafana. The default Grafana account is as follows. Verify the account information and log in.
  Key | Value |
  ---|---|
  username | admin |
  password | prom-operator |
- After successfully logging in, you will see the Grafana home screen.
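If the browser cannot connect, it may help to check from the terminal whether Grafana is answering on the forwarded port. The sketch below queries Grafana's `/api/health` endpoint and assumes the port-forward from the first step is still running; `grafana_db_ok` is a hypothetical helper that looks for a `"database": "ok"` entry in the JSON response.

```shell
# Succeed when the health JSON on stdin reports "database": "ok".
grafana_db_ok() {
  grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}

if command -v curl >/dev/null 2>&1 &&
   curl -fsS --max-time 5 http://localhost:30080/api/health 2>/dev/null | grafana_db_ok; then
  echo "Grafana is reachable on port 30080"
else
  echo "Grafana is not reachable yet; check that the port-forward is still running"
fi
```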
Create Grafana data source
Within the Prometheus stack, prometheus-server collects node metrics from prometheus-node-exporter. In the Grafana dashboard, create a data source to query the metrics from prometheus-server. Refer to the table below to create the data source.
Key | Data |
---|---|
Name | k8se-tutorial |
URL | http://prometheus-kube-prometheus-prometheus:9090 |
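As an alternative to the UI, the data source can also be created through Grafana's HTTP API. This is a sketch under several assumptions: the port-forward on 30080 is running, the default admin credentials are unchanged, and the data source name is supplied in a hypothetical environment variable `DS_NAME`. `datasource_payload` is an illustrative helper that does not escape special characters.

```shell
# Build the JSON body for a Prometheus data source (no escaping is done).
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy"}' "$1" "$2"
}

# URL taken from the table above.
PROM_URL="http://prometheus-kube-prometheus-prometheus:9090"

# DS_NAME is a hypothetical variable; set it to the Name value from the table.
if [ -n "${DS_NAME:-}" ]; then
  curl -sS -u admin:prom-operator -H 'Content-Type: application/json' \
    -X POST --data "$(datasource_payload "$DS_NAME" "$PROM_URL")" \
    http://localhost:30080/api/datasources
else
  echo "Set DS_NAME to the data source name from the table above"
fi
```

The `access` value `proxy` asks Grafana to query Prometheus from the server side, which is what allows the in-cluster service name in the URL to resolve.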
Check Grafana dashboard
- Download the dashboard configuration file, dashboard.json, to your local environment.
  curl -O https://raw.githubusercontent.com/kakaoenterprise/kc-handson-config/k8s-monitor/dashboard.json
- In Grafana, click the Dashboards tab on the left, then navigate to the Import Dashboard page. Click the [Upload JSON file] button and upload the dashboard.json file you downloaded. After uploading, select the Prometheus data source: choose k8se-tutorial, created in the Create Grafana data source step.
- Through the Grafana dashboard, you can view detailed information about the cluster's memory usage, CPU usage, and more.