
Setting up Kubernetes Engine cluster monitoring

This tutorial provides an example of setting up a monitoring environment for a Kubernetes cluster created in Kubernetes Engine. It explains how to receive cluster alerts via Slack using Alertmanager and monitor the cluster with Grafana dashboards.

Prework

Install kubectl

Install kubectl with Homebrew using the following command:

brew install kubectl

Install helm

Install Helm with Homebrew using the following command. Helm is a Kubernetes package manager that allows you to search for, share, and use software for Kubernetes.

brew install helm

Create Kubernetes cluster

For detailed instructions on setting up a Kubernetes environment using Kubernetes Engine, refer to the Setting up Kubernetes cluster with Kubernetes Engine document.

  1. Follow Step 1. Create Kubernetes cluster to create a cluster using Kubernetes Engine.

  2. Proceed with Step 2. Call Kubernetes API with kubectl.

Procedure

Step 1. Set up Slack alert

  1. Go to the Slack API site and click Create an app to create an app from scratch. Enter the app name and select the workspace during creation.

  2. On the app's details page, open the Incoming Webhooks tab on the left, enable Activate Incoming Webhooks, and click Add New Webhook to Workspace to select the channel that will receive the alerts.

  3. After adding the channel, copy the generated Webhook URL by clicking the [Copy] button.
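
Before putting the Webhook URL into the Alertmanager configuration, it can be useful to confirm that it posts to the channel. The following is a minimal sketch, not part of the original procedure. It assumes the copied URL has been exported as SLACK_WEBHOOK_URL; that variable name and the function name send_slack_test are chosen here for illustration. Slack incoming webhooks accept a JSON body with a text field.

```shell
# Hypothetical helper: post a test message to a Slack incoming webhook.
# SLACK_WEBHOOK_URL is an assumed variable holding the URL copied in step 3.
send_slack_test() {
  message="${1:-Test alert from the monitoring tutorial}"
  if [ -z "${SLACK_WEBHOOK_URL:-}" ]; then
    echo "SLACK_WEBHOOK_URL is not set" >&2
    return 1
  fi
  # Incoming webhooks expect a JSON payload with a "text" field.
  # The message must not contain double quotes, since it is not escaped here.
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data "{\"text\": \"${message}\"}" \
    "$SLACK_WEBHOOK_URL"
}
```

After exporting SLACK_WEBHOOK_URL, running send_slack_test "hello" should make the message appear in the channel selected in step 2.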

Step 2. Set up monitoring environment

Helm can be used to easily create and manage a Prometheus stack in a Kubernetes cluster. Modify the kube-prometheus-stack configuration file to deploy the Prometheus stack, Grafana dashboard, and Alertmanager in the Kubernetes cluster.

Customize configuration file
  1. Create a directory in your local environment for the example and move to that location.

    mkdir ~/k8se-monitor
    cd ~/k8se-monitor
  2. Download the pre-configured custom-values.yaml file. This file includes configurations for both Alertmanager and Grafana.

    curl -O https://raw.githubusercontent.com/kakaoenterprise/kc-handson-config/k8s-monitor/custom-values.yaml
  3. The alertmanager field of the downloaded custom-values.yaml file controls the Slack alert settings, such as the alert frequency, message format, and notification channel. Enter the channel name and the Webhook URL from Step 1 (Slack alert setup) into the channel and api_url fields of the configuration file.

    # Line 157, 158
    - channel: '#channel-name'
      api_url: https://hooks.slack.com/services/...s/...
    • If additional Slack alert configurations are required, refer to the slack_config section of the Alertmanager documentation and modify the alertmanager field accordingly.
  4. Add alert rules in the additionalPrometheusRulesMap field of the configuration file. The provided configuration file is set to trigger an alert when the cluster's memory usage exceeds 10%.

    • The example below adds a rule that triggers an alert when CPU usage exceeds 5%.
    # Line 97~
    additionalPrometheusRulesMap:
      rule-name:
        groups:
        - name: Kubernetes Cluster monitoring
          rules:
          - alert: Cluster Memory Usage Over 10%
            expr: sum (container_memory_working_set_bytes{kubernetes_io_hostname=~"^.*$"}) / sum (machine_memory_bytes{kubernetes_io_hostname=~"^.*$"}) * 100 > 10
            for: 1m
            annotations:
              title: "Cluster Memory Usage Over 10%"
              message: "Memory Usage: {{ $value }}"
            labels:
              severity: 'warning'
          #### Add Rule ####
          - alert: Cluster CPU Usage Over 5%
            expr: sum (rate (container_cpu_usage_seconds_total{kubernetes_io_hostname=~"^.*$"}[2m])) / sum (machine_cpu_cores{kubernetes_io_hostname=~"^.*$"}) * 100 > 5
            for: 1m
            annotations:
              title: "Cluster CPU Usage Over 5%"
              message: "CPU Usage: {{ $value }}"
            labels:
              severity: 'warning'
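
The rule expressions divide a usage total by a capacity total, multiply by 100, and compare the result to the threshold. The arithmetic can be reproduced by hand, as in the sketch below. The sample values (3 GiB of working-set memory on a cluster with 16 GiB in total) are made up for illustration, and usage_over_threshold is a hypothetical helper, not something shipped with the stack.

```shell
# Hypothetical helper mirroring the rule arithmetic: used / total * 100 > threshold.
# Prints the percentage and whether the rule condition holds.
usage_over_threshold() {
  awk -v used="$1" -v total="$2" -v limit="$3" 'BEGIN {
    pct = used / total * 100
    printf "%.2f %s\n", pct, (pct > limit ? "exceeds" : "within")
  }'
}

# Sample values (assumed): 3 GiB used of 16 GiB total, 10% threshold.
usage_over_threshold 3221225472 17179869184 10
# prints "18.75 exceeds"
```

Because both rules set for: 1m, the condition must hold continuously for one minute before the alert fires and is sent to Slack.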
Install kube-prometheus-stack

Apply the customized configuration file to install the kube-prometheus-stack.

helm repo add prometheus-community \
https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus prometheus-community/kube-prometheus-stack -f custom-values.yaml -n kube-system
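
After the install command returns, it is worth checking that the release was deployed and that its pods are starting. The sketch below assumes the release name prometheus and the kube-system namespace from the command above; check_release is a hypothetical helper name.

```shell
# Hypothetical helper: confirm the Helm release and list its pods.
# Defaults are taken from the install command: release "prometheus", namespace "kube-system".
check_release() {
  release="${1:-prometheus}"
  namespace="${2:-kube-system}"
  # Show the Helm release status; stop if the release cannot be found.
  helm status "$release" -n "$namespace" || return 1
  # List pods in the namespace whose names contain the release name.
  kubectl get pods -n "$namespace" | grep "$release"
}
```

Running check_release with no arguments should show the release status followed by the Prometheus, Alertmanager, and Grafana pods. Pods may take a short time to reach the Running state.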

Step 3. Verify Slack alert reception

Access the Slack channel and check if the alerts are properly received according to the set rules.

Step 4. Verify Grafana dashboard

You can visualize various cluster data in the Grafana dashboard.

Access Grafana dashboard

  1. Forward the port to access the Grafana dashboard. Use the kubectl command below to forward the local environment's port 30080 to the Grafana dashboard endpoint.

    kubectl port-forward svc/prometheus-grafana -n kube-system 30080:80 &
  2. Open a browser in your local environment and go to http://localhost:30080. If the stack is installed correctly, Grafana opens.

    The default Grafana account is as follows. Log in with these credentials.

    Key         Value
    username    admin
    password    prom-operator
  3. After successfully logging in, you will see the Grafana home screen.

Create Grafana data source

Within the Prometheus stack, prometheus-server collects node metrics from prometheus-node-exporter. In the Grafana dashboard, create a data source to query the metrics from prometheus-server. Refer to the table below to create the data source.

Key     Data
Name    k8se-tutorial
Url     http://prometheus-kube-prometheus-prometheus:9090

Data source creation
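
As an alternative to the UI, the same data source can be created through Grafana's HTTP API. This is a sketch under stated assumptions rather than part of the original procedure: it assumes the port-forward on localhost:30080 is still running, that the default admin credentials are unchanged, and that the data source is named k8se-tutorial, the name selected later when importing the dashboard. create_datasource is a hypothetical helper name.

```shell
# Hypothetical helper: create the Prometheus data source via POST /api/datasources.
# Assumes Grafana is reachable on localhost:30080 with the default admin account.
create_datasource() {
  name="${1:-k8se-tutorial}"
  url="${2:-http://prometheus-kube-prometheus-prometheus:9090}"
  # "access": "proxy" makes Grafana's backend query Prometheus, so the
  # in-cluster service name resolves even though the browser is local.
  curl -sS -X POST \
    -u 'admin:prom-operator' \
    -H 'Content-Type: application/json' \
    --data "{\"name\": \"${name}\", \"type\": \"prometheus\", \"url\": \"${url}\", \"access\": \"proxy\"}" \
    "http://localhost:30080/api/datasources"
}
```

Calling create_datasource with no arguments uses the values above. If a data source with the same name already exists, Grafana rejects the request, so use either the UI or this call, not both.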

Check Grafana dashboard

  1. Download the dashboard configuration file, dashboard.json, to your local environment.

    curl -O https://raw.githubusercontent.com/kakaoenterprise/kc-handson-config/k8s-monitor/dashboard.json
  2. In Grafana, click the Dashboards tab on the left, then open the Import Dashboard page. Click the [Upload JSON file] button and upload the dashboard.json file you downloaded. After uploading, select k8se-tutorial, the data source created in the Create Grafana data source step, as the Prometheus data source.

    Dashboard creation

  3. Through the Grafana dashboard, you can view detailed information about the cluster's memory usage, CPU usage, and more.