Monitoring Troubleshooting
This document outlines common issues and solutions related to the Monitoring service.
Monitoring agent errors
Not found file in path. Please check path and permission
After installing the monitoring agent, the following error may appear when checking logs with the sudo journalctl -u kic_monitor_agent -f command:
Jul 18 09:15:35 host-172-16-2-147 kic_monitor_agent[10046]: 2024-07-18T09:15:35Z W! [inputs.tail] Not found file in path. Please Check path and permission: {specified log file}
This error occurs due to issues with the log file path or permissions. The main causes are as follows:
Cause 1: Incorrect log file path or name
▶️ Solution: Enter the correct file path and name in KIC_LOG_FILE_PATH in /etc/default/kic_monitor_agent, then restart the monitoring agent.
- Modify the log file path in /etc/default/kic_monitor_agent.
# Update the log file path accessible by the Log Explorer
KIC_LOG_FILE_PATH=""
- Restart the monitoring agent.
sudo systemctl restart kic_monitor_agent
- Check whether the agent is running normally.
sudo journalctl -u kic_monitor_agent -f
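Before restarting, it can help to confirm that the configured path points at a file that exists and is readable. The following is a minimal sketch, assuming the environment file contains a plain KIC_LOG_FILE_PATH="..." line as shown above and that the value is a single file path. The function name check_log_path is hypothetical and not part of the agent.

```shell
# Hypothetical helper: reads KIC_LOG_FILE_PATH from an environment file and
# reports whether the file exists and is readable by the current user.
check_log_path() {
  env_file="${1:-/etc/default/kic_monitor_agent}"
  # Take the last KIC_LOG_FILE_PATH assignment and strip surrounding double quotes.
  log_path=$(sed -n 's/^KIC_LOG_FILE_PATH="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$env_file" | tail -n 1)
  if [ -z "$log_path" ]; then
    echo "KIC_LOG_FILE_PATH is empty or not set in $env_file"
    return 1
  elif [ ! -f "$log_path" ]; then
    echo "not found: $log_path"
    return 1
  elif [ ! -r "$log_path" ]; then
    echo "not readable: $log_path"
    return 1
  fi
  echo "ok: $log_path"
}
```

The check runs as the user who calls it, so a result of ok does not guarantee that the agent's service user can read the file.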
Cause 2: The service user was manually changed after installing the agent
If the default service user (root) was changed after installation, permission issues may occur. There are two solutions:
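▶️ Solution 1: Change the service user of the monitoring agent to root in the [Service] section of the agent's systemd unit file, then run sudo systemctl daemon-reload and restart the agent.
[Service]
EnvironmentFile=-/etc/default/kic_monitor_agent
# Change the service user to root
User=root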
▶️ Solution 2: Grant read permission to the log file if the service user lacks access
# Grant read permission to user, group, and others
sudo chmod 444 {log file}
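Note that mode 444 sets all permission bits outright, so it also removes write permission from the file's owner. If the application that writes the log runs as a non-root user, adding read permission without touching the other bits may be a safer choice. A minimal sketch, with hypothetical helper names grant_read and show_mode, and assuming GNU stat is available:

```shell
# Hypothetical helper: add read permission for user, group, and others,
# leaving the existing write and execute bits unchanged.
grant_read() {
  chmod a+r -- "$1"
}

# Hypothetical helper: print the resulting mode in octal (GNU stat).
show_mode() {
  stat -c '%a' -- "$1"
}
```

Run the helpers with sufficient privileges (for example, as root) if the file belongs to another user. The service user also needs execute (search) permission on every parent directory of the log file.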
Collection took longer than expected; not complete after interval of 10s
If the following warning appears on an instance where the monitoring agent is installed, an input (here, inputs.disk) did not finish collecting data within the configured collection interval:
Apr 01 11:03:20 ${affected-instance} kic_monitor_agent[52839]: 2024-04-01T02:03:20Z W! [inputs.disk] Collection took longer than expected; not complete after interval of 10s
▶️ Solution: Increase the data collection interval to a value greater than the default (10s).
- In /etc/kic_monitor_agent/kic_monitor_agent.conf, change the interval setting and save the file.
[agent]
## Default data collection interval for all inputs
interval = "30s" # must be greater than default (10s)
- Restart the monitoring agent.
sudo systemctl restart kic_monitor_agent
- Check if the agent is running properly.
sudo journalctl -u kic_monitor_agent -f
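To judge whether the new interval is long enough, it may help to count how often each input still reports the warning. The following is a rough sketch that assumes the log lines keep the format shown above. The function name count_slow_collections is hypothetical.

```shell
# Hypothetical helper: reads agent log lines on stdin and prints, per input
# plugin, how many "Collection took longer than expected" warnings appear.
count_slow_collections() {
  sed -n 's/.*\[\(inputs\.[^]]*\)\] Collection took longer than expected.*/\1/p' \
    | sort \
    | uniq -c
}

# Example usage: check the warnings logged during the last hour.
# sudo journalctl -u kic_monitor_agent --since "1 hour ago" --no-pager | count_slow_collections
```

If an input keeps appearing after the restart, the interval is probably still too short for that input.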
Errors when using Metric Export API
Default Error Page: Page not found
This error page may be returned if the Metric Export API is called from an unsupported environment or with invalid credentials:
<html>
<head>
<meta charset="UTF-8">
<title>
Default Error Page
</title>
</head>
<body>
<div style="text-align:center">
<h1 className="header">Page not found.</h1>
<p>
<span style="color: gray;">
The requested address may have been changed or deleted, or is incorrect.
</span>
<br/>
<span style="color: gray;">Please check if the URL is correct.</span>
</p>
</div>
</body>
</html>
Cause 1: The client is neither a VM with a public IP nor a VM in a VPC with an Internet Gateway
▶️ Solution: Ensure that the client calling the Metric Export API is either a VM with a public IP or a VM created in a VPC with an Internet Gateway.
Cause 2: User not part of the project
▶️ Solution: Invite the user as a member of the project to which the Metric Export API client belongs.
Cause 3: Invalid or expired access key
▶️ Solution: Ensure the access key ID and secret access key are entered correctly, and verify whether the key has expired.
You can manage access keys via the Access Key menu in the top-right profile menu of the KakaoCloud Console. For details, refer to the Get access key guide.
not supported kakaocloud service-type header.
This error occurs when the service-type header value is invalid.
{"status":"error","errorType":"bad_data","error":"not supported kakaocloud service-type header. current header value is {invalid service-type}"}
▶️ Solution: Make sure the service-type header is entered correctly.
curl -X GET 'https://monitoring.kr-central-2.kakaocloud.com/metric-export/grafana/{PROJECT_ID}/prometheus/api/' \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Credential-ID: {ACCESS_KEY_ID}" \
-H "Credential-Secret: {ACCESS_KEY_SECRET}" \
-H "service-type: server"
Supported service-type values:
- Virtual Machine, GPU, Bare Metal Server: server
- MySQL: mysql
- MemStore: memstore
- Load Balancing: lb
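A script that calls the API can check the header value against the list above before sending the request. This is a minimal sketch that assumes the values must match exactly, in lowercase, as listed. The function name is_supported_service_type and the SERVICE_TYPE variable are hypothetical.

```shell
# Hypothetical helper: succeeds only for the service-type values listed above.
is_supported_service_type() {
  case "$1" in
    server|mysql|memstore|lb) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage: stop early instead of sending a request that will be rejected.
SERVICE_TYPE="server"
if ! is_supported_service_type "$SERVICE_TYPE"; then
  echo "unsupported service-type: $SERVICE_TYPE" >&2
fi
```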
No result in response (data not available)
The following response, with an empty result, may be returned if the queried metric doesn't match the service-type header or if no data has been collected for the instance:
{"status":"success","data":{"resultType":"vector","result":[]}}
Cause 1: Metric does not match the specified service-type
Incorrect example:
curl -vvv -X POST 'https://monitoring.kr-central-2.kakaocloud.com/metric-export/grafana/{PROJECT_ID}/prometheus/api/v1/query' \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Credential-ID: {ACCESS_KEY_ID}" \
-H "Credential-Secret: {ACCESS_KEY_SECRET}" \
-H "service-type: server" \
-d "query=lb_bytes_in_persec"
▶️ Solution: Use a metric appropriate for the service-type. Refer to the Monitoring Metrics documentation.
curl -vvv -X POST 'https://monitoring.kr-central-2.kakaocloud.com/metric-export/grafana/{PROJECT_ID}/prometheus/api/v1/query' \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Credential-ID: {ACCESS_KEY_ID}" \
-H "Credential-Secret: {ACCESS_KEY_SECRET}" \
-H "service-type: server" \
-d "query=cpu_usage"
Cause 2: Monitoring agent not installed on the instance
If metrics cannot be retrieved for a specific instance in Grafana, it may be because the monitoring agent is not installed on that instance.
▶️ Solution: Refer to the Install monitoring agent guide to install the agent and check the metrics again.
For Load Balancing, MySQL, and MemStore services, metrics can be checked without installing the monitoring agent.
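The three Metric Export API responses described in this document can be told apart from the response body alone. The following is a rough triage sketch that matches only the exact strings shown in the examples above, so any other response is reported as other. The function name classify_response and the labels it prints are hypothetical.

```shell
# Hypothetical helper: reads a response body on stdin and prints a label.
#   error-page   -> check client network, project membership, and access key
#   bad-header   -> check the service-type header value
#   empty-result -> check the metric against service-type and agent installation
classify_response() {
  body=$(cat)
  case "$body" in
    *"Default Error Page"*) echo "error-page" ;;
    *"not supported kakaocloud service-type header"*) echo "bad-header" ;;
    *'"result":[]'*) echo "empty-result" ;;
    *) echo "other" ;;
  esac
}

# Example usage: pipe the body of any of the curl calls above into the helper,
# adding -s to silence the progress meter, e.g. curl -s ... | classify_response
```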