Introduction

In the world of database observability, timing is everything — literally.

Recently, we ran into a puzzling issue while managing MySQL monitoring in Percona Monitoring and Management (PMM). Out of four database servers (Server1 through Server4), one of them, Server4, stubbornly refused to show metrics for time ranges shorter than 24 hours. When we selected “Last 24 hours”, everything looked fine. But when we switched to “Last 12 hours” or “Last 6 hours”, PMM threw up the dreaded message: “No data”.

At first glance, this seemed like a PMM agent issue or perhaps a problem in Prometheus scraping. But as we soon discovered, the real culprit was far simpler and far sneakier: system time drift.

In this blog, I’ll walk you through how we diagnosed and resolved this issue, what caused it, and why NTP (Network Time Protocol) deserves a permanent spot in your DevOps checklist. We’ll also discuss preventive strategies and observability best practices to ensure you never lose metrics again due to time mismatch.

The Setup: PMM for MySQL Monitoring

Our environment consists of four MySQL hosts: Server1, Server2, Server3, and Server4.

Each host runs:

  • MySQL 8.x
  • PMM Agent configured with the node and MySQL exporter
  • Metrics collected by Prometheus (PMM Server)
  • Dashboards viewed via Grafana

For months, everything worked flawlessly, until Server4 went silent for shorter time ranges.

The Symptom

When opening the PMM dashboard:

  • Last 24 hours: Data visible
  • Last 12 hours / 6 hours / 1 hour: “No data”

Everything else (queries, connections, replication, etc.) appeared normal. The PMM agent was running fine, and no scrape errors were visible in Prometheus targets.

Here’s what the situation looked like:

Server     PMM data (24h)    PMM data (6h)    Status
Server1    Visible           Visible          OK
Server2    Visible           Visible          OK
Server3    Visible           Visible          OK
Server4    Visible           No data          Problem

Initial Hypothesis

Whenever PMM shows partial or missing data, there are a few usual suspects:

  • PMM Agent not running properly
  • Exporter stopped or stale metrics
  • Prometheus scrape failure
  • Network delay or firewall issue
  • Time synchronization mismatch

Since PMM was showing 24-hour data, we could immediately rule out exporters and scraping.
That left us with one interesting clue: time.

Checking the Time

On the problematic server (Server4):

[admin@sgpserver4 ~]$ date
Thu Jul 18 05:38:18 AM IST 2024

[admin@sgpserver4 ~]$ timedatectl status
               Local time: Thu 2024-07-18 11:40:48 IST
           Universal time: Thu 2024-07-18 06:10:48 UTC
                 RTC time: Mon 2001-01-01 14:40:52
                Time zone: Asia/Kolkata (IST, +0530)
System clock synchronized: no
              NTP service: active
          RTC in local TZ: no

That “System clock synchronized: no” caught our attention — and the RTC time from 2001 was a dead giveaway.

On another healthy server (Server2):

[admin@sgpserver2 ~]$ date
Tue Oct 28 09:11:30 AM IST 2025

Server4 was almost four hours behind the other servers. That explained why the 24-hour graphs worked: the metric timestamps were still within Prometheus retention windows. But when selecting the last few hours, PMM’s query window didn’t match the drifted timestamps — hence, “No data”.
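Before touching anything, it helps to quantify the drift. Here is a minimal bash sketch — the `ssh server4` reading in the comment is illustrative (it assumes passwordless SSH to the suspect host); the function itself just subtracts two epoch timestamps:

```shell
#!/usr/bin/env bash
# clock_drift.sh -- sketch: report how far one clock is from another.

drift_seconds() {
    # Difference between two epoch timestamps (first minus second).
    local here=$1 there=$2
    echo $(( here - there ))
}

# In practice the second stamp would come from the remote host, e.g.:
#   there=$(ssh server4 date +%s)
here=$(date +%s)
there=$here   # placeholder; replace with the remote reading
echo "drift: $(drift_seconds "$here" "$there")s"
```

A positive result means the remote clock is behind; a negative one means it is ahead. Even a rough number tells you whether you are chasing seconds of NTP jitter or hours of genuine drift.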

Root Cause: System Time Drift

Time drift occurs when a system’s internal clock runs out of sync with real-world time — usually because the NTP service isn’t active or functioning properly.

In this case:

  • The system clock was out of sync by ~4 hours.
  • PMM’s Prometheus recorded metrics with incorrect timestamps.
  • Grafana queries based on current time couldn’t find any data within the selected time range.

Thus, PMM wasn’t broken — it was just looking in the wrong time window.

The Fix: Realigning Time on Server4

To fix the time drift, we followed a safe and reversible approach.

Step 1: Disable automatic NTP sync temporarily

sudo timedatectl set-ntp false

Step 2: Manually set the correct time (matching other servers)

sudo timedatectl set-time "2025-10-28 09:11:30"

Step 3: Re-enable NTP service

sudo timedatectl set-ntp true

Step 4: Sync system time to hardware clock

sudo hwclock --systohc

Step 5: Restart PMM Agent

sudo systemctl restart pmm-agent

Step 6: Verify

timedatectl status
date

Output:

Local time: Tue 2025-10-28 09:11:30 IST
System clock synchronized: yes
NTP service: active

Verification in PMM

After correction:

Time Range       Data Visibility
Last 24 hours    Visible
Last 12 hours    Visible
Last 6 hours     Visible
Last 1 hour      Visible

We monitored for a few hours to confirm consistency — everything was stable.

Why Time Sync Is Critical in PMM (and Monitoring in General)

Time synchronization isn’t just a nice-to-have; it’s the backbone of distributed observability.

Here’s why:

  • Prometheus Relies on Accurate Timestamps:
    Each metric is stored with a timestamp. If one host’s clock drifts, metrics appear “out of order” or “in the future/past,” breaking visual continuity.
  • Grafana Query Windows Depend on Current Time:
    Dashboards query “last X hours” relative to now. If your system’s “now” is wrong, your data disappears.
  • Alert Rules May Misfire:
    Alerts using rate() or increase() functions assume monotonic timestamps. A time drift can lead to false positives or missed alerts.
  • Cluster Coordination Breaks:
    In replication, distributed locking, or orchestrators — inconsistent time can cause failover delays, wrong transaction ordering, or stale metrics.
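Because of all this, it is worth alerting on the magnitude of the offset, not just on loss of sync. node_exporter (which PMM’s node exporter is based on) exposes the kernel’s timex offset; a sketch rule, where the 50 ms threshold is an arbitrary example to tune for your fleet:

```yaml
# Sketch: alert on measurable clock offset, not only on the sync flag.
# node_timex_offset_seconds comes from node_exporter's timex collector;
# the 0.05s threshold is an example value -- tune it for your environment.
- alert: ClockOffsetHigh
  expr: abs(node_timex_offset_seconds) > 0.05
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Clock offset above 50ms on {{ $labels.instance }}"
```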

How to Prevent Future Time Drift

Enable NTP Across All Servers

sudo timedatectl set-ntp true

Or install Chrony (recommended for servers):

sudo yum install chrony -y   # RHEL-family; use 'apt install chrony' on Debian/Ubuntu
sudo systemctl enable --now chronyd
chronyc sources -v
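Chrony’s behavior is driven by /etc/chrony.conf. A minimal excerpt — the pool address is an example; point it at internal NTP servers if you have them:

```
# /etc/chrony.conf (excerpt)
pool pool.ntp.org iburst   # example time source; prefer internal NTP servers
makestep 1.0 3             # step the clock if offset > 1s during the first 3 updates
rtcsync                    # keep the hardware clock in sync via the kernel
```

The makestep directive is what prevents the kind of multi-hour drift we hit: it steps a badly wrong clock at startup instead of slewing it slowly toward the correct time.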

Standardize Timezone

Set all DB servers to a consistent timezone (e.g., IST):

sudo timedatectl set-timezone Asia/Kolkata

Verify with Automation

Create a daily cron or Ansible check:

timedatectl status | grep "System clock synchronized"

Alert if “no”.
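That grep can be wrapped into a small script suitable for cron or an Ansible task. A sketch — the logger tag is arbitrary, and `timedatectl show` needs a reasonably recent systemd:

```shell
#!/usr/bin/env bash
# timesync_check.sh -- sketch of a daily cron check for NTP sync state.

sync_state() {
    # Maps the NTPSynchronized property ("yes"/"no") to a verdict.
    [ "$1" = "yes" ] && echo "OK" || echo "DRIFT"
}

state=$(sync_state "$(timedatectl show -p NTPSynchronized --value 2>/dev/null)")
if [ "$state" = "DRIFT" ]; then
    # Route this to whatever alerting you use; syslog shown as an example.
    logger -t timesync_check "system clock is not NTP-synchronized" 2>/dev/null || true
fi
echo "$state"
```

Drop it into /etc/cron.daily/ (or run it from your configuration management) and alert on the “DRIFT” output.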

Add Prometheus Alerts

Add an alert rule like:

- alert: TimeNotSynced
  expr: node_timex_sync_status == 0
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "System time is not synchronized"
    description: "Host {{ $labels.instance }} has NTP sync disabled"

Include Time Drift in Health Checks

During any DB or PMM troubleshooting, always check time first.
A 10-second command can save hours of debugging.

Lessons Learned

  • Symptom: PMM “No Data” for short time ranges
  • Root Cause: Server’s clock 4 hours behind due to NTP desync
  • Fix: Manual time correction + re-enable NTP
  • Prevention: Standardize NTP and time checks across all servers

Final Checklist

Command                          Purpose
timedatectl status               Verify NTP and sync state
sudo timedatectl set-ntp true    Enable NTP
sudo hwclock --systohc           Sync hardware clock
date                             Confirm correct time
systemctl restart pmm-agent      Restart PMM agent
PMM → Dashboard                  Confirm data visibility

Conclusion

Monitoring systems like PMM, Prometheus, and Grafana are incredibly powerful — but they rely on one universal truth: time must be right.

This blog is based on a real-world issue faced during MySQL infrastructure monitoring at scale. If you’re working with PMM, Orchestrator, or any Prometheus-based observability stack — remember that the smallest configuration (like a few minutes of clock drift) can have the biggest impact.
