Introduction — When Healthy Pods Still Cause Slow Databases

Modern database platforms are no longer limited to simple query processing. Systems built around MySQL, PostgreSQL, and MongoDB increasingly rely on background workers for replication, change data capture (CDC), asynchronous processing, and data synchronization.

While running these database services on Kubernetes, we observed an unexpected situation:

  • Processing delays were increasing
  • Event backlogs kept growing
  • Autoscaling was enabled
  • Pods appeared perfectly healthy

CPU utilization stayed around 30%, yet database lag continued rising.

Understanding Database Workloads

Before discussing autoscaling, it is important to understand how database workloads behave.

Unlike traditional web applications, database consumers are rarely CPU-intensive.

Typical database worker activities include:

  • Reading replication logs
  • Processing binlog/WAL events
  • Waiting on disk operations
  • Calling downstream APIs
  • Performing asynchronous writes

These tasks are mostly I/O-bound, meaning the application spends more time waiting than computing.

Real Database Autoscaling Challenges

Modern database platforms introduce several scaling challenges that are often invisible at the infrastructure level.

Common scenarios include:

  • replication workers waiting for disk synchronization
  • slow downstream microservices delaying writes
  • analytics pipelines consuming database change events
  • batch updates triggered during peak business hours

In many cases, the database itself is not overloaded. Instead, the bottleneck exists in the processing layer consuming database events.

This creates a misleading situation where:

  • CPU metrics appear healthy
  • memory consumption remains stable
  • yet user-facing latency increases

Autoscaling decisions based purely on infrastructure metrics fail to capture these operational realities.

CPU-Bound vs I/O-Bound Workloads

  CPU-Bound Applications   | Database Processing Workers
  ------------------------ | ---------------------------
  Heavy computation        | Waiting on I/O
  High CPU usage           | Low CPU usage
  Easy autoscaling         | Hard workload detection

Traditional Autoscaling in Kubernetes

Kubernetes provides automatic scaling through the Horizontal Pod Autoscaler (HPA).

HPA adjusts pod counts based on infrastructure metrics such as:

  • CPU utilization
  • Memory consumption

Typical scaling logic:

If CPU usage increases → add pods
If CPU usage is normal → keep pods unchanged
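
As a concrete sketch, a minimal HPA manifest encoding this CPU-based policy might look like the following (the Deployment name and thresholds here are hypothetical):

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: db-event-worker-hpa        # hypothetical name
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: db-event-worker          # hypothetical consumer Deployment
    minReplicas: 2
    maxReplicas: 10
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70   # add pods once average CPU crosses 70%

For an I/O-bound consumer idling around 30% CPU, as in our case, that 70% target is never reached, so this HPA never adds a pod no matter how large the backlog grows.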

This approach works extremely well for:

  • Web APIs
  • Stateless services
  • Compute-heavy workloads

However, database event processors behave differently.

Why CPU Metrics Mislead Platform Teams

CPU utilization became the default autoscaling signal because early cloud applications were compute-heavy. However, modern distributed systems operate differently.

Database consumers typically perform:

  • network-bound operations
  • transactional commits
  • external API calls
  • disk synchronization

During these operations, containers remain mostly idle from a CPU perspective while still actively processing work.

As a result, CPU usage reflects resource consumption, not system pressure.

This mismatch leads to delayed scaling reactions and growing operational risk during traffic spikes.

Why Autoscaling Failed for Database Processing

In our database services environment:

  • Event generation increased rapidly
  • Replication/change events accumulated
  • Processing delays increased

Yet Kubernetes never scaled out.

Why?

Because CPU usage remained low.

The autoscaler assumed everything was functioning normally while the actual workload pressure existed inside the event backlog.

Root Cause

Autoscaling was tied to resource consumption, not workload demand.

Observability: The Missing Piece in Autoscaling

Successful autoscaling requires visibility into application behavior rather than infrastructure alone.

Database platforms benefit from monitoring signals such as:

  • replication lag duration
  • pending background jobs
  • transaction commit latency
  • change event throughput

These metrics reveal the true workload demand.

By integrating observability platforms such as Prometheus, teams can expose meaningful database metrics that represent real processing pressure.
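
As a sketch, assuming the standard mysqld_exporter is running against the replicas, a Prometheus recording rule can condense replication lag into a single scaling signal (the rule name is illustrative):

  groups:
    - name: db-scaling-signals
      rules:
        # Worst-case lag across all replicas; mysqld_exporter exposes
        # mysql_slave_status_seconds_behind_master from SHOW SLAVE STATUS.
        - record: job:replication_lag_seconds:max
          expr: max(mysql_slave_status_seconds_behind_master)

The same pattern applies to pending-job counts or change-event throughput exported by the workers themselves.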

Autoscaling then becomes proactive rather than reactive.

Event-Driven Autoscaling — A Better Approach

To solve this, we shifted from resource-based scaling to event-driven autoscaling using KEDA.

KEDA (Kubernetes Event-driven Autoscaling) is a lightweight, CNCF-graduated component that extends Kubernetes autoscaling to act on external event sources rather than CPU or memory. It can scale workloads from zero to thousands of pods based on metrics that describe how much work is outstanding.

Examples of scalable signals:

  • Queue length
  • Replication lag
  • Pending database jobs
  • Change stream events

Instead of asking “How busy is the CPU?”, we now ask:

“How much work is waiting to be processed?”

Implementation Approach

The implementation required minimal changes:

  1. Install KEDA in the cluster (commonly via its Helm chart)
  2. Define a ScaledObject that connects the worker Deployment to database metrics, as sketched below
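
A minimal ScaledObject wiring the worker Deployment to the replication-lag signal might look like this (the names, address, and threshold are assumptions for illustration, not our exact production configuration):

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: db-event-worker-scaler     # hypothetical name
  spec:
    scaleTargetRef:
      name: db-event-worker          # hypothetical consumer Deployment
    minReplicaCount: 0               # scale to zero when there is no backlog
    maxReplicaCount: 20
    triggers:
      - type: prometheus
        metadata:
          serverAddress: http://prometheus.monitoring.svc:9090  # assumed address
          query: job:replication_lag_seconds:max                # recording rule from earlier
          threshold: "30"            # roughly one pod per 30 seconds of lag

Behind the scenes, KEDA feeds this external metric into an HPA that it manages, so the standard scaling machinery stays in place; only the signal changes. Comparable scalers exist for other event sources, such as a Kafka trigger keyed on consumer-group lag.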

Operational Benefits Beyond Scaling

Event-driven autoscaling introduced benefits beyond performance improvements.

Operational advantages included:

  • reduced manual intervention during peak load
  • predictable recovery from traffic bursts
  • improved database stability due to faster event processing
  • simplified capacity planning

Engineering teams no longer needed to guess peak resource requirements. The system dynamically adapted to workload demand, allowing platform teams to focus on reliability instead of reactive scaling actions.

Observed Improvements

After adopting event-driven autoscaling:

  Metric               | Before         | After
  -------------------- | -------------- | --------------
  Processing Delay     | High           | Near real-time
  Pod Count            | Fixed          | Dynamic
  Resource Usage       | Constant       | Demand-based
  Idle Infrastructure  | Always running | Scales to zero

Database backlogs cleared faster, and processing became predictable even during traffic spikes.

Cost Optimization Benefits

Traditional autoscaling typically keeps a minimum number of pods running regardless of workload activity.

Event-driven autoscaling introduced:

  • Automatic scale-to-zero capability
  • Reduced idle compute usage
  • Improved node utilization
  • Lower infrastructure costs

Autoscaling finally aligned with actual demand rather than static assumptions.

Engineering Lessons Learned

This experience highlighted several important platform engineering principles:

  • Autoscaling must follow the system bottleneck
  • CPU usage does not always represent real workload pressure
  • Database processing pipelines are inherently event-driven
  • Observability metrics are more valuable than raw resource metrics

When Database Autoscaling Should Be Avoided

While event-driven autoscaling is powerful, it is not universally applicable.

Database workloads that may not benefit include:

  • long-running analytical queries
  • stateful database engines themselves
  • tightly coupled legacy applications
  • workloads requiring strict connection limits

Autoscaling should primarily target database consumers and processing services, rather than the database engine itself.

Understanding this distinction prevents architectural complexity and ensures stable production environments.

Conclusion

As database platforms evolve toward distributed and event-driven architectures, traditional CPU-based autoscaling becomes insufficient.

By adopting event-driven autoscaling, database services running on Kubernetes become:

  • More responsive
  • More efficient
  • More cost-effective
  • Better prepared for modern workloads

Scaling should reflect work waiting to be processed, not just resources being consumed.
