If you’ve worked with MongoDB for a while, you’ve probably heard that “performance depends a lot on memory.” That’s not just a vague statement — it’s largely about how the WiredTiger cache behaves.
Instead of diving straight into jargon, let’s walk through this the way you’d explain it to a colleague during a system design discussion.
So, What Exactly Is WiredTiger?
Think of WiredTiger as the engine under MongoDB’s hood. It’s responsible for how data is stored, retrieved, and managed in memory and on disk.
WiredTiger has been the default storage engine since MongoDB 3.2. It is designed for high concurrency, compression, and efficient memory usage.
The component that matters most here is its cache, because that's where most of the action happens: it directly determines how quickly your application can read and write data.
Why the Cache Matters More Than You Think
Imagine your application constantly reading user profiles or updating orders. If MongoDB had to fetch everything from disk every time, performance would degrade significantly.
That’s where the WiredTiger cache steps in.
It keeps frequently accessed data, indexes, and recently modified documents in memory so MongoDB can respond quickly without hitting the disk too often.
In simple terms:
More effective caching means faster queries and better throughput.
How Much Memory Does It Actually Use?
By default, MongoDB does not consume all available RAM.
The default WiredTiger cache size is the larger of two values:
cacheSizeGB ≈ (RAM – 1 GB) * 0.5, with a floor of 256 MB
This is not a strict rule: the actual value can vary with MongoDB version and deployment configuration, and in containers MongoDB uses the container's memory limit rather than the host's total RAM.
For example, on a 16 GB machine:
(16 – 1) * 0.5 = 7.5 GB
So approximately 7.5 GB will be allocated to the WiredTiger cache.
The remaining memory is left for the operating system and filesystem cache, which is equally important for performance.
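If you want to see what your deployment actually allocated, the limit is exposed in serverStatus(). A minimal mongosh check:

```javascript
// mongosh: read the configured WiredTiger cache limit
const cache = db.serverStatus().wiredTiger.cache;
const maxBytes = cache["maximum bytes configured"];
print(`WiredTiger cache limit: ${(maxBytes / 1024 ** 3).toFixed(2)} GB`);
```

If the default doesn't suit your environment, you can override it with storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf.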
The Working Set (Why Memory Really Matters)
The working set is the portion of your data and indexes that your application actively uses.
Ideally, your working set should fit within the WiredTiger cache.
If it fits, reads are served from memory and stay fast.
If it does not, MongoDB has to read from disk frequently, and latency climbs.
Even with proper indexing, performance will degrade if the working set exceeds available memory.
Most real-world performance issues come down to this one question:
Does your working set fit in memory?
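There's no single counter that reports the working set, but comparing total data plus index size against the cache limit gives a rough upper bound. A sketch in mongosh (the true working set is usually much smaller, since only actively used data counts):

```javascript
// mongosh: rough upper bound on the working set for the current database.
// The real working set is typically smaller (only the hot subset of data).
const stats = db.stats();
const cache = db.serverStatus().wiredTiger.cache;
const totalBytes = stats.dataSize + stats.indexSize;
const cacheMax = cache["maximum bytes configured"];
print(`data + indexes: ${(totalBytes / 1024 ** 3).toFixed(2)} GB`);
print(`cache limit:    ${(cacheMax / 1024 ** 3).toFixed(2)} GB`);
```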
What Happens Inside the Cache?
Not all data in the cache behaves the same way. WiredTiger divides it into two categories:
Clean Data
This is data that already exists on disk and hasn’t been modified.
It can be removed from cache relatively easily when space is needed.
Dirty Data
This is data that has been modified but not yet written to disk.
Before dirty data can be fully removed from cache, it must be written to disk—either during eviction or as part of a checkpoint.
You can think of dirty data as “work in progress.”
When the Cache Fills Up
The cache is not infinite. As it fills up, MongoDB triggers eviction.
WiredTiger continuously manages cache usage using background eviction threads.
Instead of waiting until memory is exhausted, it:
- Identifies less useful data
- Removes clean pages first
- Writes dirty pages to disk (during eviction or a checkpoint) before freeing their space
MongoDB also uses internal thresholds to control eviction behavior:
- eviction_dirty_target → preferred dirty data level (~5% by default)
- eviction_dirty_trigger → eviction becomes aggressive (~20% by default)
If dirty data grows beyond these thresholds, eviction pressure increases significantly.
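If you ever need to experiment with these thresholds, they can be adjusted at runtime through the wiredTigerEngineRuntimeConfig server parameter. A sketch, using the default values purely as placeholders:

```javascript
// mongosh: tune WiredTiger eviction thresholds at runtime.
// The values shown here are the documented defaults, for illustration only.
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig: "eviction_dirty_target=5,eviction_dirty_trigger=20"
});
```

In practice, retuning eviction is rarely the right first move; fixing cache sizing or disk throughput usually helps more.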
If eviction falls behind, you may see:
- Increased latency
- Slow queries
- System pressure
Checkpoints: The Safety Net
WiredTiger periodically writes in-memory changes to disk using checkpoints.
By default, MongoDB takes a checkpoint roughly every 60 seconds, though exact behavior depends on workload intensity, the amount of dirty data, and internal thresholds. Under write-heavy workloads, data may be flushed to disk more often.
Why this matters:
- It ensures data durability in case of failure
- It helps control the amount of dirty data
There is a trade-off:
- Too frequent → higher disk I/O
- Too infrequent → longer recovery times
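You can watch checkpoint behavior through serverStatus() as well; the counters live under wiredTiger.transaction (exact field names vary slightly across MongoDB versions):

```javascript
// mongosh: inspect checkpoint activity
const txn = db.serverStatus().wiredTiger.transaction;
print(`checkpoints completed: ${txn["transaction checkpoints"]}`);
print(`max checkpoint time:   ${txn["transaction checkpoint max time (msecs)"]} ms`);
```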
Compression: A Quiet Performance Booster
WiredTiger compresses data by default (Snappy for collections, prefix compression for indexes) to:
- Reduce disk space and disk I/O
- Let the filesystem cache hold more data, since blocks stay compressed there
- Keep indexes smaller both on disk and in memory
This comes with some CPU overhead, but in most environments, the trade-off is beneficial.
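Compression is also configurable per collection. If, say, you had a rarely-read archive collection, you could trade a bit more CPU for better compression with zstd. A sketch (the collection name is hypothetical):

```javascript
// mongosh: opt one collection into zstd instead of the snappy default.
// "archive_events" is a made-up collection name.
db.createCollection("archive_events", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zstd" } }
});
```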
How Do You Know If Things Are Healthy?
You don’t have to guess. MongoDB exposes useful metrics through db.serverStatus().
When reviewing metrics, focus on what they indicate:
- Cache usage vs maximum size → Indicates memory pressure
- Dirty data percentage → Shows how much data is waiting to be written
- Pages read into cache → Indicates disk reads (higher means more cache misses)
- Eviction activity → Shows how actively cache is being managed
These metrics help determine whether the system is memory-efficient or disk-bound.
Monitoring WiredTiger Cache
To ensure optimal performance, monitoring should be continuous.
A few key metrics to pay attention to (from wiredTiger.cache):
- bytes currently in the cache
- maximum bytes configured
- tracked dirty bytes in the cache
- pages read into cache
- pages evicted from cache
Useful Derived Metrics
Dirty data percentage:
(tracked dirty bytes in the cache / maximum bytes configured) * 100
Cache usage percentage:
(bytes currently in the cache / maximum bytes configured) * 100
These give a clearer picture of cache pressure.
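Both can be computed from a single serverStatus() call:

```javascript
// mongosh: derive cache usage and dirty percentages
const c = db.serverStatus().wiredTiger.cache;
const configured = c["maximum bytes configured"];
const usedPct = (c["bytes currently in the cache"] / configured) * 100;
const dirtyPct = (c["tracked dirty bytes in the cache"] / configured) * 100;
print(`cache usage: ${usedPct.toFixed(1)}%  dirty: ${dirtyPct.toFixed(1)}%`);
```

As a loose rule of thumb, a dirty percentage that stays near the ~20% trigger mentioned earlier is a sign that eviction is under pressure.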
Common Issues and What They Mean
Let’s connect these behaviors to real-world performance problems.
Slow queries despite indexes
Often indicates that working data is not staying in memory. When the cache is too small for the workload, MongoDB reads from disk more frequently.
High disk activity
Usually a sign of cache inefficiency due to frequent cache misses or heavy write activity.
Sudden latency spikes
Often occur when eviction cannot keep up with workload demands.
Cache too small
Leads to frequent disk reads and higher latency. Indicates working set does not fit in memory.
Cache too large
Over-allocating memory to WiredTiger can reduce memory available to the OS, which may degrade overall performance.
High dirty data ratio
Dirty data requires disk writes before eviction, increasing write pressure.
Eviction falling behind
Leads to cache saturation and application slowdown. This can happen due to heavy workloads or slow disk performance.
Key insight:
Evicting clean data is relatively inexpensive, but evicting dirty data requires disk writes or checkpoint coordination, making it more resource-intensive.
OS Cache Still Matters
Even though WiredTiger manages its own cache, the operating system’s filesystem cache also plays a role.
Data read from disk may be cached by the OS, improving performance for repeated access.
Allocating too much memory to WiredTiger can reduce memory available to the OS cache, which may negatively impact performance.
Impact in Replica Sets
In replica sets, secondaries also use WiredTiger cache for replication (applying oplog entries).
If the cache is under pressure, replication lag can increase because secondaries may struggle to keep up with writes.
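A quick way to check whether secondaries are keeping up is mongosh's built-in replication helper:

```javascript
// mongosh, connected to the replica set: print per-secondary replication lag
rs.printSecondaryReplicationInfo();
```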
Practical Takeaways
- Make sure your working set fits in memory (or as much of it as possible)
- Don’t allocate all RAM to MongoDB—your OS needs breathing room
- Use SSDs; they dramatically improve eviction and checkpoint performance
- Monitor regularly instead of reacting to outages
- Understand your workload: read-heavy vs write-heavy systems behave differently
At a glance, WiredTiger’s memory management might seem complex. But once you break it down, it’s really about balance:
- Memory vs disk
- Clean vs dirty data
- Performance vs durability
Get that balance right, and MongoDB performs beautifully.
Final Thought
If there’s one takeaway, it’s this:
MongoDB performance isn’t just about queries or indexes—it’s deeply tied to how well your cache is behaving.
Once you start thinking in terms of cache efficiency instead of just database size, you’ll troubleshoot faster and design better systems.