If you’ve worked with MongoDB for a while, you’ve probably heard that “performance depends a lot on memory.” That’s not just a vague statement — it’s largely about how the WiredTiger cache behaves.
Instead of diving straight into jargon, let’s walk through this the way you’d explain it to a colleague during a system design discussion.
So, What Exactly Is WiredTiger?
Think of WiredTiger as the engine under MongoDB’s hood. It’s responsible for how data is stored, retrieved, and managed in memory and on disk.
WiredTiger has been the default storage engine since MongoDB 3.2. It is designed for high concurrency, compression, and efficient memory usage.
The component that matters most here is its cache, because that's where most of the action happens: it directly determines how quickly your application can read and write data.
Why the Cache Matters More Than You Think
Imagine your application constantly reading user profiles or updating orders. If MongoDB had to fetch everything from disk every time, performance would degrade significantly.
That’s where the WiredTiger cache steps in.
It keeps frequently accessed data, indexes, and recently modified documents in memory so MongoDB can respond quickly without hitting the disk too often.
In simple terms:
More effective caching means faster queries and better throughput.
How Much Memory Does It Actually Use?
By default, MongoDB does not consume all available RAM.
The default WiredTiger cache size is the larger of two values:
cacheSizeGB ≈ (RAM – 1 GB) * 0.5, with a floor of 256 MB
This is not a strict rule: the actual value can vary with MongoDB version and deployment configuration, and in containers MongoDB uses the container's memory limit rather than the host's total RAM.
For example, on a 16 GB machine:
(16 – 1) * 0.5 = 7.5 GB
So approximately 7.5 GB will be allocated to the WiredTiger cache.
The remaining memory is left for the operating system and filesystem cache, which is equally important for performance.
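If you want to see what your deployment actually allocated, the limit is exposed in serverStatus(). A minimal mongosh check:

```javascript
// mongosh: read the configured WiredTiger cache limit
const cache = db.serverStatus().wiredTiger.cache;
const maxBytes = cache["maximum bytes configured"];
print(`WiredTiger cache limit: ${(maxBytes / 1024 ** 3).toFixed(2)} GB`);
```

If the default doesn't suit your environment, you can override it with storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf.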
The Working Set (Why Memory Really Matters)
The working set is the portion of your data and indexes that your application actively uses.
Ideally, your working set should fit within the WiredTiger cache.
If it fits, reads are served from memory and stay fast.
If it does not, MongoDB has to read from disk frequently, and latency climbs.
Even with proper indexing, performance will degrade if the working set exceeds available memory.
Most real-world performance issues come down to this one question:
Does your working set fit in memory?
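There's no single counter that reports the working set, but comparing total data plus index size against the cache limit gives a rough upper bound. A sketch in mongosh (the true working set is usually much smaller, since only actively used data counts):

```javascript
// mongosh: rough upper bound on the working set for the current database.
// The real working set is typically smaller (only the hot subset of data).
const stats = db.stats();
const cache = db.serverStatus().wiredTiger.cache;
const totalBytes = stats.dataSize + stats.indexSize;
const cacheMax = cache["maximum bytes configured"];
print(`data + indexes: ${(totalBytes / 1024 ** 3).toFixed(2)} GB`);
print(`cache limit:    ${(cacheMax / 1024 ** 3).toFixed(2)} GB`);
```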
What Happens Inside the Cache?
Not all data in the cache behaves the same way. WiredTiger divides it into two categories:
Clean Data
This is data that already exists on disk and hasn’t been modified.
It can be removed from cache relatively easily when space is needed.
Dirty Data
This is data that has been modified but not yet written to disk.
Before dirty data can be fully removed from cache, it must be written to disk—either during eviction or as part of a checkpoint.
You can think of dirty data as “work in progress.”
When the Cache Fills Up
The cache is not infinite. As it fills up, MongoDB triggers eviction.
WiredTiger continuously manages cache usage using background eviction threads.
Instead of waiting until memory is exhausted, it:
- Identifies less useful data
- Removes clean pages first
- Writes dirty pages to disk (during eviction or a checkpoint) before freeing their space
MongoDB also uses internal thresholds to control eviction behavior:
- eviction_dirty_target → preferred dirty data level (~5% by default)
- eviction_dirty_trigger → eviction becomes aggressive (~20% by default)
If dirty data grows beyond these thresholds, eviction pressure increases significantly.
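If you ever need to experiment with these thresholds, they can be adjusted at runtime through the wiredTigerEngineRuntimeConfig server parameter. A sketch, using the default values purely as placeholders:

```javascript
// mongosh: tune WiredTiger eviction thresholds at runtime.
// The values shown here are the documented defaults, for illustration only.
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig: "eviction_dirty_target=5,eviction_dirty_trigger=20"
});
```

In practice, retuning eviction is rarely the right first move; fixing cache sizing or disk throughput usually helps more.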
If eviction falls behind, you may see:
- Increased latency
- Slow queries
- System pressure
Checkpoints: The Safety Net
WiredTiger periodically writes in-memory changes to disk using checkpoints.
By default, MongoDB takes a checkpoint roughly every 60 seconds, though exact behavior depends on workload intensity, the amount of dirty data, and internal thresholds. Under write-heavy workloads, data may be flushed to disk more often.
Why this matters:
- It ensures data durability in case of failure
- It helps control the amount of dirty data
There is a trade-off:
- Too frequent → higher disk I/O
- Too infrequent → longer recovery times
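You can watch checkpoint behavior through serverStatus() as well; the counters live under wiredTiger.transaction (exact field names vary slightly across MongoDB versions):

```javascript
// mongosh: inspect checkpoint activity
const txn = db.serverStatus().wiredTiger.transaction;
print(`checkpoints completed: ${txn["transaction checkpoints"]}`);
print(`max checkpoint time:   ${txn["transaction checkpoint max time (msecs)"]} ms`);
```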
Compression: A Quiet Performance Booster
WiredTiger compresses data by default (Snappy for collections, prefix compression for indexes) to:
- Reduce disk space and disk I/O
- Let the filesystem cache hold more data, since blocks stay compressed there
- Keep indexes smaller both on disk and in memory
This comes with some CPU overhead, but in most environments, the trade-off is beneficial.
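Compression is also configurable per collection. If, say, you had a rarely-read archive collection, you could trade a bit more CPU for better compression with zstd. A sketch (the collection name is hypothetical):

```javascript
// mongosh: opt one collection into zstd instead of the snappy default.
// "archive_events" is a made-up collection name.
db.createCollection("archive_events", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zstd" } }
});
```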
How Do You Know If Things Are Healthy?
You don’t have to guess. MongoDB exposes useful metrics through db.serverStatus().
When reviewing metrics, focus on what they indicate:
- Cache usage vs maximum size → Indicates memory pressure
- Dirty data percentage → Shows how much data is waiting to be written
- Pages read into cache → Indicates disk reads (higher means more cache misses)
- Eviction activity → Shows how actively cache is being managed
These metrics help determine whether the system is memory-efficient or disk-bound.
Monitoring WiredTiger Cache
To ensure optimal performance, monitoring should be continuous.
A few key metrics to pay attention to (from wiredTiger.cache):
- bytes currently in the cache
- maximum bytes configured
- tracked dirty bytes in the cache
- pages read into cache
- pages evicted from cache
Useful Derived Metrics
Dirty data percentage:
(tracked dirty bytes in the cache / maximum bytes configured) * 100
Cache usage percentage:
(bytes currently in the cache / maximum bytes configured) * 100
These give a clearer picture of cache pressure.
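Both can be computed from a single serverStatus() call:

```javascript
// mongosh: derive cache usage and dirty percentages
const c = db.serverStatus().wiredTiger.cache;
const configured = c["maximum bytes configured"];
const usedPct = (c["bytes currently in the cache"] / configured) * 100;
const dirtyPct = (c["tracked dirty bytes in the cache"] / configured) * 100;
print(`cache usage: ${usedPct.toFixed(1)}%  dirty: ${dirtyPct.toFixed(1)}%`);
```

As a loose rule of thumb, a dirty percentage that stays near the ~20% trigger mentioned earlier is a sign that eviction is under pressure.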
Common Issues and What They Mean
Let’s connect these behaviors to real-world performance problems.
Slow queries despite indexes
Often indicates that working data is not staying in memory. When the cache is too small for the workload, MongoDB reads from disk more frequently.
High disk activity
Usually a sign of cache inefficiency due to frequent cache misses or heavy write activity.
Sudden latency spikes
Often occur when eviction cannot keep up with workload demands.
Cache too small
Leads to frequent disk reads and higher latency. Indicates working set does not fit in memory.
Cache too large
Over-allocating memory to WiredTiger can reduce memory available to the OS, which may degrade overall performance.
High dirty data ratio
Dirty data requires disk writes before eviction, increasing write pressure.
Eviction falling behind
Leads to cache saturation and application slowdown. This can happen due to heavy workloads or slow disk performance.
Key insight:
Evicting clean data is relatively inexpensive, but evicting dirty data requires disk writes or checkpoint coordination, making it more resource-intensive.
OS Cache Still Matters
Even though WiredTiger manages its own cache, the operating system’s filesystem cache also plays a role.
Data read from disk may be cached by the OS, improving performance for repeated access.
Allocating too much memory to WiredTiger can reduce memory available to the OS cache, which may negatively impact performance.
Impact in Replica Sets
In replica sets, secondaries also use WiredTiger cache for replication (applying oplog entries).
If the cache is under pressure, replication lag can increase because secondaries may struggle to keep up with writes.
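A quick way to check whether secondaries are keeping up is mongosh's built-in replication helper:

```javascript
// mongosh, connected to the replica set: print per-secondary replication lag
rs.printSecondaryReplicationInfo();
```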
Practical Takeaways
- Make sure your working set fits in memory (or as much of it as possible)
- Don’t allocate all RAM to MongoDB—your OS needs breathing room
- Use SSDs; they dramatically improve eviction and checkpoint performance
- Monitor regularly instead of reacting to outages
- Understand your workload: read-heavy vs write-heavy systems behave differently
At a glance, WiredTiger’s memory management might seem complex. But once you break it down, it’s really about balance:
- Memory vs disk
- Clean vs dirty data
- Performance vs durability
Get that balance right, and MongoDB performs beautifully.
Final Thought
If there’s one takeaway, it’s this:
MongoDB performance isn’t just about queries or indexes—it’s deeply tied to how well your cache is behaving.
Once you start thinking in terms of cache efficiency instead of just database size, you’ll troubleshoot faster and design better systems.