Introduction

IIn PostgreSQL, the Write-Ahead Logging (WAL) mechanism is a cornerstone of durability, crash recovery and high availability. But beyond just reliability, tuning WAL and understanding its internals can unlock major performance improvements.

In this blog series, we break down WAL into 4 major themes, from its core purpose to advanced tuning tips. If you are new to PostgreSQL internals, we recommend starting with our foundational blog

What is Write-Ahead Logging (WAL)?

PostgreSQL uses Write-Ahead Logging (WAL) to guarantee data integrity. The key principle: log first, write later. When a transaction modifies data, PostgreSQL writes a record describing that change to the WAL. Only after this log entry is safely stored does it apply the actual change to the database.

This enables crash recovery: if the system crashes, PostgreSQL can redo any changes using the WAL records.

Key Benefits

  • Crash recovery through redo logs
  • Fewer random disk writes, improving performance
  • Online backups and Point-in-Time Recovery (PITR)

Example

-- Step 1: Create test table
CREATE TABLE test_log (
  id SERIAL PRIMARY KEY,
  event_type TEXT,
  event_time TIMESTAMP DEFAULT now()
);

-- Step 2: Simulate a transaction
BEGIN;
INSERT INTO test_log (event_type) VALUES ('CRASH-RECOVERY-DEMO');
COMMIT;

-- Step 3: Verify insert
SELECT * FROM test_log;

This sequence shows how PostgreSQL logs the change before applying it. If a crash occurs after COMMIT, the WAL ensures the INSERT is redone on restart.

id |     event_type       |       event_time
----+----------------------+----------------------------
  1 | CRASH-RECOVERY-DEMO  | 2025-07-08 09:23:45.00123

Diagram:

Asynchronous Commit – Trading Safety for Speed

Asynchronous commit allows PostgreSQL to return success to the client before the WAL is flushed to disk, improving performance at the cost of possible recent data loss.

Use Case

Good for:

  • Event logging
  • Metrics ingestion

Avoid for:

  • Banking
  • Payment systems

SQL Example

-- Step 1: Turn off synchronous commit
SET synchronous_commit = OFF;

-- Step 2: Insert record
INSERT INTO test_log (event_type) VALUES ('ASYNC-COMMIT-TEST');

-- Step 3: Verify insert
SELECT * FROM test_log WHERE event_type = 'ASYNC-COMMIT-TEST';

This commit returns to the client immediately, but the change isn’t flushed to disk right away. If PostgreSQL crashes now, this record may be lost.

Output Before Crash:

id |    event_type        |       event_time
----+----------------------+----------------------------
  2 | ASYNC-COMMIT-TEST    | 2025-07-08 09:30:01.98765

If crash occurs immediately after insert:

-- Output after restart:
(0 rows)

WAL Configuration Tuning

PostgreSQL allows tuning of WAL and checkpoint parameters to balance durability and performance.

Key Parameters Explained

  • checkpoint_timeout: Time interval between automatic checkpoints.
  • max_wal_size: Maximum WAL size before triggering a checkpoint.
  • wal_buffers: Memory size allocated for WAL data.
  • checkpoint_completion_target: How evenly I/O is spread over time.
  • wal_keep_size: Minimum WAL retained for standby.

For Example

-- Check current checkpoint timeout value
SHOW checkpoint_timeout;

-- Change the timeout
SET checkpoint_timeout = '10min';

Increasing the timeout means checkpoints are less frequent, reducing disk I/O at the cost of slower crash recovery.

Manual Checkpoint

CHECKPOINT;

Forces PostgreSQL to flush all dirty pages and create a new checkpoint immediately. Useful for maintenance.

pg_stat Views

To monitor checkpoint and WAL statistics:

SELECT checkpoints_timed, checkpoints_req, checkpoint_write_time FROM pg_stat_bgwriter;

Explanation:

  • checkpoints_timed: triggered by timeout
  • checkpoints_req: triggered due to WAL size
  • checkpoint_write_time: time spent writing during checkpoints

Sample Output:

 checkpoints_timed | checkpoints_req | checkpoint_write_time 
-------------------+-----------------+------------------------
               15  |        3        |        12345 ms
SELECT wal_records, wal_fpi, wal_bytes FROM pg_stat_wal;

Explanation:

  • wal_records: number of WAL entries
  • wal_fpi: full page images written
  • wal_bytes: total size of WAL written

Sample Output:

 wal_records | wal_fpi | wal_bytes
-------------+----------+------------
    120045   |   1205   |  152673823

WAL Internals – Deep Mechanics

WAL uses binary log segments stored in pg_wal, with each change tracked by an LSN (Log Sequence Number).

How It Works

  • WAL files written sequentially, 16MB each
  • Each change gets a unique LSN
  • Recovery reads from last checkpoint LSN forward

Tracking WAL Progress

-- View current LSN
SELECT pg_current_wal_lsn();

Explanation: Shows the current WAL write location, useful for monitoring replication and recovery lag.

Sample Output:

 pg_current_wal_lsn 
---------------------
 0/1701B828
-- WAL gap from replication slot
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) FROM pg_replication_slots;

Explanation: Shows how much WAL remains unprocessed by the standby/replica.

Output Example:

 pg_wal_lsn_diff 
------------------
    327680 bytes

Performance Tips

  • Place pg_wal on SSDs
  • Tune wal_buffers for high-write workloads

Diagram: WAL Flow

[Transaction COMMIT]
     ↓
[XLogInsertRecord] → [WAL Buffers in Memory] → [XLogFlush] → [pg_wal on Disk]

Conclusion: WAL Is Your Best Friend

PostgreSQL’s WAL system enables safe, high-performing databases. With smart tuning and an understanding of internals, you can:

  • Speed up writes
  • Reduce I/O spikes
  • Improve crash recovery time

WAL isn’t just for protection, it’s also a lever for performance and observability.


Discover more from Genexdbs

Subscribe now to keep reading and get access to the full archive.

Continue reading