When managing large-scale MySQL replication environments, ensuring data consistency between the master and replica nodes becomes critical. Percona Toolkit offers powerful tools to simplify replication maintenance, error handling, and consistency checks. In this blog, we’ll explore three essential Percona tools:
pt-slave-restartpt-table-checksumpt-table-sync
We’ll go through their purpose, useful options, and real-world examples.
1. pt-slave-restart: Automatically Skip Replication Errors
pt-slave-restart is a tool from Percona Toolkit that automatically restarts MySQL replication if it stops because of specific errors. It helps to skip known errors (like duplicate key errors: 1062), so replication doesn’t need to be fixed manually every time.
Common Use Case
- When replication stops due to a duplicate entry or can’t add or update a child row error.
- You want to skip a limited number of errors and keep replication running.
I need to check if the pt-tool is available in the OS.

If it’s not present as above, then use the below command and install it.
sudo apt-get install percona-toolkit -y —>> in ubuntu
I have two mysql servers, one master and one slave. To make it manually inconsistent, I truncated the repl_innodb table in the master and then inserted one record in the slave for the same table.

Currently, the data is in sync. However, if we insert the same record again on the master server, the slave will throw a duplicate entry error, as shown in the screenshot.

To resolve this error on the slave, we can use the pt-slave-restart tool. It skips the problematic event and restarts the SQL thread, allowing replication to continue.
Note: Since I’m using a binary-based MySQL installation, it’s important to explicitly specify the socket file in the command when executing it.
pt-slave-restart –user=replica –password=replica@123 –socket=/usr/local/MYSQL2/mysqll.sock –error-numbers=1062 –verbose

As shown in the screenshot above, the error is skipped, and both replication threads remain connected.

Things to Watch
- Don’t skip blindly — make sure you understand the root cause.
- Not ideal for GTID-based replication, where skipping may require more careful handling.
- Could lead to data inconsistency if used without checksum verification.
2. pt-table-checksum: Detect Data Drift Across Replication
Data drift (silent data mismatch) can occur over time due to hardware issues, application bugs, or manual updates on replicas. pt-table-checksum verifies replication integrity by calculating checksums of tables on the primary (source) server and comparing them to replica servers.
Common Use Case
- Periodic verification of data consistency in replication environments.
- Scheduled checksum jobs to detect data drift early.
In the screenshot below, the table in the test database on the master (left side) contains 3 records. On the replica (right side), one record was intentionally deleted to simulate inconsistency.

It’s easy to check things manually when there are just a few records. But if we’re dealing with a huge amount of data, pt-table-checksum makes the job much easier and faster.
First, we need to create one dsns table in the master. I found the dns table structure in the percona official website
CREATE TABLE `dsns` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parent_id` int(11) DEFAULT NULL,
`dsn` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
);
When you’re using Percona Toolkit tools, pt-table-checksum you need a way to discover the replica servers from the master.
By default, it uses SHOW SLAVE HOSTS but:
- In many setups, especially with MySQL or MariaDB, the
mysql.slave_master_infotable isn’t always set up properly. - Some MySQL versions (especially with GTID or in cloud environments) don’t support it well.
So, instead, Percona recommends using:
–recursion-method=dsn=D=percona,t=dsns
Then, you insert your replica connection info in the dsns column by mentioning slave host details. We can add multiple replicas here by adding more rows.
Insert into dsns values(1, NULL, ‘h=192.168.1.2’);
FYR– https://docs.percona.com/percona-toolkit/pt-table-checksum.html
Note: Make sure the master & slave are connected, then we need to execute this below command to check the difference of the table repl_innodb between master & slave.
pt-table-checksum h=192.168.1.2 –port=3310 –user=replica –password=replica@123 –no-check-binlog-format –nocheck-replication-filters –replicate percona.checksums1 –recursion-method=dsn=D=percona,t=dsns –max-load Threads_running=10 –max-lag=5 –databases test
As you see in the screenshot below, it executed successfully:

| Option | Explanation |
| h=192.168.1.2 | Host (your replica or master — this is where checksum starts). |
| –port=3310 | Port of the MySQL server. |
| Wait up to 5s if replicas are lagging. | Authentication for connecting to MySQL. |
| –no-check-binlog-format | Skips checking if binlog format is ROW (which is required for consistency). |
| –no-check-replication-filters | Skips checking for any replication filters (–replicate-do-db, etc.). |
| –replicate=percona.checksums1 | The table where checksum results will be stored. |
| –recursion-method=dsn=D=percona,t=dsns | Tells pt to read the list of replicas from your custom dsns table in percona DB. |
| –max-load=Threads_running=10 | Don’t run if MySQL load is high (for safety). |
| –max-lag=5 | Wait up to 5s if replicas are lagging behind. |
| –databases=test | Wait up to 5s if replicas are lagging. |
In the master, the query shows the actual data, as seen in the screenshot below:

However, on the slave, the result looks different:

this_cnt / master_cnt: Row count on replica and master, respectively.
If a data mismatch exists between the master and slave, then we need to use pt-table-sync to fix it.
Things to Watch
- Can slow down your server if you run it during peak hours — always try it when traffic is low.
- It won’t work well if replication filters are enabled or if you’re using certain GTID-based setups — double-check your config.
- Make sure your binary log format is set to ROW, or it might not detect differences correctly.
- The table structure must be the same on both the master and the replica, or it will throw errors.
3. pt-table-sync: Repair Data Inconsistencies
pt-table-sync . This resolves data inconsistencies between MySQL servers by generating and optionally executing SQL statements that synchronize mismatched rows. It is typically used after running pt-table-checksum.
Common Use Case
- Automatically synchronize inconsistent rows.
- Generate SQL statements to review and apply manually.
As we discussed earlier, the repl_innodb table has 3 records on the master but only 2 on the slave.

Before running, pt-table-sync we can use the --print option to preview the SQL queries that will be executed to sync the data.
pt-table-sync –replicate=percona.checksums1 –sync-to-master h=192.168.1.2 –port=3311 –database test –verbose –print –user=’replica’ –password=’replica@123′
FYR, you can see the below screenshots:

To implement it, we just need to use –execute instead of —print.
pt-table-sync –replicate=percona.checksums1 –sync-to-master h=192.168.1.2 –port=3311 –database test –verbose –execute –user=’replica’ –password=’replica@123′

After successfully executing the above command, you can see in the screenshot below that the records on the slave have been synced to match the master.

Things to Watch
- DO NOT use blindly in bidirectional or multi-source replication setups.
- Ensure the replica is read-only during sync to avoid conflicts.
- Be careful with –execute; wrong config can overwrite good data.
Conclusion
Consistency is key in replication, and Percona Toolkit gives us powerful tools to monitor, detect, and resolve drift and replication errors. Here’s a quick summary:
| Tool | Purpose |
| pt-slave-restart | Skips and recovers from replication errors automatically. |
| pt-table-checksum | Detects data drift between master and replicas. |
| pt-table-sync | Fixes data drift by syncing rows. |
Use these tools wisely to maintain a healthy and consistent MySQL replication setup.