What is MySQL replication ?
The definition of replication cannot be stated as better as it has been stated in the mysql doc as below.
Replication is a technology through which we can maintain a copy data from one source of MySQL server to one or more destination MySQL servers.
The Source is always referred as master in MySQL and the replicating servers are referred as slaves.This technology has been around since MySQL version 3.23 (the oldest version that I remember). We didn’t have bi-directional replication by then and so many major releases happened since then like 4.0,4.1,5.0,5.1,5.1,5.6,5.7 and now 8.0. Replication in MySQL is very simple and very straight forward to set up.
MySQL Replication is completely dependent on events happening on the master and how they have been handled in the binlogs. The sequence of replication defines the type of replication
- Asynchronous Replication
- Semi-Synchronous Replication
- Group Replication
- Synchronous Replication
What is Replication format ?
Replication format is mostly referred as the way the events gets stored in binlogs in master binary log which also gets apply to the slave and there could be replication issues which could actually break SQL_THREAD
a) Statement based Replication
b) Row based Replication
What do you need for replication ?
Well all you need is two servers and it is recommended to be completely identical in terms of hardware configuration in case if you expect any of them or both of them to be capable enough to handle application traffic at some point of time. I have these below servers which are completely identical as I mentioned with 4 CPU cores and 8GB RAM and I have mysql 5.7.17 installed in these servers.
I have included settings for enbling GTID as there are some good benefits for maintianing consistency within the mysql replication cluster and choosing the binlog and its position to start replication becomes really very easy with GTID enabled environment. But I am not giving any more infomation about its limitation and issues. You can just give it a try on your dev/test environments to begin with GTID.
10.1.1.1 – Master Database
10.1.1.2 – Slave Database
Master Prerequisite
- Both Master and Slave should have server-id enabled and they should be unique.
- Technically Binlog should be enabled mainly on Master but it should be enabled in slave as well for failover purposes and in case if you plan to have some sort of chain replication.
- A replication user must be created in master which the slave can use to connect to master to fetch the transactions from bin-log
- If you want to add a slave to an existsing master host then you need to perform one additional step in master and i.e taking the backups
Master /etc/my.cnf
server-id = 10111
log-bin = /var/log/mysql/mysql-bin
expire_logs_days = 10
log-bin-index = /var/log/mysql/bin-log.index
binlog_format = MIXEDgroup_replication_bootstrap_group=off
log_slave_updates = 1
master_info_repository = “TABLE”
sync_master_info = 1
Enable GTID if you prefer to have gtid. I generally have it in our environment as I would recommend it to be enabled if not already.
enforce_gtid_consistency=1
gtid_mode=ON
Replication User Creation
CREATE USER ‘replication_user’@’%’ IDENTIFIED BY ‘replication_password’;
GRANT REPLICATION SLAVE ON *.* TO
‘replication_user’@’%’
FLUSH PRIVILEGES;or
GRANT REPLICATION SLAVE ON *.* TO
‘replication_user’@’%’ IDENTIFIED BY
‘replication_password’;
FLUSH PRIVILEGES;
Backup of Master Instance
MysqlDump
mysqldump -u username -p password –all-databases –single-transaction –master-data=2 –triggers –routines –events > master.dump
Xtrabackup
innobackupex –user=username –password=password –defaults-file=/etc/my.cnf /backup/location –no-timestamp > innobackupex.log 2>&1
Slave Prerequisite
- Restore the backup taken from the master server in the above step
- As mentioned above Slave should have server-id enabled and should be different from what is assigned to the master in above section.
- Enable log_bin and log_slave_updates if needed replication.
- Enable relay log bin in slaves to store the bin log events from the master.
Backup Restore
MysqlDump Restore
copy the backup to the slave server and run the backup import locally
scp /backup/location/master.dump
OS_user@10.1.1.2:/restore/location
mysql -u username -p password
< /restore/location/master.dump Xtrabackup copy backups scp /backup/location/master.dump OS_user@10.1.1.2:/restore/location Prepare Backups innobackupex –use-memory=4G –apply-log / restore/location Restore Backups innobackupex –copy-back /restore/location
Slave /etc/my.cnf
server-id = 10112
relay-log = /var/log/mysql/mysql-relay-bin
relay-log-index = /var/log/mysql/mysql-relay-bin.index
relay_log_info_file = /var/log/mysql/relay-log.info
relay_log_info_repository = “TABLE”
sync_relay_info = 1
If this server is going to serve as failover slave then consider adding below parameters in the /etc/my.cnf file
log_slave_updates = 1
log-bin = /var/log/mysql/mysql-bin
log-bin-index = /var/log/mysql/bin-log.index
binlog_format = MIXED
Enable GTID if you have enabled it in master
enforce_gtid_consistency=1
gtid_mode=ON
NOTE :
Make sure /var/log/mysql/ directory is available and writeable by mysql I have seen is most of the cases when mysql doesn’t come up after making above changes its just the permission issues which we end up troubleshooting so its better to verify that before we proceed.
Now that you have made the necessary configuration change if this is a fresh setup perform a mysql restart in both master and slave because server-id is a system variable which cannot be enabled dynamically. So make sure we have a clean mysql restart.
At this stage both servers 10.1.1.1 and 10.1.1.2 can be master and slave respectively if we want them to be.
This is the most important step in the replication setup is letting the slave know that it is going to be a slave server.
Letting slave know
For regurlar asynchronous and semi-synchronous replication change master is as below
CHANGE MASTER TO
MASTER_HOST=’10.1.1.1′,
MASTER_PORT=3306,
MASTER_USER=’replication_user’,
MASTER_PASSWORD=’replication_password’,
MASTER_LOG_FILE=’mysql-bin.000001′,
MASTER_LOG_POS= 4;
For GTID Enabled Environment change master is as below
CHANGE MASTER TO
MASTER_HOST=’10.1.1.1′,
MASTER_PORT=3306,
MASTER_USER=’replication_user’,
MASTER_PASSWORD=’replication_password’,
MASTER_AUTO_POSITION=1;
What happens after change master is issued ?
- Server ( 10.1.1.2 ) gets to know that is going to a slave going forward for the master host (10.1.1.1 ) mentioned in the change master.
- What credentials to use to connect to that master server when it want to
- What position to start with to start replication
- There are lot many options as mentioned in mysql doc like SQL_delay/SSL settings/channel info etc.
Kick Start Replication ( Slave Server )
Start Slave;
There are few series of events that happens after replication gets started.
- Slave tries to make a connection to master server through IO thread
- Failure and Successful scenarios for IO thread
IO Thread failure
example :
Slave_IO_State : Reconnecting after a failed master event read Slave_IO_Running : Connecting Last_IO_Error : error reconnecting to master 'replication_user@localhost:3306' - retry-time: 60 retries: 1IO Thread Successful
Slave_IO_State : Waiting for master to send event Slave_IO_Running : Yes
- The master will also have something to monitor weather slave was successfully able to connect to the master.
Processlist in Master Before replication is started in slave
mysql> show processlist; +----+------+-----------+------+---------+------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------+------+---------+------+----------+------------------+ | 3 | root | localhost | NULL | Query | 0 | starting | show processlist | +----+------+-----------+------+---------+------+----------+------------------+ 1 row in set (0.00 sec)
Processlist in Master After replication gets started in slave successfully
mysql> show processlist; +----+-----------+--------------------------------------+------+------------------+------+---------------------------------------------------------------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+-----------+--------------------------------------+------+------------------+------+---------------------------------------------------------------+------------------+ | 4 | repl_user | sjc04s2glrdb03.teslamotors.com:47487 | NULL | Binlog Dump GTID | 114 | Master has sent all binlog to slave; waiting for more updates | NULL | | 5 | root | localhost | NULL | Query | 0 | starting | show processlist | +----+-----------+--------------------------------------+------+------------------+------+---------------------------------------------------------------+------------------+ 2 rows in set (0.00 sec)
- Once the IO thread connects to master successfully the SQL thread starts applying the content it recieves from master and it waits for more updates
mysql> show processlist; +-----+-------------+-----------+------+---------+--------+--------------------------------------------------------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +-----+-------------+-----------+------+---------+--------+--------------------------------------------------------+------------------+ | 734 | system user | | NULL | Connect | 236145 | Waiting for master to send event | NULL | | 735 | system user | | NULL | Connect | 44 | Slave has read all relay log; waiting for more updates | NULL | +-----+-------------+-----------+------+---------+--------+--------------------------------------------------------+------------------+
- There are so many errors why SQL thread would fail which I would eloborate on the future posts in continuation to Replication.