MySQL slave to master switch is an operation required in a few situations. Sometimes master dies and you need to make DB available as soon as possible. In this case if the DB is reasonably big then restoring from backup might take to long. Also keep present that slave has a current copy of data, you might have new data after the last backup that you can not afford to lose. Some other common situations where you might require such a procedure is when you want to upgrade your MySQL servers. For instance from Percona 5.6 to Percona 5.7. Also this will help to avoid long downtime.
Notice: if you have classic master slave replication some downtime is unavoidable, there are different setups that can be done to achieve having a cluster that does not stop working if you take the master offline, but this is NOT the objective of this post.
this post is some what related to another recent post; Mysql DB server OS change / replacement or distro upgrade; study case
Ok, let’s stop the chitchat and let’s get some stuff done.
The procedure preparations
1- If you are running a website Enable Maintenance page. Can be achieved easily by replacing an index with a basic html page.
2- Check replication MySQL master to MySQL slave is working and there is no lag:
mysql> show slave status\G;
If all is good then you might proceed as this basically tells that you have a consistent MySQL slave DB or not. If there is a problem you will find delay or errors, this is a correct output with no errors or delays:
mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.1.1.1 Master_User: replrs Master_Port: 5433 Connect_Retry: 60 Master_Log_File: mysql-bin.006558 Read_Master_Log_Pos: 158031192 Relay_Log_File: relay-bin.018258 Relay_Log_Pos: 66237315 Relay_Master_Log_File: mysql-bin.006558 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 158031192 Relay_Log_Space: 158031591 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 50 1 row in set (0.00 sec) ERROR: No query specified mysql>
If there is some problem you might need to solve that first.
Now let’s make a parenthesis let’s assume the following server details:
10.1.1.1 machine-master-db, this DB is writable MySQL master.
10.1.1.2 machine-slave-db, this db is readonly MySQL slave.
After the operation 10.1.1.2 machine-slave-db will be the master.
3- Enable firewall rules on both servers so other servers can not write to DB during the operation. For instance like the following on master:
/sbin/iptables -I INPUT 1 -p tcp --destination-port 3306 -j DROP /sbin/iptables -I INPUT 1 -p tcp -s 10.1.1.2 --destination-port 3306 -j ACCEPT
This enables the Current master to be unreachable by everyone but the current slave. this same operation
3.1- Make sure the MySQL connectivity is still working after operation:
[root@machine-master-db ~]# telnet 10.1.1.2 3306 Trying 10.1.1.2... Connected to 10.1.1.2. Escape character is '^]'. S xW(b6%TZ��F/XPf3g/Q,'}mysql_native_password^] telnet> quit Connection closed. [root@machine-slave-db ~]#
You should then repeat the same operation from MySQL slave to master and vise-versa. Then make sure the APP or website can not connect to DB.
4- (Optional) You need to change the APP to point to new DB (MySQL slave). the way to do this varies depending on your setup. E.g. if you are using DNS or host file to point your app to DB server then you need to make those changes in this step. Just to say you use hosts file to define where the DB is then you need to; Change host file on web servers and `service dns-cache restart` or/and maybe `service nscd restart` (can be done even later if wanted):
pfadmin@machine-web2:~$ cat /etc/hosts | grep db 10.1.1.2 machine-master-db #10.1.1.1 machine-master-db #Old master pfadmin@machine-web2:~$ sudo /etc/init.d/dns-clean restart
MySQL slave master; do the switch!
5- Make master readonly, this step is in case you still have a master working. if you only have MySQL slave working you can skip this step and the next:
mysql> set global read_only=ON; Query OK, 0 rows affected (0.00 sec) mysql> show variables like '%read_only%'; +------------------+-------+ | Variable_name | Value | +------------------+-------+ | innodb_read_only | OFF | | read_only | OFF | | super_read_only | OFF | | tx_read_only | OFF | +------------------+-------+ 4 rows in set (0.00 sec) mysql>
6- Flush everything to DB to get a clean database with a new starting point of bin-logs:
mysql> flush tables; flush logs;
7- make sure to record current master log position (DB1 at this point):
mysql> show master status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql-bin.006863 | 4500| | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) mysql>
8- This is time to stop the slave:
slave mysql> stop slave;
9- Make slave writable:
slave mysql> set global read_only=OFF; Query OK, 0 rows affected (0.00 sec) mysql> show variables like '%read_only%'; +------------------+-------+ | Variable_name | Value | +------------------+-------+ | innodb_read_only | OFF | | read_only | OFF | | super_read_only | OFF | | tx_read_only | OFF | +------------------+-------+ 4 rows in set (0.00 sec) mysql>
10- We are done with works on slave, at this point we can open firewall and let the APP use the slave.
11- setup old master as slave:
mysql>CHANGE MASTER TO MASTER_HOST='10.1.1.2', MASTER_USER='repl', MASTER_PASSWORD='p4ssword', MASTER_LOG_FILE='MYSQL_MASTER_LOG_FILE', MASTER_LOG_POS=MYSQL_MASTER_LOG_POSITION;
12- Check replication:
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.1.1.1 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000014 Read_Master_Log_Pos: 12255788 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 283 Relay_Master_Log_File: mysql-bin.000014 Slave_IO_Running: Yes Slave_SQL_Running: Yes (...) Seconds_Behind_Master: 0 (...)
The replication might take a while to catchup, as long as the numbers are going down you have nothing to worry. You might also need to change some MySQL configuration if you notice the replication lag does not go down or if there are errors. But there are many variables here and this goes out of this posts objective.
All done, lets make sure it all works
13- Test that everything works by performing a simple write on new master (DB2 in this case) and check if it replicates to new slave (DB1) e.g.:
mysql> create database test2;
14- restore firewall, back to production configs. if you do not have any system firewall normally something like:
iptables -F
14.1 Then you can run tests on the MySQL slave and master DB and check there is access from web servers or app.
15- Remove maintenance page if applies to you and DONE with the MySQL slave master switch.