Is Scaling/Autoscaling a blessing or curse ?

There might be situations where we might have over provision resources ahead of time while capacity planning or resource planning in the initial stages of any application where we really can’t anticipate how much traffic it might get when it gores live during its peak usage. What we have seen in such cases where people have misjudged the anticipated data growth and over provision the instance in cloud. Then what happens is they are not able to get a maintenance window to downsize the instance or the complexity around the downtime or reclaiming procedure of the cloud resources and they end up paying huge bills.

We came across a similar situation and thought it would be useful to share our learning experience so it might help fellow cloud users for their migration planing etc. We came across one such requirement to upgrade MySQL from 5.7 to 8.0 with of course minimum downtime as usual a critical use case of can’t afford downtime and it was 24/7 busy application. He had a close to 1 TB data on a RDS instance as below.

Inplace AWS RDS 5.7 to 8.0 migration

We could just do Inplace upgrade from mysql 5.7 to 8.0 like on onprem servers in RDS as detailed below:

  1. Verify the instance compatibility using mysql upgrade checker
  2. Fix the incompatibilities at application level and at Database level
  3. Take a snapshot of existing 5.7 RDS instance
  4. Restore the snapshot and bring up the same version instance
  5. Create a new 8.0 parameter group from the new 8.0 instance
    • Compare the 8.0 parameter group with the existing 5.7 prod parameter group.
    • Identify the changes between both the parameter groups and adjust the parameters with values from 5.7 parameter group or do tuning accordingly
  6. Once the new instance becomes available we can modify the instance with the desired version
  7. At last the new instance would go for a reboot and come up with the targeted mysql version i.e 8.0.28

NOTE: The above mentioned procedure is not a recommended procedure as the recommended procedure would be to take a traditional backup and restore method but this procedure also works.

From the Initial planning our draft architecture came something like this but it had some critical blockers.

RDS Instance BEFORE Migration

MySQL Version - 5.7.38

16 Core | 64 GB Memory | 1600 GB Disk 

RDS Instance AFTER Migration

MySQL Version - 8.0.28
16 Core | 64 GB Memory | 1600 GB Disk 

Application level Blockers

The migration could have been done from database stand point however customer had few challenges, one of them was there were many micro services written in different languages connecting to the same instance, hence

  • He wasn’t sure if his application would work end to end after taking a full cutover to RDS 8.0 instance.
  • He wasn’t sure if one cutover would be enough to migrate all the applications from 5.7 instance to 8.0 without having any performance impact

Complicated Database requirements

Because of the given above challenges customer came up with a work around as below which would have made this migration go for weeks.

  • Could we migrate one schema at a time using mysqldump and take cutover per schema/application?
  • They wanted a DBA to work along with app team for whole day till the migration and cutover is complete.
  • This approach needed more than one downtime per application per database each day which was a problem too.

This was neither cost effective for the customer not time efficient from our end.

Proposed Solution

Then with the all above requirements we had to come up with a solution which would be cost effective for our customer and efficient from our end to implement and deliver. So we came up something like this.

Complete database architecture for migration

  1. Steps 1 – to – step 7 would remain the same as planned before but now we have additional action items.
  2. with setting up 8.0 RDS instance for cutover basically DBA task was done
  3. Now customer had to take cutover per application and stop writing to 5.7 instance and start using 8.0 instance.
    • Here came another set of challenge while block writes on 5.7 instance on the database that have been cutover to 8.0 RDS
    • Same database user was used across different application and it was given *.* grants on all schemas.
    • We couldn’t restrict based on user access as it had to do with modifying the grants of the app user which was connecting to database continuously
    • Due to restriction of configuring replication based filter we couldn’t block databases at RDS Replication configuration level as well
    • So in short we cannot stop data flow between 5.7 RDS to 8.0 RDS.
  4. So here we introduced an intermediate EC2 8.0 instance to be built using mysqldump/mydumper
    • NOTE: Have not included details on how we got the EC2 instance between 2 RDS instances as its pretty straight forward replication configuration
  5. While we were working on the timeline changes customer initiated a data cleanup activity from their end so it would help us in traditional backup and restore.
  6. With almost 100-150 GB archived and deleted in the Prod 5.7 instance 900GB database came down to 750 GB
  7. 750GB database further came down to 650 GB after restoring the backup which helped us reclaiming 100 GB more of a disk space.
  8. Finally when 900GB data became 650 GB database we suggested to reduce the instance size from 1600 GB disk to 800 GB Disk which would save him close 100$ per month recurring.
  9. Customer was super excited to get down to reduce the disk size of the RDS instance further and save some dollars 🙂
    • NOTE: However saving dollars was not that easy to achieve 🙂 RDS or any cloud solutions allows you to scale down on CPU and memory but doesn’t allow you can scale down the unused disk
  10. Now we had to add one final step to rebuild the 8.0 RDS instance using mysqldump/mydumper which is the only way to restore the data directly on a 800GB disk instance otherwise snapshot based restores would always create a same instance type.
  11. After the targeted RDS instance resource downgrade the setup was ready in the below architecture for final cutover.

Final Cutover Schedule

Once the setup was ready from our end the rest was set for the application to cutover to MySQL 8.0 RDS instance. we created a small script from our end to simplify our task job of adding replication filter on the EC2 instance. The application team took 4-5 days to do cutover to new 8.0 instance but we didn’t encounter any issues both at database end and application end. After the complete cutover we got rid of the EC2 instance after 1 day and 5.7 RDS instance after couple of days.

#!/bin/bash
ignore_database=$1
if [ $# != 1 ] ; then
        echo -e "Call the script as below : \n"
        echo -e "$0 ignore_database"
        exit 1;
fi
mysqlconnect="/usr/local/percona/bin/mysql --login-path=percona -S /var/lib/percona2/mysql.sock"
fetch_ignore_filter=`$mysqlconnect -e "show replica status\G" | grep Replicate_Wild_Ignore_Table | awk '{print $2}'`
if [ -z ${fetch_ignore_filter} ]; then
	new_fetch_ignore_filter="${ignore_database}.%"
else
	new_fetch_ignore_filter="${fetch_ignore_filter},${ignore_database}.%"
	filters=`echo $new_fetch_ignore_filter | sed "s/,/\',\'/g" | sed "s/$/'/g" | sed "s/^/'/g"`
	echo $filters

fi
#exit
change_filter="CHANGE REPLICATION FILTER Replicate_Wild_Ignore_Table=(${filters})"
$mysqlconnect -e "stop replica;${change_filter};start replica;"
$mysqlconnect -e "show replica status\G

Script Calling procedure

[root@ip-172-31-3-190 ec2-user]# ./add_replication_filter.sh app_prod_archival
******** Replication Filter Before the change ********
  Replicate_Wild_Ignore_Table: mysql.%,airflow.%,cms.%,cmsanalytics.%,app_prod.%
******** Replication Filter After the change ********
  Replicate_Wild_Ignore_Table: mysql.%,airflow.%,cms.%,cmsanalytics.%,app_prod.%,app_prod_archival.%

Conclusion

So as you see it all started with upgrade from RDS MySQL upgrade from 5.7 to 8.0 but we ended up doing multiple activities like Identifying the objects for cleanup and archival, modify the size of the RDS instance Type etc.. The plan did got changed few times but we were able to accomplish more than what we wanted to achieve when we started. I would say taking the decision to migrate itself is the most important step in any migration. We recommend to migrate to mysql 8.0 if not then at least to latest mysql 5.7. Instances in 5.6 , 5.5 and older are really old to be on.

We thought of sharing our case study may be it can help you decide to migrate or give an idea on how migration can be done if you have a similar use case. Hope you find this information useful. Keep reading and follow us for more exclusive content for open source databases. Feel free to drop us a note if you need any more specifics of the migration we would be happy to discuss and share our learnings.

%d bloggers like this: