Learn how to scale your Open edX platform by migrating the MySQL database service to its own remote Ubuntu server running on AWS EC2.

Summary

If you’re looking for a general overview of how to scale the Open edX platform then you should read this first, “Scaling Open edX“. If you’re planning a completely new Open edX installation in which you are not migrating data and that needs to run at scale, then you might want to take a look at the automated tools included in Cookiecutter Open edX Devops as well as this article on Managing Your Open edX Backend With Terraform.

The Open edX platform persists data across four distinct subsystems: MySQL, MongoDB, the Ubuntu file system, and Memcached. To scale the Open edX platform you first must physically separate the executable program code from the data that is managed by these four subsystems. This article explains how to migrate the MySQL data. By scaling, I mean migrating Open edX’s local MySQL service and data to its own independent Ubuntu EC2 instance.

Over the years I’ve experimented with various strategies for migrating Open edX’s MySQL data. This article describes my preferred approach, which involves using AWS EC2 console tools to clone your existing single-server installation of Open edX, and then to pare the clone down to only the MySQL system files and data. This approach is simple, fast and reliable. A key rationale for this approach is that the internal Open edX db schemas are both complex and fragile. Open edX maintains its user data in MySQL across multiple logical databases, each of which are accessed by multiple MySQL system user accounts of varying permissions levels. Some of this is taken care of automatically with MySQL’s native dump and restore commands, but a lot of it is not. But at any rate, working with a clone of your existing Open edX instance renders the point moot.

Regarding horizontal Vs vertical scaling strategies: note that you probably only need to focus on vertical scaling for MySQL on Open edX. Even though MySQL implements a publisher-subscriber model, I’ve found this to be complete overkill for all edX platforms on which I’ve ever worked. You can reasonably begin to scale MySQL on a remote t2.medium EC2 instance as further described below, increasing the size of the server over time on an as-needed basis.

I. Snapshot Your Open edX Server

1. Create an AMI from your Open edX instance

Follow the instructions in the AWS official documentation to create a clone (aka an “Amazon Machine Image, AMI”) of your existing Open edX instance, “Create an AMI from an Amazon EC2 Instance“. Keep in mind that you’re going to initialize your MySQL database from this AMI, so you need to ensure that no learners are interacting with your Open edX instance during the time that are migrating your MySQL data. You should conservatively estimate a 4-hour maintenance window for this activity. I generally perform this work during off hours.

II. Create a Remote MySQL Server

1. Launch a new EC2 instance from the AMI

Launch a new t2.medium EC2 instance from the AMI that you created. I re-use the existing SSH key that i used for the original Open edX EC2 instance. Note that you’d only need a different SSH key if for example, completely different teams manage the Open edX and MySQL environments.

2. Create a new EC2 Security Group for MySQL

You should create a separate EC2 Security Group for your new MySQL EC2 instance, as follows:

This firewall configuration limits remote access of the server to MySQL, regardless of whatever other services might still be installed and running internally on the server. Note that on the first row, SSH, you should try to limit access to your bastion server, if you use one.

3. Take Note of The Internal IP Address That is Assigned

You’ll access your new remote MySQL server via the internal IP address, which is automatically assigned by AWS when you create the new EC2 instance. Take note of this value, which will be titled, “Private IPv4 addresses”

4. Allow Remote MySQL Connections

Initially, you will not be able to connect to mysql remotely via the internal IP address because the native installation of Open edX disables remote MySQL database connections in order to improve security. To allow your Open edX instance to be able to connect to the new MySQL server remotely via port 3306 you’ll need to enable remote connections, which is a 2-step process, both of which are performed on your new MySQL remote server. First, on your new MySQL remote server, you need to modify the following configuration item located in /etc/mysql/mysql.conf.d/mysqld.cnf on or around row 43:

# Comment out the following item
bind-address           = 127.0.0.1
# by preceding 'bind-address' with a '#' character as follows
#bind-address           = 127.0.0.1

Modifications to mysqld.cnf require a restart of themysqld service

sudo service mysql restart

Second, you need to modify the existing MySQL user accounts. The native installation also limits the scope/visibility of all MySQL users to ‘localhost’, and so you’ll need to update the database table mysql.user to expand the allowed hosts from ‘localhosts’ to ‘%’ (which means ‘any’ host).

# Note: beginning with Koa, the root mysql password is preset
# and as of this writing i have not located where this password is set. As a workaround,
# you can alternatively use the command 
#            mysql -u admin -p
# where the password value for 'admin' is stored in /home/ubuntu/my-passwords.yml
# under the yml key tag, "COMMON_MYSQL_ADMIN_PASS" on or around row 22.
mysql -u root -p
mysql> UPDATE mysql.user SET host = '%';
mysql> FLUSH PRIVILEGES;

# NOTE: (February 2021) I had an isolated case where I lost all of the 
# user-level permissions after running this UPDATE query. If this happens 
# then you have two choices: 
# a.) use the mysql command line in your original EC2 instance to generate
#     a script of the user-level permissions, or
# b.) start the entire process from the beginning, and then use a tool like
#     MySQL Workbench (https://www.mysql.com/products/workbench/) to
#     manually update the host setting for each user.
# results should look like the following:
mysql> select host, user from mysql.user;
+------+------------------+
| host | user             |
+------+------------------+
| %    | admin            |
| %    | api001           |
| %    | debian-sys-maint |
| %    | discov001        |
| %    | ecomm001         |
| %    | edxapp001        |
| %    | edxapp_cmsh001   |
| %    | migrate          |
| %    | read_only        |
| %    | reports001       |
| %    | root             |
| %    | rosencrantz      |
| %    | xqueue001        |
+------+------------------+
13 rows in set (0.01 sec)
mysql> exit

More detailed instructions are available here, “MySQL: How to allow remote connection to mysql

At this point you should be able to connect to your new MySQL remote server from the command line of your Open edX instance by logging in to MySQL as follows:

# execute this command from the command lin