Learn how to scale your Open edX platform by migrating the Memcached service to its own remote Ubuntu server running on AWS EC2.

Summary

If you’re looking for a general overview of how to scale the Open edX platform, you should read “Scaling Open edX” first.

The Open edX platform persists data across four distinct subsystems: MySQL, MongoDB, the Ubuntu file system, and Memcached. To scale the Open edX platform you first must physically separate the executable program code from the data that is managed by these four subsystems. This article explains how to migrate the Memcached service. By scaling, I mean migrating Open edX’s local Memcached service to its own independent Ubuntu EC2 instance.

This article describes my preferred approach, which is to use the AWS EC2 console to create a new Ubuntu EC2 instance running the same version of Ubuntu as your current Open edX platform. As of this writing, Open edX Koa runs on Ubuntu 20.04 LTS. I install the latest version of memcached, regardless of the version that the original Open edX server is using. When scaling Memcached I do not migrate data.

Note that you will only need to focus on vertical scaling for Memcached on Open edX. You can safely start with a t2.small or t2.medium EC2 instance size, which will probably serve your needs for the lifetime of your Open edX platform.

Warning: Do not attempt this procedure unless you consider yourself proficient in multiple disciplines, including system administration, the Ubuntu Linux command line, TCP/IP networking and network security, AWS EC2 services and networking tools, and basic Django configuration concepts.

I. Create a New EC2 Instance

1. Launch a new EC2 instance from the AMI

Launch a new t2.medium EC2 instance from the AMI that you created. I re-use the existing SSH key that I used for the original Open edX EC2 instance. Note that you’d only need a different SSH key if, for example, completely different teams manage the Open edX and Memcached environments.

2. Create a new EC2 Security Group for Memcached

You should create a separate EC2 Security Group for your new Memcached EC2 instance with two inbound rules: SSH (TCP port 22) and a custom TCP rule for Memcached (TCP port 11211).

This firewall configuration limits remote access to the server to SSH and Memcached, regardless of whatever other services might still be installed and running internally on the server. Note that for the first rule, SSH, you should try to limit access to your bastion server, if you use one.
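
The same two rules could also be added from the command line. The following is only a sketch under stated assumptions: it assumes the AWS CLI is installed and configured, that the security group already exists, and that Memcached traffic should be allowed only from the security group of your Open edX server (my assumption, not something this article prescribes). The function names and all IDs passed in are hypothetical placeholders. By default it only prints the commands so you can review them first.

```shell
# Sketch only: group IDs and the CIDR are hypothetical placeholders.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1: security group ID of the new Memcached instance
# $2: security group ID of the Open edX application server (assumption)
# $3: CIDR allowed to SSH in, e.g. your bastion's private IP as a /32
allow_memcached_ingress() {
  cache_sg="$1"
  app_sg="$2"
  ssh_cidr="$3"
  # SSH (TCP 22), limited to the bastion
  run aws ec2 authorize-security-group-ingress \
    --group-id "$cache_sg" --protocol tcp --port 22 --cidr "$ssh_cidr"
  # Memcached (TCP 11211), limited to the application security group
  run aws ec2 authorize-security-group-ingress \
    --group-id "$cache_sg" --protocol tcp --port 11211 --source-group "$app_sg"
}
```

Calling allow_memcached_ingress with your own three values prints the two commands; setting DRY_RUN=0 in front of the call would actually run them.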

3. Take Note of The Internal IP Address That is Assigned

You’ll access your new remote Memcached server via the internal IP address, which is automatically assigned by AWS when you create the new EC2 instance. Take note of this value, which is labeled “Private IPv4 addresses” in the EC2 console.

4. Install memcached service

# First, make sure that your local package index is updated
sudo apt update

# Install the official memcached package
sudo apt install memcached

# libmemcached-tools is a package that provides several command-line tools (such as memcstat) to work with your Memcached server.
sudo apt install libmemcached-tools

5. Configure memcached to accept remote connections

Edit the file /etc/memcached.conf on or around line 35 as follows:

# Specify which IP address to listen on. The default is to listen on all IP addresses
# This parameter is one of the only security measures that memcached has, so make sure
# it's listening on a firewalled interface.
#
# change this from '-l 127.0.0.1' to the following
-l 0.0.0.0
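
If you prefer to script this edit rather than make it by hand, something like the following sketch could work. The function name set_memcached_listen is hypothetical, and it assumes the config file contains a single uncommented line beginning with '-l', as the stock Ubuntu file does. The original file is kept with a .bak suffix.

```shell
# Rewrite the '-l' (listen address) line of a memcached config file in place,
# keeping a backup copy of the original with a .bak suffix.
# $1: path to the config file (e.g. /etc/memcached.conf)
# $2: address to listen on (e.g. 0.0.0.0)
set_memcached_listen() {
  conf_file="$1"
  listen_addr="$2"
  # Only lines that START with '-l' are touched; commented lines are left alone.
  sed -i.bak -E "s/^-l[[:space:]]+.*/-l ${listen_addr}/" "$conf_file"
}
```

From a root shell on the Memcached server this would be called as: set_memcached_listen /etc/memcached.conf 0.0.0.0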

Afterwards you should reboot your new remote Memcached server:

sudo reboot

You can read more about the installation procedure here: “How To Install and Secure Memcached on Ubuntu 18.04”.

II. Reconfigure Open edX to Connect to Your Remote Memcached Server

To identify which yml files require modification you can use the Linux command ‘grep’ to search for the TCP port number corresponding to the service you are scaling. For example, Memcached listens on port 11211 by default, so we can execute the following command to identify all Open edX configuration files that contain Memcached configuration parameters:

sudo grep -r '11211' /edx/etc/*.yml

We can see that in total there are four subsystems containing Memcached configuration parameters, and furthermore that two of these yml files contain multiple references to Memcached, as follows:

/edx/etc/ecommerce.yml:        - localhost:11211
/edx/etc/insights.yml:        - 127.0.0.1:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/lms.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
/edx/etc/studio.yml:        - localhost:11211
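
For a more compact view of the same information, grep can count the matching lines per file instead of printing each one. This is a small sketch; the function name count_memcached_refs is hypothetical, and the default directory is the /edx/etc path used above.

```shell
# Print "<file>:<count>" for every yml file in a directory, where <count>
# is the number of lines mentioning the Memcached port 11211.
# $1: directory to scan (defaults to /edx/etc)
count_memcached_refs() {
  grep -c '11211' "${1:-/edx/etc}"/*.yml
}
```

For the listing above this should report 1 for ecommerce.yml and insights.yml and 7 each for lms.yml and studio.yml. Any other yml file in the directory shows up with a count of 0. Run it from a root shell if the files are not readable by your user.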

For example, the first occurrence of Memcached configuration for lms.yml looks similar to the following:

CACHES:
    celery:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.celery
        LOCATION:
        - localhost:11211
        TIMEOUT: '7200'
    configuration:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.roverbyopenstax
        LOCATION:
        - localhost:11211
    course_structure_cache:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.course_structure
        LOCATION:
        - localhost:11211
        TIMEOUT: '7200'
    default:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.default
        LOCATION:
        - localhost:11211
        VERSION: '1'
    general:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.general
        LOCATION:
        - localhost:11211
    mongo_metadata_inheritance:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.mongo_metadata_inheritance
        LOCATION:
        - localhost:11211
        TIMEOUT: 300
    staticfiles:
        BACKEND: django.core.cache.backends.memcached.MemcachedCache
        KEY_FUNCTION: util.memcache.safe_key
        KEY_PREFIX: dev.roverbyopenstax_general
        LOCATION:
        - localhost:11211

To reconfigure the LMS to use a remote Memcached server, replace ‘localhost’ in each ‘localhost:11211’ entry with the internal IP address of the newly-created remote Memcached server, following the format ‘172.x.x.x:11211’. Make the same change to every occurrence in the other yml files that grep identified. Note that we MUST address remote servers using internal IP addresses because otherwise our network traffic would leave and re-enter our Virtual Private Cloud, which would not only be inefficient from a performance point of view but also insecure.

Do a full reboot of your Open edX Ubuntu server when you finish.
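
With this many occurrences, you may prefer to script the replacement. The following is a minimal sketch, not a tested migration tool: both function names are hypothetical, it assumes every Memcached entry is written as either localhost:11211 or 127.0.0.1:11211 (as in the grep output above), and it refuses any address outside the standard private IPv4 ranges as a guard against accidentally using a public IP. Each file is backed up with a .bak suffix before it is changed.

```shell
# Prefix check only: succeeds for the RFC 1918 private ranges
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;
    *) return 1 ;;
  esac
}

# $1: private IP of the remote Memcached server
# remaining arguments: yml files to edit in place (originals kept as <file>.bak)
point_to_remote_memcached() {
  remote_ip="$1"
  shift
  if ! is_private_ipv4 "$remote_ip"; then
    echo "refusing: $remote_ip is not a private IPv4 address" >&2
    return 1
  fi
  for yml_file in "$@"; do
    sed -i.bak -E "s/(localhost|127\.0\.0\.1):11211/${remote_ip}:11211/g" "$yml_file"
  done
}
```

From a root shell on the Open edX server, a call might look like: point_to_remote_memcached 172.x.x.x /edx/etc/lms.yml /edx/etc/studio.yml /edx/etc/ecommerce.yml /edx/etc/insights.yml (substituting your real internal IP address). Review the changed files against their .bak copies before rebooting.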

sudo reboot

Ok, that’s it for Memcached.

III. Test Your Changes

Testing your platform is easier than it might seem. If you can log in to both the LMS and CMS as any valid user then your migration was successful. You can monitor activity on your new remote Memcached server with the following commands:

# check memcached status locally (run on the Memcached server itself)
echo stats | nc 127.0.0.1 11211
memcstat --servers=127.0.0.1:11211

# check memcached status remotely (run on the Open edX server)
echo stats | nc 172.x.x.x 11211
memcstat --servers=172.x.x.x:11211
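
The stats output is fairly long, and the two counters I find most telling are get_hits and get_misses. As a rough sketch, a small helper (the name memcached_hit_ratio is hypothetical) can turn them into a hit ratio. It reads the raw stats text on standard input and strips the carriage returns that the Memcached text protocol appends to each line.

```shell
# Read memcached 'stats' output on stdin and print the cache hit ratio,
# or "n/a" if no get requests have been recorded yet.
memcached_hit_ratio() {
  tr -d '\r' | awk '
    $1 == "STAT" && $2 == "get_hits"   { hits = $3 }
    $1 == "STAT" && $2 == "get_misses" { misses = $3 }
    END {
      total = hits + misses
      if (total == 0) print "n/a"
      else printf "%.1f%%\n", 100 * hits / total
    }'
}
```

It can be fed from either of the nc commands above, for example: echo stats | nc 172.x.x.x 11211 | memcached_hit_ratio. A ratio that climbs after users start logging in is a reasonable sign that Open edX is really using the remote cache.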

You can restart the memcached service with this command:

sudo service memcached restart

You can read more about monitoring Memcached in this blog article written by Aurelien Navarre: “How to monitor memcached”.