The good news is that the user data is stored in MySQL, a well-known database management system. The bad news is that there’s more to restoring your Open edX platform than simply restoring the databases. Get your Open edX instance back online in less than an hour with this how-to guide.
Summary
There are a few details to successfully restoring an Open edX instance that are definitely worth understanding well before the need arises. If you haven’t done so already, make sure to take a look at my blog post, “Complete Backup Solution for Open edX”. There are multiple kinds of persistent data in an Open edX platform, and these naturally rely on different kinds of technology and storage. Whether you’re migrating to a new server or restoring an environment that suffered a catastrophic failure, you’ll need to consider all of these:
Data | Location |
---|---|
User Data | MySQL |
Course Data | MongoDB |
Asynchronous Task Data | RabbitMQ |
Code Customizations | /edx/app/edxapp/edx-platform/ |
Configuration | /edx/app/edxapp/ |
Custom Theme | * wherever you’ve elected to host this |
Passwords | * usually /home/ubuntu/my-passwords.yml |
Course Videos | * hopefully youtube.com |
This article focuses on the two truly critical data sources, MySQL and MongoDB, plus a couple of common pitfalls that can prevent your restored platform from running properly.
First, the Django framework relies on a process called “Database Migrations” to ensure that the physical database schema in MySQL is consistent with the Django objects described in the source code. That is, Django programmers do not directly modify the MySQL database schema, but instead rely on Database Migrations to handle this for them. When you restore an Open edX MySQL database, you have to consider the possibility that the physical schema of the backup differs from that of the Django codebase.
Second, Open edX relies extensively on a subsystem named RabbitMQ to asynchronously manage tasks. By “task” I’m referring to virtually every command button that a learner clicks while interacting with course data. RabbitMQ is invoked each time they provide a response to a problem, each time they interact with the discussion forum, provide a comment, record data in Notes/Annotations, request a password reset, and so on. These tasks are queued and then run in a first-in-first-out queue based on available server and network resources. Depending on the circumstances surrounding your need to restore or migrate your Open edX instance, there might be many hundreds or thousands (or millions) of pending RabbitMQ tasks in queue. If that’s the case then it would be prudent on your part to at least attempt to migrate these tasks as well. Furthermore, there are some common problems with migrating and/or restoring RabbitMQ configuration settings that we’ll look at in more detail below.
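Before deciding whether pending tasks are worth migrating, it can help to count them. The following is a minimal sketch, assuming that `rabbitmqctl list_queues name messages` prints one queue name and one message count per row. The helper name `count_pending_messages` is my own and is not part of RabbitMQ or Open edX.

```bash
# Sum the pending message counts reported by rabbitmqctl.
# Reads the output of `rabbitmqctl list_queues name messages` on stdin.
# Assumption: each data row is "<queue name><whitespace><count>".
# Banner and header rows are skipped because their last field is not a number.
count_pending_messages() {
    awk 'NF >= 2 && $NF ~ /^[0-9]+$/ { total += $NF } END { print total + 0 }'
}

# Example (run on the source server):
#   sudo rabbitmqctl list_queues name messages | count_pending_messages
```

A result of zero suggests there is nothing in the queues to carry over.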
Restore Procedure
* Download your backup files to the Ubuntu local file system
If your backup files are stored remotely then you’ll need to download a copy of the MySQL and MongoDB backup sets to your local Ubuntu file system. If you followed my instructions from my article, “Open edX Complete Backup Solution” then you can follow these instructions to download your compressed backup files from your AWS S3 bucket to your Ubuntu local file system.
Make your AWS S3 backup file public, then copy the file Link (URL).
Download and uncompress your backup set
Once your file has been made public you can download a copy of it to your local file system using the Linux wget command.
[bash]# Download the backup tarball to the current directory.
wget https://s3.amazonaws.com/[YOUR-BUCKET-NAME]/backups/openedx-data-20180919T175920.tgz
# Uncompress the tarball into the current directory
tar xvzf openedx-data-20180919T175920.tgz
# Where
# x: This option tells tar to extract the files.
# v: The “v” stands for “verbose.” This option will list all of the files one by one in the archive.
# z: The z option is very important and tells the tar command to uncompress the file (gzip).
# f: This option tells tar that you are going to give it a file name to work with.[/bash]
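If the download was interrupted, the tarball may be truncated. As a hedged sketch, you could test the archive before extracting it. The helper name `verify_tarball` is hypothetical; it relies only on the standard `tar tzf` listing mode.

```bash
# Return success only if the file is a readable gzip-compressed tar archive.
# `tar tzf` lists the archive contents without extracting anything.
verify_tarball() {
    tar tzf "$1" > /dev/null 2>&1
}

# Example:
#   verify_tarball openedx-data-20180919T175920.tgz \
#       && tar xvzf openedx-data-20180919T175920.tgz
```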
1. Restore MySQL Databases
Nearly all of your users’ data is stored in MySQL, including usernames and passwords, course content responses, notes & annotations data, their profile and so on. If you followed my guidelines on Creating a Complete Backup Solution for Open edX then your MySQL dump contains all of the Open edX databases and none of the MySQL system databases, which is exactly what you want. Restoring from your MySQL dump will therefore be as simple as the following:
[bash]mysql -u root -p < db_backup.dump[/bash]
That’s it. You do not need to restart MySQL, nor flush any caches or buffers, nor do any other administrative tasks. MySQL is remarkably resilient in this respect. However, it is really important that you perform Database Migrations in the next section.
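As an optional sanity check before running the restore, you may want to confirm which databases the dump file will touch. This is only a sketch: it assumes the dump was created with mysqldump’s `--databases` (or `--all-databases`) option, which writes a "USE" line for each database. The helper name `list_dump_databases` is my own.

```bash
# Print the name of every database that a mysqldump file switches into.
# Assumption: the dump contains one "USE `name`;" line per database,
# as mysqldump writes when run with --databases or --all-databases.
list_dump_databases() {
    sed -n 's/^USE `\(.*\)`;$/\1/p' "$1"
}

# Example:
#   list_dump_databases db_backup.dump
```

If the list includes MySQL system databases such as `mysql`, the dump was probably not created the way this article expects.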
2. Run Database Migrations
This process is simple to run and usually only takes a minute or so to complete. Running this procedure more than once will not harm your database. makemigrations compares the Django models in your Open edX codebase against the existing migration files and generates any migrations that are missing; migrate then applies every migration that has not yet been applied to the database, adding any missing tables, fields and relationships. Django records each applied migration in the database, which is how it keeps track of what it’s done.
[bash]sudo -H -u edxapp -s bash
cd ~
source /edx/app/edxapp/edxapp_env
python /edx/app/edxapp/edx-platform/manage.py lms makemigrations --settings=aws
python /edx/app/edxapp/edx-platform/manage.py lms migrate --settings=aws
python /edx/app/edxapp/edx-platform/manage.py cms makemigrations --settings=aws
python /edx/app/edxapp/edx-platform/manage.py cms migrate --settings=aws[/bash]
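To see whether anything is still pending, Django’s `showmigrations` command lists every migration and marks applied ones with `[X]` and unapplied ones with `[ ]`. The following sketch counts the unapplied entries. The helper name `count_unapplied_migrations` is hypothetical.

```bash
# Count unapplied migrations in `showmigrations` output read from stdin.
# Django marks applied migrations with "[X]" and unapplied ones with "[ ]".
# `|| true` keeps the exit status at zero when grep finds no matches.
count_unapplied_migrations() {
    grep -c '^ *\[ \]' || true
}

# Example (as the edxapp user, after sourcing edxapp_env):
#   python /edx/app/edxapp/edx-platform/manage.py lms showmigrations --settings=aws \
#       | count_unapplied_migrations
```

A count of zero for both lms and cms indicates that the restored schema matches the codebase.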
3. Restore MongoDB
Mongo is surprisingly simple to restore. Here’s the basic structure of the command:
[bash]mongorestore ~/backups/path-to-mongodb-backup-folder/ --username admin --password "STRONG PASSWORD FROM my-passwords.yml in your home folder"[/bash]
You’ll find additional information in the Official MongoDB Documentation.
* If mongo reports this error, “BadValue Invalid or no user locale set. Please ensure LANG and/or LC_* environment variables are set correctly” then issue this command in your terminal window:
[bash]export LC_ALL=C[/bash]
* If mongo reports a long list of errors like this, “- E11000 duplicate key error collection: edxapp.modulestore.structures index: _id_ dup key: { : ObjectId(‘5bce68ccb9eb65205cf444d6’) }” then you should delete the existing Mongo edxapp database and then attempt the restore procedure again:
[bash]mongo --port 27017 -u "admin" -p "STRONG PASSWORD FROM my-passwords.yml in your home folder" --authenticationDatabase "admin"
use edxapp;
db.dropDatabase();
exit[/bash]
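An alternative to dropping the database by hand is mongorestore’s `--drop` option, which drops each collection from the target database immediately before restoring it. The following is a sketch only, not a tested part of my own procedure. The wrapper name `restore_mongo` and the `DRY_RUN` switch are hypothetical, and I am assuming the admin user authenticates against the `admin` database. Because `--password` is omitted, mongorestore prompts for it.

```bash
# Restore a mongodump folder, dropping each collection before it is restored.
# Set DRY_RUN=1 to print the command instead of running it.
# Assumption: the "admin" user authenticates against the "admin" database.
restore_mongo() {
    backup_dir=$1
    if [ ! -d "$backup_dir" ]; then
        echo "backup folder not found: $backup_dir" >&2
        return 1
    fi
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} mongorestore --drop \
        --username admin \
        --authenticationDatabase admin "$backup_dir"
}

# Example:
#   DRY_RUN=1 restore_mongo ~/backups/path-to-mongodb-backup-folder/
```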
4. Restart Platform
Given that you just restored all of your MongoDB course data, plus multiple MySQL databases and you potentially made schema modifications via Database Migrations, restarting the Open edX platform is a prudent idea. For most administrative tasks you only need to restart the LMS and CMS but in this case it’s a good idea to restart everything.
[bash]# Option I: reboot the server
sudo reboot
#Option II: restart the Open edX services individually
sudo /edx/bin/supervisorctl restart lms
sudo /edx/bin/supervisorctl restart cms
sudo /edx/bin/supervisorctl restart edxapp_worker:
sudo /edx/bin/supervisorctl restart analytics_api
sudo /edx/bin/supervisorctl restart certs
sudo /edx/bin/supervisorctl restart discovery
sudo /edx/bin/supervisorctl restart ecommerce
sudo /edx/bin/supervisorctl restart ecomworker
sudo /edx/bin/supervisorctl restart forum
sudo /edx/bin/supervisorctl restart insights
sudo /edx/bin/supervisorctl restart notifier-celery-workers
sudo /edx/bin/supervisorctl restart notifier-scheduler
sudo /edx/bin/supervisorctl restart xqueue
sudo /edx/bin/supervisorctl restart xqueue_consumer[/bash]
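After restarting, `supervisorctl status` prints one row per service with its state in the second column. As a small sketch, the filter below prints only the services that are not RUNNING. The helper name `list_unhealthy_services` is my own.

```bash
# Print the name of every supervisor program that is not in the RUNNING state.
# Reads `supervisorctl status` output on stdin; the state is the second column.
list_unhealthy_services() {
    awk 'NF >= 2 && $2 != "RUNNING" { print $1 }'
}

# Example:
#   sudo /edx/bin/supervisorctl status | list_unhealthy_services
```

No output means every service that supervisor manages reported RUNNING.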
5. Perform Diagnostics
Hopefully your Open edX instance is running now. If so, then you should next review the application logs for both the LMS and CMS to look for errors.
[bash]tail /edx/var/log/lms/edx.log -n 50
tail /edx/var/log/cms/edx.log -n 50[/bash]
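Fifty lines is not much on a busy server. To get a rough error count over a larger window, something like the following sketch may help. It assumes that the lines of interest contain the word ERROR or Traceback, which you may need to adjust for your own log format. The helper name `count_recent_errors` is hypothetical.

```bash
# Count lines mentioning ERROR or Traceback among the last N lines of a log.
# Assumption: errors of interest contain one of those two words.
# Usage: count_recent_errors <log file> [number of lines, default 500]
count_recent_errors() {
    log_file=$1
    lines=${2:-500}
    # `|| true` keeps the exit status at zero when grep finds no matches.
    tail -n "$lines" "$log_file" | grep -c -E 'ERROR|Traceback' || true
}

# Example:
#   count_recent_errors /edx/var/log/lms/edx.log 1000
```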
In particular, Celery, the task queue that uses RabbitMQ as its message broker, often presents some challenges after migrations and database restore operations. If Celery is not functioning correctly then you’ll find a lot of Celery-related errors in the LMS log.
6. Trouble-Shooting Celery / RabbitMQ (When restoring Hawthorn or older versions)
RabbitMQ (and Celery) were installed by Ansible when you performed your native build. While there are many steps to installing RabbitMQ, the configuration itself is relatively simple and thus easy to trouble-shoot, since there is only a small, finite set of configuration values to check. The configuration consists of the following:
- Two Celery configuration values located in /etc/rabbitmq/rabbitmq-env.conf
- Three Celery usernames with passwords, and assigned permissions
- One virtual host
You can attempt any combination of the following trouble-shooting methods, testing your results after each adjustment by attempting any operation in your LMS such as providing a response to any problem, or by requesting a password reset email.
Celery Trouble-Shooting Tip I: Verify the IP address in /etc/rabbitmq/rabbitmq-env.conf
The correct internal IP address for RabbitMQ is 127.0.0.1. However, sometimes Ansible will incorrectly populate this value with the server’s actual internal IP address, such as 172.16.102.101. I often encounter this problem whenever I reinstall RabbitMQ during platform upgrades.
[bash]sudo vim /etc/rabbitmq/rabbitmq-env.conf[/bash]
Edit this file if necessary, and then restart the RabbitMQ service.
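If you’d rather check the value without opening an editor, a one-line filter can print it. This is a sketch under an assumption: that the setting appears on its own line as either `NODE_IP_ADDRESS=...` or `RABBITMQ_NODE_IP_ADDRESS=...`. Check your own file to see which form it uses. The helper name `rabbitmq_node_ip` is my own.

```bash
# Print the node IP address configured in a rabbitmq-env.conf style file.
# Assumption: the setting is written as NODE_IP_ADDRESS=... or
# RABBITMQ_NODE_IP_ADDRESS=... at the start of a line.
rabbitmq_node_ip() {
    sed -n -E 's/^(RABBITMQ_)?NODE_IP_ADDRESS=//p' "$1"
}

# Example:
#   [ "$(rabbitmq_node_ip /etc/rabbitmq/rabbitmq-env.conf)" = "127.0.0.1" ] \
#       || echo "unexpected RabbitMQ IP address"
```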
Celery Trouble-Shooting Tip II: Set permissions of all Celery users
The following code block relaxes permissions for the username “celery”. This is roughly analogous to setting the permissions of a Linux file to “777”. If the source of your Celery problem is permissions then this will eliminate the problem. Afterwards, however, you should seek more information on the ramifications of relaxing Celery permissions in Open edX (sorry, but I’m no expert).
[bash]sudo rabbitmqctl set_permissions -p / celery ".*" ".*" ".*"
sudo service rabbitmq-server restart[/bash]
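To confirm that the change took effect, you can inspect the output of `rabbitmqctl list_permissions -p /`. The sketch below assumes each data row has the form user, configure, write, read, separated by whitespace. The helper name `has_full_permissions` is hypothetical.

```bash
# Succeed only if the given user has ".*" for configure, write and read.
# Reads `rabbitmqctl list_permissions -p /` output on stdin.
# Assumption: data rows are "<user> <configure> <write> <read>".
has_full_permissions() {
    awk -v user="$1" '
        $1 == user && $2 == ".*" && $3 == ".*" && $4 == ".*" { found = 1 }
        END { exit !found }
    '
}

# Example:
#   sudo rabbitmqctl list_permissions -p / | has_full_permissions celery \
#       && echo "celery has full permissions"
```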
Celery Trouble-Shooting Tip III: Reset Celery user passwords
If you followed my guidelines for a Native Build on Ubuntu 16.04 LTS then you (hopefully) have a file named my-passwords.yml located in /home/ubuntu. The passwords for the three Celery users are located at the bottom of this file. Note that in each case the value of the password is referenced from elsewhere in the same document.
[bash]sudo rabbitmqctl change_password celery YourPasswordForTheCeleryUser
sudo rabbitmqctl change_password edx YourPasswordForTheEdxUser
sudo rabbitmqctl change_password admin YourPasswordForTheAdminUser
sudo service rabbitmq-server restart[/bash]
Celery Trouble-Shooting Tip IV: Re-install Celery
Some combination of the previous trouble-shooting methods very likely will solve your problem. But, if you’re still having problems then you can completely reinstall RabbitMQ by calling the appropriate Ansible playbook, as follows:
[bash]sudo bash
source /edx/app/edx_ansible/venvs/edx_ansible/bin/activate
cd /edx/app/edx_ansible/edx_ansible/playbooks/
ansible-playbook -c local -i 'localhost,' ./run_role.yml -e "role=rabbitmq"
#Use this command instead if you are using a server-vars.yml file
#ansible-playbook -c local -i 'localhost,' ./run_role.yml -e "role=rabbitmq" -e@/edx/app/edx_ansible/server-vars.yml
exit
sudo service rabbitmq-server restart[/bash]
One last thing: I found two threads from the Open edX Devops Google Group very helpful when I first encountered problems with Celery.
7. Re-Installing A Custom Theme
If your site uses comprehensive theming and you’ve restored your custom theme from a backup then it’s probable that you also need to recompile your static assets with Paver. Take note that this process runs for around 15 minutes, and your Open edX platform will not be available until the process completes. Also be aware that if your theme contains any compilation errors then your Open edX platform will almost certainly break.
[bash]# update assets as edxapp user
sudo -H -u edxapp bash
source /edx/app/edxapp/edxapp_env
cd /edx/app/edxapp/edx-platform
paver update_assets lms --settings=aws
paver update_assets cms --settings=aws
exit
# restart edx instances
sudo /edx/bin/supervisorctl restart lms
sudo /edx/bin/supervisorctl restart cms
sudo /edx/bin/supervisorctl restart edxapp_worker:[/bash]
I hope you found this helpful. Please help me improve this article by leaving a comment below. Thank you!
notifier-celery-workers Exited too quickly
log:
OperationalError: database is locked
Hi Lawrence. This is the error I am getting
Applying djcelery.0001_initial…Traceback (most recent call last):
File “/edx/app/edxapp/edx-platform/manage.py”, line 123, in
execute_from_command_line([sys.argv[0]] + django_args)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/management/__init__.py”, line 364, in execute_from_command_line
utility.execute()
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/management/__init__.py”, line 356, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/management/base.py”, line 283, in run_from_argv
self.execute(*args, **cmd_options)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/management/base.py”, line 330, in execute
output = self.handle(*args, **options)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/management/commands/migrate.py”, line 204, in handle
fake_initial=fake_initial,
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/migrations/executor.py”, line 115, in migrate
state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/migrations/executor.py”, line 145, in _migrate_all_forwards
state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/migrations/executor.py”, line 244, in apply_migration
state = migration.apply(state, schema_editor)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/migrations/migration.py”, line 129, in apply
operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/migrations/operations/models.py”, line 97, in database_forwards
schema_editor.create_model(model)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/backends/base/schema.py”, line 319, in create_model
self.execute(sql, params or None)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/backends/base/schema.py”, line 136, in execute
cursor.execute(sql, params)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/backends/utils.py”, line 64, in execute
return self.cursor.execute(sql, params)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/utils.py”, line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/backends/utils.py”, line 62, in execute
return self.cursor.execute(sql)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py”, line 101, in execute
return self.cursor.execute(query, args)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/MySQLdb/cursors.py”, line 205, in execute
self.errorhandler(self, exc, value)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/MySQLdb/connections.py”, line 36, in defaulterrorhandler
raise errorclass, errorvalue
django.db.utils.OperationalError: (1050, “Table ‘djcelery_crontabschedule’ already exists”)
edxapp@ip-172-31-44-237:~$
Hi Lawrence, sorry to bother you again, but I have been trying to get the upgrade done and I have come across so many errors so far. The most recent is the error I get once I try to restart lms after doing the migration. I get a spawn error and if I check the status I get this:
analytics_api RUNNING pid 11689, uptime 3:30:43
certs FATAL Exited too quickly (process log may have details)
cms RUNNING pid 21773, uptime 2:09:29
discovery RUNNING pid 5827, uptime 3:11:14
ecommerce RUNNING pid 22402, uptime 3:34:13
ecomworker RUNNING pid 23577, uptime 3:33:39
edxapp_worker:cms_default_1 RUNNING pid 9503, uptime 3:52:41
edxapp_worker:cms_high_1 RUNNING pid 9533, uptime 3:52:39
edxapp_worker:cms_low_1 RUNNING pid 9541, uptime 3:52:38
edxapp_worker:lms_default_1 RUNNING pid 9583, uptime 3:52:37
edxapp_worker:lms_high_1 RUNNING pid 9723, uptime 3:52:35
edxapp_worker:lms_high_mem_1 RUNNING pid 9831, uptime 3:52:34
edxapp_worker:lms_low_1 RUNNING pid 9925, uptime 3:52:33
forum RUNNING pid 15299, uptime 3:07:55
insights RUNNING pid 23356, uptime 3:26:29
lms FATAL unknown error making dispatchers for ‘lms’: EACCES
notifier-celery-workers RUNNING pid 10056, uptime 3:09:55
notifier-scheduler RUNNING pid 10009, uptime 3:10:16
xqueue RUNNING pid 12029, uptime 3:09:09
xqueue_consumer RUNNING pid 12066, uptime 3:09:07.
Kindly help out if you can.
Hi Lawrence, when I ran this command to upgrade mysql:
python /edx/app/edxapp/edx-platform/manage.py cms makemigrations --settings=aws
I get the error below. What could have gone wrong? I installed Hawthorn master fresh, then restored my Ginkgo 2.1 database to it because I want to migrate from Ginkgo 2.1 to Hawthorn master.
Traceback (most recent call last):
File “/edx/app/edxapp/edx-platform/manage.py”, line 117, in
startup = importlib.import_module(edx_args.startup)
File “/usr/lib/python2.7/importlib/__init__.py”, line 37, in import_module
__import__(name)
File “/edx/app/edxapp/edx-platform/cms/startup.py”, line 9, in
settings.INSTALLED_APPS # pylint: disable=pointless-statement
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/conf/__init__.py”, line 56, in __getattr__
self._setup(name)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/conf/__init__.py”, line 41, in _setup
self._wrapped = Settings(settings_module)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/conf/__init__.py”, line 110, in __init__
mod = importlib.import_module(self.SETTINGS_MODULE)
File “/usr/lib/python2.7/importlib/__init__.py”, line 37, in import_module
__import__(name)
File “/edx/app/edxapp/edx-platform/cms/envs/aws.py”, line 90, in
ENV_TOKENS = json.load(env_file)
File “/usr/lib/python2.7/json/__init__.py”, line 291, in load
**kw)
File “/usr/lib/python2.7/json/__init__.py”, line 339, in loads
return _default_decoder.decode(s)
File “/usr/lib/python2.7/json/decoder.py”, line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File “/usr/lib/python2.7/json/decoder.py”, line 382, in raw_decode
raise ValueError(“No JSON object could be decoded”)
ValueError: No JSON object could be decoded
hi chris, it looks like your Hawthorn native build did not complete. you can confirm by reviewing the file /home/ubuntu/install.out and possibly also /home/ubuntu/nohup.out. install.out should contain a neat report of the Ansible activities for the build, of which there are upwards of 400 tasks. if you don’t see this report at the bottom of install.out then it definitely did not finish for some reason. often you can simply re-run the installation procedure and it’ll pick up where it left off.
Hi Lawrence, there was an error saying Ansible failed. I tried to rerun the installation script but the same error occurred. Here is an excerpt from the log. Any advice?
TASK [edx_django_service : run post-migrate commands] **************************
fatal: [localhost]: FAILED! => {"failed": true, "msg": "{{ ecommerce_post_migrate_commands }}: [{u'when': True, u'command': u'./manage.py oscar_populate_countries --initial-only'}, {u'when': u'{{ ecommerce_create_demo_data }}', u'command': u'./manage.py create_or_update_site --site-id=1 --site-domain={{ ECOMMERCE_ECOMMERCE_URL_ROOT.split(\"://\")[1] }} --partner-code=edX --partner-name=\"Open edX\" --lms-url-root={{ ECOMMERCE_LMS_URL_ROOT }} --client-side-payment-processor=cybersource --payment-processors=cybersource,paypal --client-id={{ ECOMMERCE_SOCIAL_AUTH_EDX_OIDC_KEY }} --client-secret={{ ECOMMERCE_SOCIAL_AUTH_EDX_OIDC_SECRET }} --from-email staff@example.com --discovery_api_url={{ ECOMMERCE_DISCOVERY_SERVICE_URL }}/api/v1/ --journals_api_url={{ JOURNALS_API_URL }}'}, {u'when': u'{{ ecommerce_create_demo_data }}', u'command': u'./manage.py create_demo_data --partner=edX'}]: 'JOURNALS_API_URL' is undefined"}
NO MORE HOSTS LEFT *************************************************************
to retry, use: --limit @/var/tmp/configuration/playbooks/edx_sandbox.retry
Hi Lawrence I am also trying the same operation. I got the error: System check identified some issues:
WARNINGS:
?: (mysql.W002) MySQL Strict Mode is not set for database connection ‘default’
HINT: MySQL’s Strict Mode fixes many data integrity problems in MySQL, such as data truncation upon insertion, by escalating warnings into errors. It is strongly recommended you activate it. See: https://docs.djangoproject.com/en/1.11/ref/databases/#mysql-sql-mode
?: (mysql.W002) MySQL Strict Mode is not set for database connection ‘read_replica’
HINT: MySQL’s Strict Mode fixes many data integrity problems in MySQL, such as data truncation upon insertion, by escalating warnings into errors. It is strongly recommended you activate it. See: https://docs.djangoproject.com/en/1.11/ref/databases/#mysql-sql-mode
?: (mysql.W002) MySQL Strict Mode is not set for database connection ‘student_module_history’
HINT: MySQL’s Strict Mode fixes many data integrity problems in MySQL, such as data truncation upon insertion, by escalating warnings into errors. It is strongly recommended you activate it. See: https://docs.djangoproject.com/en/1.11/ref/databases/#mysql-sql-mode
these are warnings from Django’s system checks rather than errors, so the restore should work anyway. regardless, you can read more at https://django-mysql.readthedocs.io/en/latest/checks.html on how you can tailor the behavior of Django’s interaction with MySQL. hope that helps!