Learn pro tips on how to configure the Open edX platform. This tutorial will help you to better understand the organizational strategy behind the many Open edX platform configuration files. You’ll also learn about the platform architecture and how the file system is organized.

Summary

Updated January 19, 2021 for the Open edX Koa release.

Open edX is a highly configurable, modular platform. In fact, there are well more than a thousand configuration parameters that you can use to tailor the functionality, behavior and infrastructure plan for your project. With some cursory knowledge of the Open edX platform architecture you’ll be able to use intuition and common sense to find the parameters that you need in order to tailor the platform to your needs. I have attempted to organize this article so that you can grab the information that you need as quickly as possible, only reading as much of the content as you really need to.

Note that this article focuses on “Application Configuration“. That is, how to modify application parameters and feature flags for the LMS and CMS. There are many other aspects of the Open edX platform that are also customizable using the same basic approach. The aspects of Open edX on which I spend most of my time are as follows.

  • Application Configuration

  • Site Theming

  • Horizontal Scaling

  • Nginx configuration

  • Certificate Server Configuration

  • E-commerce setup

You might also be interested in my recent blog posts, “Scaling Open edX“, “Open edX Step-By-Step Production Installation Guide” and “Open edX Complete Backup Solution“.

A Practical Example

Let’s use the built-in Linux text editor Vim to set the platform display name. If you’re unfamiliar with Vim then this 8-minute video will teach you enough to get through this step of the tutorial.

# Step 1: Edit the LMS configuration file
sudo vim /edx/etc/lms.yml
# Find the parameter, "PLATFORM_NAME: Your Platform Name Here"
# located on or around row 465.
# Replace the text "Your Platform Name Here" 
# with a more appropriate name for your Open edX platform.

# Step 2: Reload the LMS configuration parameters.
sudo /edx/app/edxapp/reload_lms_config.sh

# Step 3: Check your work. Open the LMS in a browser window. 
# In the footer of the landing page you should see the text, 
# "© [YOUR NEW PLATFORM NAME]. All rights reserved except where noted. edX, Open edX and their respective logos are registered trademarks of edX Inc."
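
For repeatable deployments, the edit in Step 1 can also be scripted. The following is a minimal Python sketch, not part of Open edX: the helper name set_top_level_value is hypothetical, and it assumes the parameter sits on a single unindented "KEY: value" line, as PLATFORM_NAME does. It uses a regular expression rather than a YAML parser, so it will not handle nested or multi-line values.

```python
import re


def set_top_level_value(yaml_text: str, key: str, new_value: str) -> str:
    """Replace the value of a simple, unindented 'KEY: value' line.

    Hypothetical helper for illustration only; it does not understand
    nested keys, lists, or multi-line values.
    """
    pattern = re.compile(rf"^{re.escape(key)}:[ \t]*.*$", flags=re.MULTILINE)
    if not pattern.search(yaml_text):
        raise KeyError(f"{key} not found as a top-level key")
    # A function is used as the replacement so that backslashes in
    # new_value are kept literally instead of being read as escapes.
    return pattern.sub(lambda _match: f"{key}: {new_value}", yaml_text, count=1)


if __name__ == "__main__":
    sample = (
        "LMS_BASE: courses.school-of-rock.edu\n"
        "PLATFORM_NAME: Your Platform Name Here\n"
    )
    print(set_top_level_value(sample, "PLATFORM_NAME", "School of Rock"))
```

You would still need to write the result back to /edx/etc/lms.yml with sudo privileges and then run Step 2 to reload the configuration.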

Pro Tips

1.) lms.yml has most of what you need. The vast majority of the configuration parameters that you will need are located in the file /edx/etc/lms.yml, so take the time to learn about the contents of this file. lms.yml (and its cms.yml counterpart) contains roughly 600 lines of alphabetized parameters that cover topics including: application feature flags, display names, SMTP email configuration and email addresses, connection settings for databases and caches, security passwords, CSRF and CORS parameters, names of browser cookies, custom theming parameters, and more.

2.) Maintain your configuration data offline. Rather than editing files directly on the server with Vim as I’ve shown in the exercise above, I store my configuration data in Github, and I use VS Code on my development workstation to edit the files. This not only gives you control over versions but also allows you the luxury of better real-time syntax checking and color formatting to help you to work better and faster.

3.) Keep it simple. Most likely you can tailor your Open edX platform to your specific needs using only the parameters included in the yaml files in /edx/etc/. Therefore, I’d recommend avoiding any other configuration methodologies and techniques that you might encounter in the official Open edX documentation. It is unlikely that you need to fork the edx/edx-platform repository for example, in which case you definitely do not need to use server-vars.yml. Devstack is not necessary, nor is Vagrant, nor is Docker.
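
One payoff of keeping configuration files in version control (tip 2) is that you can run simple checks on them before they ever reach the server. As a rough sketch, the Python function below flags top-level keys that appear more than once in a yaml file, since a repeated key is an easy mistake to make in a 600-line file. The function name find_duplicate_top_level_keys is hypothetical, and the check is a plain regular expression, not a real YAML parser.

```python
import re
from collections import Counter
from typing import List


def find_duplicate_top_level_keys(yaml_text: str) -> List[str]:
    """Return unindented keys that appear more than once, sorted by name.

    Only lines that start in column 0 with 'KEY:' are considered, so
    nested keys (for example, entries under FEATURES) are ignored.
    """
    keys = re.findall(r"^([A-Za-z_][A-Za-z0-9_]*):", yaml_text, flags=re.MULTILINE)
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if count > 1)


if __name__ == "__main__":
    sample = "PLATFORM_NAME: A\nTIME_ZONE: UTC\nPLATFORM_NAME: B\n"
    for key in find_duplicate_top_level_keys(sample):
        print(f"duplicate key: {key}")
```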

Commonly Modified Files

Aside from custom theming, most of the other work that I do on Open edX projects only involves a dozen or so files. I’m presenting these in order of importance, meaning, the frequency with which I find myself editing each file.

edX platform: These are the yaml files located in /edx/etc/, and they cover nearly all of the salient content of this article. In addition to configuration files for the LMS and CMS, you’ll also find the main configuration files for e-commerce, discovery and Insights.
Passwords: The master passwords file is located in /home/ubuntu/my-passwords.yml and was generated as one of the very first steps of your native build installation process. The native build procedures are coded in Ansible, and the Ansible playbooks reference this file dozens of times while building your platform. Note that the password values in my-passwords.yml are really only for your reference. Changing password values in this file is not only a terrible idea, but it also would have no direct effect on the Open edX platform or its subsystems unless you re-run the respective Ansible playbooks (which you probably should not attempt unless you really, really know what you’re doing).
Nginx: You’ll find one Nginx configuration file per site located in /edx/app/nginx/sites-available/. The files are aptly named. I modify the files “lms” and “cms” in this folder as part of installing SSL certificates to enable https. Less commonly, I also modify these files whenever I’m adding a load balancer.
Django settings: As described in the section “Open edX Configuration File Hierarchy” below, I occasionally need to add custom parameters to a platform. Again, this is rare, and you probably will not need to do this. In the case of the LMS, I only ever need to modify /edx/app/edxapp/edx-platform/lms/envs/production.py.

Open edX Configuration File Hierarchy

The LMS and CMS are traditional Python/Django projects and are organized accordingly. Both applications rely on a file named “production.py” that contains the principal application parameters. Consistent with Django best practice, these two applications include a lower level configuration file named “common.py” that contains parameter values that are “common” to various application stages such as “dev”, “test” and “production”.

In the case of Open edX however, we edited the file, “/edx/etc/lms.yml”, which is a practice that is unique to the Open edX project. This approach provides several benefits:

  • For a typical Open edX project, all of the configuration settings are consolidated into a single folder, /edx/etc/, simplifying configuration management.
  • It is not necessary to modify any Open edX platform source code.
  • It provides a more human friendly format — Yaml files in this case — to modify configuration data.
  • /edx/etc is an upgrade-friendly folder location. That is, your configuration work will not get overwritten during normal software upgrades.
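
The layering described above can be sketched in a few lines of Python. This is a simplified illustration rather than the actual contents of production.py: COMMON_DEFAULTS and load_settings are hypothetical names, the default values are made up, and JSON stands in for YAML only so that the sketch runs with the standard library. The key-by-key merge of FEATURES is likewise an assumption of this sketch.

```python
import json

# Stand-in for common.py: defaults shared by every environment.
# The values here are illustrative, not the platform's real defaults.
COMMON_DEFAULTS = {
    "PLATFORM_NAME": "Your Platform Name Here",
    "TIME_ZONE": "UTC",
    "FEATURES": {"ENABLE_DISCUSSION_SERVICE": True, "ENABLE_EDXNOTES": False},
}


def load_settings(config_text: str) -> dict:
    """Overlay values from a config file on top of the common defaults."""
    overrides = json.loads(config_text)
    settings = dict(COMMON_DEFAULTS)
    # Copy the nested dict so the defaults themselves are never mutated.
    settings["FEATURES"] = dict(COMMON_DEFAULTS["FEATURES"])
    for key, value in overrides.items():
        if key == "FEATURES":
            # Feature flags are merged key by key rather than replaced wholesale.
            settings["FEATURES"].update(value)
        else:
            settings[key] = value
    return settings


if __name__ == "__main__":
    site_config = '{"PLATFORM_NAME": "School of Rock", "FEATURES": {"ENABLE_EDXNOTES": true}}'
    print(load_settings(site_config))
```

The point of the design is that anything you do not mention in the site-level file silently keeps its default, which is why a short yaml file is enough to customize a large platform.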

Note that these files are read once, during application startup. Thus after modifying lms.yml or cms.yml you’ll need to either execute the built-in bash script to reload the parameters or restart the platform. The commands for each are as follows.

# to reload the configuration parameters without restarting
sudo /edx/app/edxapp/reload_lms_config.sh

# To restart the platform
/edx/bin/supervisorctl restart lms
/edx/bin/supervisorctl restart cms
/edx/bin/supervisorctl restart edxapp_worker:

# If you're running an older version of Open edX then use this command instead
#/edx/bin/supervisorctl restart edxapp:

lms.yml Parameters Of Interest

lms.yml contains around 400 parameters; however, I ignore most of them. Following are the parameters that I actually modify, grouped by purpose and presented in order of priority.

# platform name and description
SITE_NAME: courses.school-of-rock.edu
LMS_BASE: courses.school-of-rock.edu
CMS_BASE: cms.school-of-rock.edu
PLATFORM_NAME: Your Platform Name Here
PLATFORM_DESCRIPTION: Your Platform Description Here
STUDIO_NAME: Studio
STUDIO_SHORT_NAME: Studio
LMS_INTERNAL_ROOT_URL: http://courses.school-of-rock.edu
LMS_ROOT_URL: http://courses.school-of-rock.edu
ENTERPRISE_API_URL: http://courses.school-of-rock.edu/enterprise/api/v1
ENTERPRISE_ENROLLMENT_API_URL: http://courses.school-of-rock.edu/api/enrollment/v1/
LEARNER_PORTAL_URL_ROOT: https://learner-portal-courses.school-of-rock.edu
PLATFORM_FACEBOOK_ACCOUNT: http://www.facebook.com/YourPlatformFacebookAccount
PLATFORM_TWITTER_ACCOUNT: '@YourPlatformTwitterAccount'

# Inbound email addresses
# Note that I use the same email address for everything.
ACTIVATION_EMAIL_SUPPORT_LINK: ''
API_ACCESS_FROM_EMAIL: support@school-of-rock.edu
API_ACCESS_MANAGER_EMAIL: support@school-of-rock.edu
BUGS_EMAIL: support@school-of-rock.edu
BULK_EMAIL_DEFAULT_FROM_EMAIL: support@school-of-rock.edu
CONTACT_EMAIL: support@school-of-rock.edu
DEFAULT_FEEDBACK_EMAIL: support@school-of-rock.edu
DEFAULT_FROM_EMAIL: support@school-of-rock.edu
PAYMENT_SUPPORT_EMAIL: support@school-of-rock.edu
PRESS_EMAIL: support@school-of-rock.edu
SERVER_EMAIL: support@school-of-rock.edu
TECH_SUPPORT_EMAIL: support@school-of-rock.edu
UNIVERSITY_EMAIL: support@school-of-rock.edu

# Platform feature flags
FEATURES:
    ENABLE_BULK_ENROLLMENT_VIEW: false
    ENABLE_COMBINED_LOGIN_REGISTRATION: true
    ENABLE_CORS_HEADERS: true
    ENABLE_CROSS_DOMAIN_CSRF_COOKIE: true
    ENABLE_DISCUSSION_HOME_PANEL: true
    ENABLE_DISCUSSION_SERVICE: true
    ENABLE_EDXNOTES: false
    ENABLE_GRADE_DOWNLOADS: true
    ENABLE_INSTRUCTOR_ANALYTICS: false
    ENABLE_LTI_PROVIDER: false
    ENABLE_MOBILE_REST_API: false
    ENABLE_OAUTH2_PROVIDER: false
    ENABLE_SPECIAL_EXAMS: false
    ENABLE_SYSADMIN_DASHBOARD: false
    ENABLE_THIRD_PARTY_AUTH: true
    PREVIEW_LMS_BASE: preview.courses.school-of-rock.edu

# To enable comprehensive (custom) theming
ENABLE_COMPREHENSIVE_THEMING: false
COMPREHENSIVE_THEME_DIRS:
- '/edx/app/edxapp/edx-platform/themes/'
DEFAULT_SITE_THEME: 'schoolofrock'

# Regionalization
TIME_ZONE: America/New_York
LANGUAGE_CODE: en

# AWS services
AWS_ACCESS_KEY_ID: null
AWS_SECRET_ACCESS_KEY: null

# to configure an SMTP email provider, such as AWS SES
AWS_SES_REGION_ENDPOINT: email.us-east-1.amazonaws.com
EMAIL_HOST: email-smtp.eu-west-3.amazonaws.com
EMAIL_HOST_PASSWORD: BBajUCiAlvjHus3OtB3b1uR8T3ZDvLpjgu8uKA3P/kQr
EMAIL_HOST_USER: 'AKIA4MMXYBZIQNSIVGTP'
EMAIL_PORT: 465
EMAIL_USE_TLS: true

# AWS S3 storage bucket setup, for ancillary instructor course content
# such as course content supplements like pdf documents and image files
AWS_QUERYSTRING_AUTH: false
AWS_S3_CUSTOM_DOMAIN: SET-ME-PLEASE (ex. bucket-name.s3.amazonaws.com)
AWS_SES_REGION_NAME: us-east-1
AWS_STORAGE_BUCKET_NAME: SET-ME-PLEASE (ex. bucket-name)
FILE_UPLOAD_STORAGE_BUCKET_NAME: SET-ME-PLEASE (ex. bucket-name)

# customization of new user registration form
REGISTRATION_EXTRA_FIELDS:
    city: hidden
    confirm_email: hidden
    country: required
    gender: optional
    goals: optional
    honor_code: required
    level_of_education: optional
    mailing_address: hidden
    terms_of_service: hidden
    year_of_birth: optional

# Google Analytics
GOOGLE_ANALYTICS_ACCOUNT: null
GOOGLE_ANALYTICS_LINKEDIN: ''
GOOGLE_ANALYTICS_TRACKING_ID: ''
GOOGLE_SITE_VERIFICATION_ID: ''

# Session management
# only important if you add SSL/https or
# if you want to give the CMS its own site name
SESSION_COOKIE_SECURE: true
SESSION_COOKIE_DOMAIN: 'school-of-rock.edu'
BASE_COOKIE_DOMAIN: courses.school-of-rock.edu
CORS_ORIGIN_ALLOW_ALL: false
CORS_ORIGIN_WHITELIST: [
    'courses.school-of-rock.edu',
    'cms.school-of-rock.edu'
]
CROSS_DOMAIN_CSRF_COOKIE_DOMAIN: courses.school-of-rock.edu
CROSS_DOMAIN_CSRF_COOKIE_NAME: native-csrf-cookie
LOGIN_REDIRECT_WHITELIST:
- cms.school-of-rock.edu
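
The session settings in the listing above are the ones most likely to go wrong when the LMS and CMS live on different host names, because a browser only sends a cookie to hosts that fall under the cookie's domain. The short Python sketch below shows that matching rule in simplified form. The function name cookie_domain_covers is hypothetical and this is not code from Open edX or Django; it is just a way to sanity-check values such as SESSION_COOKIE_DOMAIN against LMS_BASE and CMS_BASE.

```python
def cookie_domain_covers(cookie_domain: str, host: str) -> bool:
    """Return True if a cookie set for cookie_domain would be sent to host.

    Simplified domain matching: the host must equal the cookie domain
    or be a subdomain of it. A leading dot on the domain is ignored.
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


if __name__ == "__main__":
    for host in ("courses.school-of-rock.edu", "cms.school-of-rock.edu"):
        print(host, cookie_domain_covers("school-of-rock.edu", host))
```

With the parent domain school-of-rock.edu both hosts are covered, whereas a cookie domain of courses.school-of-rock.edu would not cover cms.school-of-rock.edu.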

Open edX Architecture

Taking a more holistic view, the Open edX platform is a collection of interoperating web sites running on Ubuntu Linux. Most of these sites are built on the Python/Django framework and leverage a variety of front-end technologies, although there is a trend towards standardizing on ReactJS. The diagram below illustrates at a high level how these technologies interoperate while also highlighting modules and subsystems of interest.

Front end technologies of interest

Mind you, for normal configuration activities you have absolutely no reason whatsoever to modify any of these files. But in the interest of better understanding the platform, I’m outlining the principal technologies that are part of, or are part of compiling, the Open edX front end.

ReactJS is a component-based javascript library originally created by Facebook which has gained widespread support by the open source community. Originally conceived as a library to support single-page web apps, it has since grown further into enterprise computing via design strategies like micro frontends, as is the case in Open edX.

You should invest time in learning more about ReactJS because the edX team is gradually standardizing the front end on this library. I spent around three weeks getting myself proficient in early 2020, which was time well spent in my case, as I’ve since been able to make some really substantive front-end modifications for clients by leveraging these new skills.

Note that ReactJS components are commonly written in JSX, a syntax extension to JavaScript, rather than in plain JavaScript ES6, so you’ll find .jsx files in the edx-platform source code alongside .js files.

Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine. In the case of Open edX, Node is used extensively in conjunction with most of the front-end javascript libraries. Most of the supporting component libraries are made available via Node Package Manager (NPM) and transpiled with Paver to create the production js bundles used at run-time.

Waffle is a feature flipper for Django. You can define the conditions for which a flag should be active, and use it in a number of ways. Flags are the most robust, flexible method of rolling out a feature with Waffle. Flags can be used to enable a feature for specific users, groups, users meeting certain criteria (such as being authenticated, or superusers) or a certain percentage of visitors.
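
To make the idea of flag conditions concrete, here is a small self-contained Python sketch. It is not Waffle's implementation or API: the Flag and User classes and the flag_is_active_for function are hypothetical names invented for illustration, and the hash-based percentage bucket is an assumption of this sketch rather than a description of how Waffle chooses visitors.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Flag:
    """Hypothetical feature flag with a few activation conditions."""
    name: str
    everyone: bool = False
    superusers: bool = False
    authenticated: bool = False
    usernames: set = field(default_factory=set)
    percent: float = 0.0  # share of users (0 to 100) who see the feature


@dataclass
class User:
    """Hypothetical stand-in for a Django user."""
    username: str
    is_authenticated: bool = True
    is_superuser: bool = False


def flag_is_active_for(flag: Flag, user: User) -> bool:
    """Return True if any of the flag's conditions applies to the user."""
    if flag.everyone:
        return True
    if flag.superusers and user.is_superuser:
        return True
    if flag.authenticated and user.is_authenticated:
        return True
    if user.username in flag.usernames:
        return True
    # Put each user in a stable bucket from 0 to 99 so that the same
    # user always gets the same answer for the same flag.
    digest = hashlib.sha256(f"{flag.name}:{user.username}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < flag.percent


if __name__ == "__main__":
    beta = Flag(name="new_dashboard", superusers=True, usernames={"alice"})
    for person in (User("root", is_superuser=True), User("alice"), User("bob")):
        print(person.username, flag_is_active_for(beta, person))
```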

SASS, or Syntactically-Awesome Style Sheets, is a CSS extension language that is commonly used on large complex sites like Open edX. SASS provides a better way for teams of designers to collaborate on the creation and maintenance of CSS. There are an endless number of frameworks built with Sass: Compass, Bourbon, and Susy, just to name a few.

Paver is a Python-based software project scripting tool along the lines of Make or Rake. It is not designed to handle the dependency tracking requirements of, for example, a C program. It is designed to help out with all of your other repetitive tasks (run documentation generators, moving files around, downloading things), all with the convenience of Python’s syntax and massive library of code.

In Open edX, Paver is used to manage the asset compilation pipeline activities. If you’re unfamiliar with asset compilation then you can read more about it here, “Managing static files“. Paver is loosely comparable to Gulp in the JavaScript world.

Back end technologies of interest

Likewise, it is unlikely that you need to modify any of these files. I’m providing this information strictly in the interest of improving your understanding of the Open edX architecture.

The Django Framework is a model-view-template framework, and as it happens, there are a variety of template technologies that work with Django. The edX team uses Mako, a template library written in Python. It provides a familiar, non-XML syntax which compiles into Python modules for maximum performance. Mako’s syntax and API borrows from the best ideas of many others, including Django and Jinja2 templates, Cheetah, Myghty, and Genshi. Conceptually, Mako is an embedded Python (i.e. Python Server Page) language, which refines the familiar ideas of componentized layout and inheritance to produce one of the most straightforward and flexible models available, while also maintaining close ties to Python calling and scoping semantics.

For the LMS, all of the .html files located in /edx/app/edxapp/edx-platform/lms/templates/ are Mako templates. Furthermore, all Mako templates are stored in this root folder location.

For front-end work like theming I spend most of my time working with Mako, so this is a technology that is well worth learning. Mako is powerful and easy to learn. Though this really is a theming topic, it bears mentioning that with minor self-study on Mako you’ll be able to make significant modifications to the front end with little effort.

XBlock is the fundamental extensibility technology for Open edX. The XBlock specification is a component architecture designed to make it easier to create new online educational experiences. XBlock was developed by edX, which has a focus in education, but the technology can be used in web applications that need to use multiple independent components and display those components on a single web page.

edX courses’ most magical features tend to be implemented as XBlocks. The technology basically provides the means to pull other disparate kinds of web content into a course, and it furthermore provides a way to map users’ interactions with this content to the Open edX api. With XBlock technology users can do things like click on a certain part of an on-screen image to provide their response to an assignment problem.

pip is a package manager for Python packages. A few things about pip are really helpful to know when it comes to configuration and understanding the overall Open edX organizational scheme. First, Open edX, like any well-managed Django project, uses virtual environments for each Django project. The file locations of the virtual environment for the LMS and CMS and of its installed pip packages, respectively, are:

  • /edx/app/edxapp/venvs/edxapp
  • /edx/app/edxapp/venvs/edxapp/lib/python3.8/site-packages/

Second, a lot of the Open edX source code is bundled as pip packages that are then installed into either/both of these locations. Lastly, Django projects always include a list of Python “requirements”, which are text files outlining the various pip packages and their respective version requirements. The pip requirements for LMS are stored in /edx/app/edxapp/edx-platform/requirements.

It is a worthwhile effort on your part to learn more about pip because Open edX software relies so much on this technology.
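
If you have never looked inside a requirements file, the short Python sketch below shows the general shape of the data. It is a simplified reader written for illustration, not pip's own parser: the function name parse_pinned_requirements is hypothetical, the sample package versions are made up, and it only understands plain "name==version" lines while skipping comments, option lines that start with a dash, and unpinned entries.

```python
from typing import Dict


def parse_pinned_requirements(text: str) -> Dict[str, str]:
    """Map package names (lowercased) to their pinned versions."""
    pins: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0]   # drop trailing comments
        line = line.split(";", 1)[0]       # drop environment markers
        line = line.strip()
        if not line or line.startswith("-"):
            continue                        # blank line or an option such as -e / -r
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


if __name__ == "__main__":
    # Made-up sample content, not copied from edx-platform.
    sample = "# core\nDjango==2.2.17   # pinned\ncelery==4.4.7\nrequests>=2.0\n"
    for package, version in sorted(parse_pinned_requirements(sample).items()):
        print(f"{package} is pinned to {version}")
```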

Ubuntu subsystems of interest

You absolutely should not modify any of the settings of these subsystems. The dev team at edX has already provided configuration parameters in /edx/etc/lms.yml and /edx/etc/cms.yml for any subsystem setting that might need to be modified at any time in the lifecycle of your Open edX project, no matter how large it might scale.

Nginx is commonly used as a web server in Django sites. Nginx is a web server that can also be used as a reverse proxy, load balancer, mail proxy and HTTP cache. Nginx accelerates content and application delivery, improves security, facilitates availability and scalability for the busiest web sites on the Internet. It is a small, lightweight and very fast alternative to Apache web server.

Memcached is a performance optimization technology. The basic idea behind Memcached is quite simple. It takes a lot of processor overhead to instantiate Python objects, but once they’ve been instantiated it’s pretty effortless to serialize them and retain them in RAM in a simple key-value storage format. Memcached plays an extremely important role in Open edX’s impressive backend performance.

Technically speaking, it is a general-purpose distributed memory-caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source must be read. Memcached is free and open-source software, licensed under the Revised BSD license.
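
The caching idea described above can be illustrated with a few lines of Python. This sketch keeps values in an ordinary in-process dictionary, so it is not Memcached and does not use any Memcached client library; the names get_or_compute and _cache are hypothetical. It only demonstrates the pattern: look the key up first, and pay for the expensive work only on a miss or after the entry has expired.

```python
import time
from typing import Any, Callable, Dict, Tuple

# key -> (expiry time, cached value)
_cache: Dict[str, Tuple[float, Any]] = {}


def get_or_compute(
    key: str,
    compute: Callable[[], Any],
    ttl_seconds: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Return the cached value for key, computing and storing it on a miss."""
    now = clock()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]                      # cache hit: skip the expensive work
    value = compute()                        # cache miss or expired entry
    _cache[key] = (now + ttl_seconds, value)
    return value


if __name__ == "__main__":
    def build_course_outline() -> str:
        print("doing the expensive work...")
        return "course outline"

    print(get_or_compute("outline:demo", build_course_outline))
    print(get_or_compute("outline:demo", build_course_outline))  # served from the cache
```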

Gunicorn is a connector between Nginx and Django. In Django projects, Nginx is configured to serve whatever it can directly from the Ubuntu file system (for example, static files such as images, CSS and JS) and to pass every other URL to Django via Gunicorn. Technically speaking, Gunicorn (“Green Unicorn”) is a Python Web Server Gateway Interface (WSGI) HTTP server. It uses a pre-fork worker model, ported from Ruby’s Unicorn project. The Gunicorn server is broadly compatible with a number of web frameworks, simply implemented, light on server resources and fairly fast.

This article explains more about how Nginx / Gunicorn / Django work together to serve up web pages: “How to use Django with uWSGI”.
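
For readers who have not seen the WSGI interface before, the following is a minimal sketch of what a WSGI server such as Gunicorn actually calls for each request. It is a toy application, not part of Open edX; Django generates an equivalent callable for you. The response text is invented for the example.

```python
def application(environ, start_response):
    """A minimal WSGI callable: receive a request, return a response body."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from WSGI, you requested {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Serve the app with the standard library's reference WSGI server.
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8000, application) as server:
        print("Serving on http://127.0.0.1:8000 (Ctrl+C to stop)")
        server.serve_forever()
```

If this were saved as a file named hello_wsgi.py (a hypothetical name), Gunicorn could serve the same callable with a command of the form "gunicorn hello_wsgi:application".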

Celery/RabbitMQ are backend subsystems that are commonly found in Django projects. These two technologies work together to enable most application processing on behalf of end users to be performed by background workers. If, for example, a learner clicks a “submit” button on a homework assignment problem, the homework submission is queued to RabbitMQ and then eventually sent to Celery to be processed. Once the processing has completed, Celery can broadcast the processing results such that the learner’s browser will update itself at the appropriate moment, displaying the grading results.

The design objective of including Celery/RabbitMQ in the application stack is to provide a more immediate and performant user experience for the learner. The learner clicks the “submit” button and the browser responds immediately, enabling the learner to continue to interact with other controls on the browser page as opposed to the page completely freezing momentarily while the backend calculates the grading result.
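
The submit-now, process-later pattern can be sketched with nothing but the Python standard library. To be clear, this is not Celery or RabbitMQ code: an in-process queue stands in for the message broker, a thread stands in for a worker, and the names submit, worker and grade, along with the grading rule itself, are hypothetical.

```python
import queue
import threading

submissions = queue.Queue()   # stands in for the RabbitMQ message queue
results = {}                  # stands in for wherever grading results are stored


def grade(answer: str) -> bool:
    """Hypothetical grading rule used only for this sketch."""
    return answer.strip() == "42"


def worker() -> None:
    """Stands in for a background worker: pull submissions and grade them."""
    while True:
        item = submissions.get()
        if item is None:              # sentinel value: shut the worker down
            submissions.task_done()
            break
        learner, answer = item
        results[learner] = grade(answer)
        submissions.task_done()


def submit(learner: str, answer: str) -> None:
    """Queue the submission and return immediately, without waiting for a grade."""
    submissions.put((learner, answer))


if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    submit("alice", "42")
    submit("bob", "41")
    submissions.put(None)
    submissions.join()                # wait until everything queued has been processed
    print(results)
```

The important detail is that submit returns as soon as the work is queued, which is what keeps the learner's browser responsive while grading happens elsewhere.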

Celery is a simple, flexible, and reliable distributed task queue that processes vast amounts of messages while providing operations with the tools required to maintain such a system. It focuses on real-time processing while also supporting task scheduling. Celery has a large and diverse community of users and contributors.

MySQL has a 20-year history in web backends, beginning with its use as the persistence layer in LAMP stacks. MySQL is an open-source relational database management system. In Open edX software MySQL is used to store user data such as usernames, passwords, and learner grading results. Course data on the other hand, with only a couple of exceptions, is stored in MongoDB.

Open edX stores user data in MySQL, such as usernames and passwords, course enrollment records, and learner submissions to assignment problems and exams. I often use MySQL Workbench to browse table data contents as a way to better understand Python/Django code on which I’m working.

Open edX stores most course content in MongoDB, a cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is ideal for storing course content specifically because JSON is an ideal, semi-structured data language.

MongoDB is widely used on large, high-traffic Internet sites. Its appeal to platform architects stems from both its flexibility and its stability. It is flexible because its documents can encode as much information as a relational database can, but without the rigidity of a fixed relational schema. Meanwhile, documents are stored as BSON, a binary encoding of JSON-like documents, and MongoDB supports both replication (keeping redundant copies of the data) and sharding (partitioning the data across servers), which can make the back-end platform very resilient.

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.

File Layout for a Generic Django Project

Django projects share a common file organization scheme, which is wonderful because this will help you to quickly get acquainted with the multiple Django projects that make up the Open edX platform. LMS and CMS are bundled into multiple individual Django projects (lms, cms, common, and openedx), all of which you’ll find in /edx/app/edxapp/edx-platform/. Additionally, there are individual projects for Discovery, E-commerce, and Certificates.