Tutor, GitHub Actions and AWS Elastic Container Registry (ECR) are a powerful trio of tools for building your custom Open edX Docker image and registering it automatically, and setting up the whole CI process is easy.
This is part I of a two-part series on implementing CI/CD processes with Tutor Open edX. In this first part we’ll automate a Tutor Open edX build. In part II we’ll learn how to deploy this build.
In this article I’ll explain the key pieces of this fully-functional GitHub Actions build workflow, which does the following:
- builds a custom Open edX Docker image using a custom fork, a custom theme, an Open edX plugin, and one XBlock,
- registers the custom Docker image in AWS Elastic Container Registry.
With only minor modifications you can tailor this workflow to automate the build of your own Open edX installation.
Note
The code repository referenced in this article was generated with Cookiecutter OpenedX Devops, a free, open source tool that helps you create and maintain a robust, secure environment for your Open edX installation. The GitHub Actions workflow we review below is only one of the many devops tools that the Cookiecutter provides. If you want, you can follow the README instructions in the Cookiecutter to create your own repository, pre-configured with your own AWS account information, your Open edX platform domain name and so on. It’s pretty easy.
Using CI/CD Tools with Tutor Open edX
Tutor was formally introduced to the Open edX community at the 2019 annual Open edX conference in San Diego, California. It is a Docker-based build, configuration and deployment tool that greatly reduces the complexity and the knowledge that you need to manage an Open edX installation. Tutor became the official Open edX installation tool beginning with the Maple release in fall of 2021. In this article we’ll focus on Tutor’s build function, which provides a 1-click way of creating a custom Docker image of your Open edX installation (the image is the template from which running containers are created). Tutor’s build function assembles all of the source code repositories and support libraries for your Open edX installation into a single Docker image which can then be deployed pretty much anywhere you like. It does the following:
- download github.com/openedx/edx-platform, or alternatively, a fork of this repository
- download all dependent source code repositories, noting that the code within edx-platform references many other repos
- download the many runtimes and frameworks, such as Python, Django, NodeJS and React, along with the many other system libraries
- download all of the many Python requirements from PyPI
- download any custom XBlocks that you want to add to your Open edX installation
- download your Open edX plugin, if you have one
- download your custom theme, if you have one
- compile static assets
- package all of these components into a Docker image
That’s a lot of steps, and to be sure, this is the vast majority of what happens under the hood during a traditional native installation for versions of Open edX prior to Maple. It might seem miraculous that this doesn’t result in complete anarchy, given that the combined code base inside the Docker image is maintained concurrently by dozens of developer teams from different organizations who mostly don’t communicate directly with each other. In reality, you can repeat the build process and achieve the same result each time, because these repositories use semantic versioning, and Open edX and Tutor pin the versions on which they each depend.
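As a rough sketch of how the custom inputs from the list above (a fork, an XBlock and a theme) are typically fed into a Tutor build, the steps below follow the conventions in the Tutor documentation from around the Maple release. This is not the code of the workflow reviewed in this article, the mechanism can differ between Tutor versions, and every URL, repository name and tag shown is a hypothetical placeholder.

```yaml
# Sketch only. This fragment belongs under a job's "steps:" key.
# All URLs, repository names and tags below are hypothetical placeholders.
steps:
  # Render Tutor's environment so that the env/build/openedx folder exists
  - name: Render Tutor environment
    run: tutor config save

  # Custom XBlock: add a pip requirement that is installed during the image build
  - name: Add an XBlock
    run: |-
      echo "git+https://github.com/example-org/example-xblock.git@v1.0.0" \
        >> "$(tutor config printroot)/env/build/openedx/requirements/private.txt"

  # Custom theme: clone it into a subfolder of the workspace ...
  - name: Checkout theme
    uses: actions/checkout@v2
    with:
      repository: example-org/example-theme
      ref: v1.0.0
      path: example-theme

  # ... then copy it into Tutor's build context
  - name: Add the theme
    run: cp -R example-theme "$(tutor config printroot)/env/build/openedx/themes/"

  # Custom fork: build from a pinned tag of your own edx-platform fork
  - name: Build from a fork
    run: |-
      tutor images build openedx \
        --build-arg EDX_PLATFORM_REPOSITORY=https://github.com/example-org/edx-platform.git \
        --build-arg EDX_PLATFORM_VERSION=example-tag
```

Pinning each input to a tag rather than a branch is what keeps repeated builds consistent with one another.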
The benefit of building a Docker image as opposed to installing the same software directly onto an Ubuntu instance is that you build the image once and then store it in a container registry (AWS ECR in our case, but there are many alternatives), and afterwards you can deploy it anywhere pretty easily using simple Tutor commands. I’m not going into any detail about the Tutor build itself because it’s a comparatively simple operation that is already very well documented. In contrast, this article focuses on how to combine the Tutor build procedure with other open source tools to implement robust continuous integration and continuous delivery (CI/CD) processes for your Open edX installation. I’ve been using this methodology on my larger Open edX sites for about a year now and it works great. In this article we leverage GitHub Actions to fully automate a Docker build, but you could use any other CI platform.
Now then, before we dive into the build workflow, I want to digress for a moment on why incorporating CI is beneficial. GitHub Actions is a popular and mostly-free CI/CD platform that allows you to automate your build, test, and deployment pipeline. Docker itself is highly conducive to the general principles of CI/CD. GitHub Actions can be triggered to run automatically upon, for example, any pull request to any repository that is part of your Open edX image. I became a fan of GitHub Actions about 18 months ago while working as part of a team on a large installation, for the following reasons:
- It speeds up and simplifies the development pipeline for all of the team members by automating tasks such as kicking off unit tests each time code is pushed to a repository.
- It’s coded in YAML, which is very easy to learn and to read.
- It’s stored inside of your repository, right alongside your code and configuration data.
- It provides consistency in the build and deployment pipelines, especially when there are many steps to your build, like in the example we’re going to review below.
- It provides granular role-based permissions to your team and your system user accounts, allowing you to harden security around your deployment workflows.
- It provides a great set of tools for managing passwords and other sensitive data.
- It generates logs of each of your deployments, which is enormously helpful when you need to troubleshoot something.
So, in a few words, it’s valuable technology that you should consider adding to your repertoire.
GitHub Actions Workflow
The example GitHub Actions build workflow uses Tutor to build a custom Open edX Docker image and then uploads it to AWS Elastic Container Registry (ECR).
We’re going to use this GitHub Actions workflow to automate the following operations:
- set up our workflow environment: create a virtual instance of Ubuntu and then install Tutor and the aws-cli
- authenticate to the AWS CLI using a special AWS IAM user account named ci. The key and secret are stored in GitHub Secrets in the same repository. We’ll cover this in more detail below
- leverage a prebuilt GitHub Action named actions/checkout@v2 for downloading all of the code repositories
- leverage a prebuilt GitHub Action named docker/setup-buildx-action@v1 to manage the Docker build
- leverage a prebuilt GitHub Action named aws-actions/amazon-ecr-login@v1 to manage our interactions with AWS ECR
- configure Tutor
- build a Docker image
- push it to AWS ECR
I should point out that there are many prebuilt GitHub Actions, and in general the big vendors like AWS, Azure, Google and DigitalOcean all provide high quality prebuilt GitHub Actions to facilitate integrations into their respective platforms. This GitHub organization alone maintains nearly 50 production-ready actions that do anything from setting up Python and virtual environments for you to speeding up your workflow with caching. Our example build workflow is pretty well documented, so I’m going to spend the remainder of this article explaining a few of the recurring patterns that you’ll encounter in this code.
Layout of a GitHub Actions workflow
The entire workflow is written in YAML using a limited set of commands that you can easily learn from this Getting Started guide. The workflow runs on a GitHub-hosted virtual server instance. GitHub gives you 2,000 minutes of server time for free each month, which should be more than you need in most cases. The example build workflow in this article consumes around 35 minutes each time it runs, and I usually run the workflow only a couple of times a month at most. The server instances are ephemeral and are destroyed immediately upon completion of the build workflow. You therefore have to create your entire build environment each time the workflow runs.
Per the screenshot below, this workflow runs on “workflow_dispatch” (row 16), which is a lofty way of saying that it runs when you click the “Run workflow” button on the GitHub Actions console page of your repository on the github.com site. We define our workflow environment (on row 20) as an Ubuntu 20.04 server on which we’ll need to install Tutor and the AWS CLI at the beginning of the workflow. In this section we also define a few environment-wide variables that are referenced throughout the remaining code, such as the unique identifier for our image in AWS ECR.
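For readers who prefer text to screenshots, the top of such a workflow file might look roughly like the sketch below. It is a minimal illustration rather than a copy of the example workflow: the workflow name, the job name, the variable names and their values are all assumptions.

```yaml
# Sketch only: the names and values below are illustrative assumptions.
name: Build openedx image

# Manual trigger: this is what adds the "Run workflow" button to the console page
on: [workflow_dispatch]

jobs:
  build:
    # The article uses Ubuntu 20.04; newer labels such as ubuntu-22.04 also exist
    runs-on: ubuntu-20.04

    # Environment-wide variables, visible to every step in this job
    env:
      AWS_REGION: us-east-1            # assumption: the region of your ECR registry
      AWS_ECR_REPOSITORY: openedx      # hypothetical repository name
      REPOSITORY_TAG: latest           # hypothetical image tag

    steps:
      - name: Show build target
        run: echo "Building ${AWS_ECR_REPOSITORY}:${REPOSITORY_TAG} in ${AWS_REGION}"
```

Variables declared under `env:` at the job level are exported to the shell of each `run:` step, which is why the final step can read them as ordinary environment variables.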
More about steps
I mostly learned about steps by looking at sample code. Fortunately, this example workflow contains a broad mixture of most of the kinds of things that you’ll want to do with your own workflow, and so it should be a pretty good starting point. The screenshot below demonstrates a couple of use cases of steps created with prebuilt actions along with one example of a multi-line command format.
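In text form, the two kinds of steps might look something like the following sketch. The action names come from the list earlier in this article, while the step names and the exact shell commands are assumptions made for illustration.

```yaml
# Sketch only. This fragment belongs under a job's "steps:" key.
steps:
  # Prebuilt action: clones this repository onto the runner
  - name: Checkout
    uses: actions/checkout@v2

  # Prebuilt action: prepares Docker Buildx for the image build
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v1

  # Multi-line command: "|-" starts a block of shell commands, one per line
  - name: Install Tutor
    run: |-
      pip install --upgrade pip
      pip install tutor
      tutor --version
      # assumption: the AWS CLI is already present on the hosted runner;
      # install it in this step if "aws --version" fails
      aws --version
```

A step uses either `uses:` (run a prebuilt action) or `run:` (run shell commands), never both.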
Running Tutor from inside a GitHub Actions workflow
Interacting with Tutor inside of a GitHub Actions workflow is pretty straightforward once you’ve seen a few examples. It obviously requires that you first understand the exact steps that you’d execute from the command line, which you can learn more about here: https://docs.tutor.overhang.io/. From there, you mostly just need to see working examples of the syntax for how to code different use cases.
For example, these two steps in the screen shot below build the image and then push it to AWS ECR. At a glance, it’s pretty intuitive. The build process takes around 35 minutes by the way, which is another reason why it’s a great idea to use Docker so as to minimize the occasions when you have to endure this long-running process.
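A hedged sketch of what a build-and-push sequence can look like in text form follows. The Tutor commands and the `DOCKER_IMAGE_OPENEDX` setting are standard Tutor features, but the step names, the `openedx` repository name and the `latest` tag are assumptions, and the example workflow’s actual steps may be organized differently.

```yaml
# Sketch only. This fragment belongs under a job's "steps:" key and assumes that
# AWS credentials were configured and Tutor was installed in earlier steps.
steps:
  # Log Docker in to ECR; the registry host becomes available as a step output
  - name: Login to Amazon ECR
    id: login-ecr
    uses: aws-actions/amazon-ecr-login@v1

  # Tell Tutor which image name to build and push.
  # "openedx:latest" is a hypothetical repository and tag; the ECR repository
  # must already exist.
  - name: Configure Tutor
    run: |-
      tutor config save --set DOCKER_IMAGE_OPENEDX=${{ steps.login-ecr.outputs.registry }}/openedx:latest

  # The long-running part
  - name: Build the image
    run: tutor images build openedx

  - name: Push the image
    run: tutor images push openedx
```

Because Tutor reads the image name from its own configuration, the build and push steps themselves stay identical no matter which registry you target.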
Managing passwords and other sensitive data
GitHub Actions provides an excellent way for you to integrate passwords and other sensitive data into your workflow without risk of it leaking into the public domain. See “Settings” -> “Secrets” -> “Actions” in your repository for the console screen where you can define and store all sensitive data that you need to integrate into your workflow. In the example workflow we use a single key pair for the AWS CLI. There’s a second key pair defined in the secrets section of the repository, AWS_SES_IAM_KEY / SECRET, that is used elsewhere in the repo but not in this particular workflow. Lastly, we define a GitHub Personal Access Token (PAT) that determines the workflow’s permissions within GitHub Actions itself during execution. See the screenshot above for example syntax on how to reference this data from within the GitHub Actions workflow: “${{ secrets.THE_NAME_OF_YOUR_SECRET }}”. Note that secret names may contain only letters, digits and underscores.
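To make the syntax concrete, here is a minimal sketch of two steps that consume secrets. The secret names `PAT`, `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are hypothetical, and the use of the aws-actions/configure-aws-credentials action is an assumption on my part; the example workflow may authenticate in a different way.

```yaml
# Sketch only. This fragment belongs under a job's "steps:" key.
# The secret names are hypothetical; use the names you defined in your repository.
steps:
  # Pass a Personal Access Token to the checkout action
  - name: Checkout
    uses: actions/checkout@v2
    with:
      token: ${{ secrets.PAT }}

  # Hand the ci user's key pair to the AWS tooling for the rest of the job
  - name: Configure AWS credentials
    uses: aws-actions/configure-aws-credentials@v1
    with:
      aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
      aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      aws-region: us-east-1        # assumption: the region of your ECR registry
```

GitHub masks secret values in the run logs, so they do not appear in the console output even if a command echoes them.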
Alert
Keeping your AWS credentials away from prying eyes is serious business. Take a look at the article “Enemy at the Gates” for a firsthand account of the horrors that await you if your ci credentials ever leak into the public domain, for example by accidentally pushing them to GitHub.
Running the GitHub Actions workflow
Once you add a workflow to your repository, its GitHub Actions console page will magically reformat itself into something similar to the screenshot below, noting of course that the “Run workflow” button appears because we explicitly included the command “on: [workflow_dispatch]” on row 18.
Each run of the workflow contains a detailed, timestamped log of the console output from the Ubuntu instance. It is neatly organized by the name of each step in the workflow, and you can click on any step to drill down into its detailed output.
Good luck with automating your Open edX build!! I hope you found this helpful. Contributors are welcome. My contact information is on my web site. Please help me improve this article by leaving a comment below. Thank you!