The rapid adoption of Tutor as a build and deployment solution for Open edX has also opened the door to using Kubernetes. But what is Kubernetes and what problem does it solve for an Open edX platform? Well, plenty! Read on to see what, why, and how you can begin leveraging this next-generation system management technology.

What is Kubernetes, anyway?

The name Kubernetes originates from Greek, meaning helmsman or pilot. The abbreviation K8s comes from counting the eight letters between the “K” and the “s”. Open-sourced in 2014, the Kubernetes project combines over 15 years of Google’s experience running production workloads at scale with best-of-breed ideas and practices from the community. But in concrete terms, what problems does it solve for an Open edX installation, and why would you want to use it?

“Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services,” says the Kubernetes website. Let’s unpack that statement in strictly Open edX terms, beginning with the part about “containerized workloads and services”.

First of all, what containers? Where did they come from? What do containers have to do with Open edX software? Well, it turns out that Tutor converts the traditionally monolithic Open edX platform into a collection of containers, so if you’re installing with Tutor then that part has already been taken care of. Not only that, Tutor provides native support for Kubernetes, so you really don’t have to do much of anything other than decide that you want to deploy to a Kubernetes cluster.
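To give you a feel for how little is involved, here’s a minimal sketch. It assumes a recent Tutor release (where the older quickstart command was renamed to launch) and a kubeconfig that already points at your cluster:

```bash
# Minimal sketch: deploy Open edX to an existing Kubernetes cluster with Tutor.
# Assumes Tutor v15+ (where "quickstart" became "launch") and a kubeconfig
# already pointing at your cluster.
pip install "tutor[full]"
tutor k8s launch
```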

Thanks, Régis!

Let’s defer the definitions of “portable” and “extensible” for the moment; instead, I’ll add my own thoughts. Kubernetes orchestrates the collection of containers that Tutor creates at deployment. That is, if you create a Kubernetes cluster consisting of more than one virtual Linux server instance, then Kubernetes takes responsibility for deciding which server each container runs on.
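You can watch those placement decisions directly. Assuming Tutor’s default namespace of openedx (adjust if you’ve customized it), the NODE column shows which server instance each pod landed on:

```bash
# Ask Kubernetes where it placed each pod; the NODE column shows the server.
# "openedx" is Tutor's default namespace -- adjust if you've customized it.
kubectl get pods --namespace openedx -o wide
```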

But how many containers does Tutor create? And even if the answer is “a bunch”, is it really necessary to introduce an industrial-grade system like Kubernetes just to spread the workloads across servers? I mean, why can’t you just break up the application stack yourself, putting the LMS on one server, the CMS on another, MySQL on another, and so on? Well, you can. And I have. And I’ve written copious articles on how to do exactly that. Kubernetes is simply a more robust way to automatically and more efficiently allocate your workloads (LMS, CMS, Forum, etc.) across your server instances.
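Under the hood, that allocation is driven by resource requests: each container declares how much CPU and memory it needs, and the scheduler packs pods onto nodes with spare capacity. Here’s an illustrative manifest, not Tutor’s own, just to show the mechanism; the names and image tag are assumptions on my part:

```bash
# Illustrative only, not Tutor's own manifest: a Deployment that tells the
# scheduler how much capacity to reserve. The scheduler places the pod on
# whichever node has at least this much spare CPU and memory.
kubectl apply --namespace openedx -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-lms          # hypothetical name, for illustration
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-lms
  template:
    metadata:
      labels:
        app: example-lms
    spec:
      containers:
        - name: lms
          image: docker.io/overhangio/openedx:latest  # assumed tag
          resources:
            requests:
              cpu: "500m"    # reserve half a CPU core
              memory: "2Gi"  # reserve 2 GiB of RAM
EOF
```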

Let’s look at a practical example. The screenshot below shows a Kubernetes cluster managing multiple Open edX environments for dev, test, staging, and prod, all for the same client. Each environment is grouped into a “namespace”, and you can see that each namespace contains individual pods (a pod is a collection of one or more containers) for lms, cms, worker threads, forum, miscellaneous jobs, and a handful of system-related pods.

All of these pods were created by Tutor, in mutually exclusive deployments. I safely deployed each of these environments into the same cluster using Tutor, with no risk whatsoever of the environments accidentally bleeding into one another. That simply does not happen with Kubernetes. A cursory review of the “AGE” column shows that some of these containers have been running for upwards of a year. In fact, this screenshot presents fewer than half of the pods on this cluster, which is running more than 50 pods right now.

The “NODE” column indicates which AWS EC2 instance each container is running on, which, in point of fact, is completely irrelevant to me, because Kubernetes takes care of all of this with no human intervention whatsoever. If a server node were to fail, which does happen on occasion, Kubernetes would simply spin up a replacement EC2 instance and redeploy the containers.
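The namespace isolation above is easy to reproduce: give each environment its own Tutor root and its own namespace. A hedged sketch, using K8S_NAMESPACE, which is a standard Tutor setting; the environment names here are my own examples:

```bash
# One cluster, several isolated environments: give each its own Tutor root
# and namespace. K8S_NAMESPACE is a standard Tutor setting; the environment
# names below are my own examples.
export TUTOR_ROOT=~/envs/staging
tutor config save --set K8S_NAMESPACE=openedx-staging
tutor k8s launch

# Inspect one environment without touching the others:
kubectl get pods --namespace openedx-staging
```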

You could obviously do this without Kubernetes, and I have. However, you’d need at least twice the number of EC2 instances, your workloads would not be as well balanced or as stable, and you’d need to take care of Ubuntu updates and uptime yourself.

So, to summarize with the help of the illustration below: the “Traditional Deployment” represents an on-premise installation of Open edX on a single physical server, the “Virtualized Deployment” represents a partially horizontally-scaled installation of the same platform running on virtual server instances instead of bare hardware, and the “Container Deployment” represents our Kubernetes cluster above.

But wait, there’s more!

Stability

Kubernetes actually provides you with multiple ways to improve your uptime. As I mentioned above, Kubernetes will intervene in the event of an all-out EC2 instance failure in which a computing node becomes completely unresponsive. But it also does the same thing if an individual pod becomes unresponsive. In fact, Kubernetes is designed to manage multiple replicas of the same service, as many as you think you might need. For example, you could deploy the LMS service as a collection of six replicas, and Kubernetes would keep six pods running at all times, rescheduling any replica that fails.
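Here’s a hedged sketch of what that looks like in practice, assuming Tutor’s default Deployment name of lms in the openedx namespace; adjust both if yours differ:

```bash
# Scale the LMS to six replicas. "lms" is the Deployment name Tutor uses by
# default, in the "openedx" namespace -- adjust both if yours differ.
kubectl scale deployment lms --replicas=6 --namespace openedx

# Kubernetes now keeps six LMS pods running, replacing any that fail:
kubectl get deployment lms --namespace openedx
```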