Use Terraform to create a production-ready Kubernetes cluster on AWS EKS

Note: Source code for this video tutorial is available in this “GitHub repository”.

Proudly Brought To You by @FullStackWithLawrence on YouTube

All production Kubernetes clusters that I have created thus far in my career share a common set of requirements for security, scaling, monitoring, and configuration, regardless of the application software running in the cluster. This video series demonstrates how I approach the life cycle of each of these requirements, along with my rationale for how I’ve settled on the set of tools that I use.

I’ve used this stack for a variety of use cases, including the Open edX online learning management system, a white-label WordPress hosting platform, Wolfram Application Server, and various custom Django applications running at scale.

The Production-Ready Kubernetes Stack Consists of the Following

Follow this 3-part YouTube video tutorial to stand up a production-ready Kubernetes cluster in around 45 minutes. This tutorial includes fully automated Terraform code and instructions for setting up your local environment with all required software. The Terraform code will create and install the following:

  • AWS Virtual Private Cloud (VPC). This not only provides better security and performance, but also encapsulates and isolates all of the resources that this tutorial creates, helping you ensure that every resource is completely destroyed afterwards. (See the VPC sketch after this list.)
  • AWS Route53 Hosted Zone and DNS records for your primary domain/subdomain. Terraform will create DNS records that map to the AWS EC2 Classic Load Balancer created by the Nginx Ingress Controller Helm chart. Additionally, cert-manager will rely on this Hosted Zone to create challenge records for Let’s Encrypt-generated TLS/SSL certificates. (See the DNS sketch after this list.)
  • All AWS Security Groups for internal in-cluster firewall rules. Creating security rules the right way is tedious, and you face potentially catastrophic risks if they are implemented incorrectly, so automating their implementation and management is a pretty big deal. These security groups come with the Terraform AWS EKS module published in the official Terraform Registry, which I’ve supplemented with a few additional ingress rules of my own.
  • All AWS IAM roles and policies. These are also implemented and managed via the Terraform AWS EKS module, and my comments above about the risks of implementing these on your own apply here as well.
  • Kubernetes cluster running the latest stable version of Kubernetes (version 1.27 as of this publication date)
  • EKS Managed Node Group preconfigured to run EC2 spot-priced instances. Managed Node Groups are a proprietary feature of AWS EKS that provides a means of tailoring the kinds of EC2 compute resources that your cluster will use for various use cases. Importantly, the Managed Node Group in this example demonstrates how to use EC2 spot-priced instances, which typically save around 67% on the compute cost of your cluster. That is to say, this one feature essentially pays for the cost of using the AWS EKS service, so this too is a pretty big deal. (See the node group sketch after this list.)
  • EKS add-on for Elastic File System (EFS). I’ve provided this as an optional additional Terraform module, for use cases that require “file server” style multi-client access to file systems. You would need this, for example, if all of your running compute nodes (i.e. EC2 instances) require access to a single drive volume. Implementing EFS is non-trivial and also not particularly well documented by AWS, so this code might help you enormously.
  • EKS add-on for Elastic Block Store (EBS), which is newly required as of Kubernetes version 1.24 and non-trivial to set up. Unlike other EKS add-ons, EBS has additional IAM policy and role requirements that must be created separately, so once again these code samples might be especially helpful to you. (See the EBS CSI sketch after this list.)
  • EKS Container Networking Interface (CNI) add-on. The Amazon VPC CNI plugin for Kubernetes add-on is deployed on each Amazon EC2 node in your Amazon EKS cluster. The add-on creates elastic network interfaces and attaches them to your Amazon EC2 nodes. The add-on also assigns a private IPv4 or IPv6 address from your VPC to each Pod and service.
  • EKS kube-proxy add-on. Kube-proxy maintains network rules on each Amazon EC2 node. It enables network communication to your pods. For more information see kube-proxy in the Kubernetes documentation.
  • EKS CoreDNS add-on. CoreDNS is a flexible, extensible DNS server that can serve as the Kubernetes cluster DNS. When you launch an Amazon EKS cluster with at least one node, two replicas of the CoreDNS image are deployed by default, regardless of the number of nodes deployed in your cluster. The CoreDNS Pods provide name resolution for all Pods in the cluster.
  • Helm-installed Kubernetes Vertical Pod Autoscaler (VPA). VPA monitors your pods’ CPU and memory usage in real time and adjusts their resource requests as necessary to ensure that compute resources are optimally allocated across the pods running on your cluster. This is a pretty big deal in terms of ensuring that your cluster runs cost effectively.
  • Helm-installed Nginx Ingress Controller. Unlike the Nginx Docker container that is regularly used as a “hello world” kind of Kubernetes deployment example, the Nginx Ingress Controller also implements all of the other cloud resources you’ll need in order to set up HTTP URL endpoints for the services that run in your cluster. This is a non-trivial installation that includes load balancing, SSL/TLS certificate management, auto-scaling policies, and various security considerations. The Nginx Ingress Controller needs to recognize, cooperate, and coordinate with various other major subsystems that coexist in your cluster, which is all to say that this, once again, is a pretty big deal and might help you a lot. (See the helm_release sketch after this list.)
  • Helm-installed Karpenter, to assist in the management of spot-priced EC2 instances. Karpenter is an open source project that, fittingly, was originally created by AWS engineers to support the amazon.com and AWS websites. It is now community supported and collectively helps companies save millions of dollars annually in cloud computing costs. It was previously challenging to install and configure, until the community published the Helm chart that this code sample uses.
  • Helm-installed Metrics Server. Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes’ built-in autoscaling pipelines. It collects resource metrics from kubelets and exposes them in the Kubernetes API server through the Metrics API, for use by the Horizontal Pod Autoscaler and Vertical Pod Autoscaler. The Metrics API can also be accessed by kubectl top, making it easier to debug autoscaling pipelines.
  • Helm-installed Prometheus. Prometheus offers an open-source monitoring and alerting toolkit designed especially for microservices and containers. Prometheus monitoring lets you run flexible queries and configure real-time notifications.
  • Helm-installed cert-manager. cert-manager adds certificates and certificate issuers as resource types in Kubernetes clusters, and simplifies the process of obtaining, renewing and using those certificates. It can issue certificates from a variety of supported sources, including Let’s Encrypt, HashiCorp Vault, and Venafi, as well as private PKI. (The helm_release sketch after this list shows the installation pattern.)
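
To make the VPC piece concrete, here is a minimal sketch of how it can be expressed with the community terraform-aws-modules/vpc module from the Terraform Registry. The name, availability zones, and CIDR ranges are illustrative assumptions, not the tutorial’s actual values:

```hcl
# Minimal VPC sketch: private subnets for worker nodes, public subnets for
# load balancers, plus the subnet tags that EKS load balancing expects.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "eks-tutorial" # hypothetical name
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  # NAT gateway so nodes in private subnets can still reach the internet
  enable_nat_gateway = true
  single_nat_gateway = true

  public_subnet_tags  = { "kubernetes.io/role/elb" = 1 }
  private_subnet_tags = { "kubernetes.io/role/internal-elb" = 1 }
}
```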
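And here is a hypothetical sketch of the DNS wiring: Terraform reads back the load balancer hostname exposed by the Nginx Ingress Controller’s Service, then publishes a record for it in the hosted zone. The domain, record name, and Service coordinates are all assumptions:

```hcl
# Read the ingress controller's Service to discover its load balancer hostname.
data "kubernetes_service" "ingress_nginx" {
  metadata {
    name      = "ingress-nginx-controller" # the chart's default Service name
    namespace = "ingress-nginx"
  }
  depends_on = [helm_release.ingress_nginx] # defined in the Helm sketch below
}

resource "aws_route53_zone" "main" {
  name = "example.com" # assumption: your primary domain
}

resource "aws_route53_record" "apps" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "*.apps.example.com" # assumption: one wildcard record for all services
  type    = "CNAME"
  ttl     = 300
  records = [data.kubernetes_service.ingress_nginx.status[0].load_balancer[0].ingress[0].hostname]
}
```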
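The spot-priced Managed Node Group boils down to a single attribute on the EKS module. A minimal sketch, assuming the community terraform-aws-modules/eks module and illustrative instance types and sizes:

```hcl
# Cluster plus a managed node group running spot-priced instances.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name    = "eks-tutorial"
  cluster_version = "1.27"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_groups = {
    default = {
      # SPOT capacity is the one-line change behind the ~67% savings
      capacity_type  = "SPOT"
      instance_types = ["t3.large", "t3a.large"] # assumption

      min_size     = 1
      max_size     = 10
      desired_size = 3
    }
  }
}
```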
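For the EBS add-on, a sketch of the extra IAM work looks roughly like this, assuming the community IAM roles-for-service-accounts submodule; the role and service account names follow the driver’s documented defaults:

```hcl
# IAM role that the EBS CSI driver's controller assumes via IRSA.
module "ebs_csi_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name             = "ebs-csi-controller" # hypothetical name
  attach_ebs_csi_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
    }
  }
}

# The add-on itself, wired to the role above.
resource "aws_eks_addon" "ebs_csi" {
  cluster_name             = module.eks.cluster_name
  addon_name               = "aws-ebs-csi-driver"
  service_account_role_arn = module.ebs_csi_irsa.iam_role_arn
}
```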
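Each of the Helm-installed components above (the ingress controller, cert-manager, VPA, Karpenter, Metrics Server, Prometheus) follows the same helm_release pattern. Shown here for the Nginx Ingress Controller and cert-manager; the namespaces are assumptions and chart versions are left unpinned for brevity:

```hcl
# Nginx Ingress Controller from its official chart repository.
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  create_namespace = true
}

# cert-manager, with its CRDs installed by the chart.
resource "helm_release" "cert_manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  namespace        = "cert-manager"
  create_namespace = true

  set {
    name  = "installCRDs"
    value = "true"
  }
}
```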

Setting Up Your Local Dev Environment

The Terraform code depends on a LOT of software that should be installed on your local computer and ready to go, including:

  • AWS CLI. The AWS Command Line Interface (AWS CLI) is an Amazon Web Services tool that enables developers to control Amazon public cloud services from the command line. It is one of several methods a developer can use to create and manage AWS resources. You not only need to have this installed, but also configured with an IAM access key pair for an IAM user with sufficient admin-level permissions to your AWS account.
  • kubectl. kubectl is the Kubernetes-specific command line tool that lets you communicate with and control Kubernetes clusters. Whether you’re creating, managing, or deleting resources on your Kubernetes platform, kubectl is an essential tool. It is the canonical command-line client for the Kubernetes API; alternative tools such as k9s are essentially friendlier front ends to that same API.
  • Terraform. Terraform is an open-source infrastructure-as-code tool created by HashiCorp. Users define and provision data center infrastructure using a declarative configuration language known as HashiCorp Configuration Language (HCL). Terraform is extensible, and all major cloud infrastructure platforms publish and maintain a Terraform provider; this repo, for example, makes extensive use of the Terraform AWS Provider. Providers also exist for individual technologies, including Kubernetes and Helm. (A provider wiring sketch follows this list.)
  • Helm. Helm is a tool that automates the creation, packaging, configuration, and deployment of application software to Kubernetes by combining your configuration files into a single reusable package.
  • k9s. K9s is a terminal based UI to interact with your Kubernetes clusters. The aim of this project is to make it easier to navigate, observe and manage your deployed applications in the wild. K9s continually watches Kubernetes for changes and offers subsequent commands to interact with your observed resources.
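
To give a feel for how these tools fit together in the Terraform code, here is a hedged sketch of the provider wiring: the aws provider builds the cluster, and the kubernetes and helm providers then authenticate to it using credentials read back from EKS. The versions and region are assumptions, and module.eks refers to the EKS module sketched earlier:

```hcl
terraform {
  required_providers {
    aws        = { source = "hashicorp/aws", version = "~> 5.0" }
    kubernetes = { source = "hashicorp/kubernetes", version = "~> 2.23" }
    helm       = { source = "hashicorp/helm", version = "~> 2.11" }
  }
}

provider "aws" {
  region = "us-east-1" # assumption
}

# Read back the cluster endpoint, CA certificate, and a short-lived token.
data "aws_eks_cluster" "this" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
```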

I installed all of these packages on my MacBook Air using Homebrew, which is itself something you should consider installing if you haven’t already.

Good luck with your next steps! I hope you found this helpful. Contributors are welcome. My contact information is on my web site. Please help me improve this article by leaving a comment below. Thank you!