Created: 5/19/2022; Last Updated: 9/14/2022
Building a Kubernetes cluster for hobby projects and web apps has been a great experience for learning. When things are working well, the development experience is fantastic. You can build an experience for yourself that you'd otherwise need to turn to a Platform as a Service provider like Heroku for, while also being in full control of the stack and the workloads that you run.
I already wrote another post about how I built my cluster, so look there for details on how to get a cluster up and running. This post is about how I develop hobby apps which are deployed to my cluster.
The goal here is for your source code to swiftly and easily go from a commit on your machine to a deployment running in your cluster, with no manual steps in between.
Having top-to-bottom ownership of your web apps is a hugely educational experience, and speaking as someone who is now over the hump of moving my projects and workflow to Kubernetes, it was worthwhile. I love not having to do manual deployments anymore while also controlling the whole system from the hardware level.
Don't dig into this unless you already have a great understanding of containers (Docker), are ready to learn a new Infrastructure as Code tool (Terraform), and have a reason motivating you to put in work towards achieving something similar to Heroku. Overall, it took me months to go from zero knowledge of Kubernetes or cloud native application development in general, to the point that I have a reliable system that I enjoy using. Plus, it's likely that you'll hit some significant pain points along the way.
Hopefully this is all review, but you can't have Kubernetes without containers. Remember that Docker and containers are not the same thing: Docker is a tool for building and running containers. containerd, the underlying container runtime, is used by both Docker and Kubernetes (although Kubernetes supports other container runtimes as well). This is why you can build containers with Docker and run them in your Kubernetes cluster, even though Kubernetes knows nothing of Docker.
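To make that concrete, here's a quick sketch of the round trip (the image and deployment names are placeholders, not from any real project):

# Build an image with Docker and push it to a registry.
docker build -t yourname/yourapp:v1.0.0 .
docker push yourname/yourapp:v1.0.0

# Kubernetes pulls and runs the same image through containerd;
# Docker itself is never involved on the cluster side.
kubectl create deployment yourapp --image=yourname/yourapp:v1.0.0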
The local development process, therefore, should be familiar to you from working with Docker and docker-compose. For example, this is a simplified version of the docker-compose.yml for this site:
services:
  web:
    build: .
    ports:
      - "8000:8000"
      - "8002:8002"
      - "5555:5555"
    entrypoint:
      - yarn
      - dev
    volumes:
      - .:/app
      - node_modules:/app/node_modules # see https://stackoverflow.com/a/38601156
    links:
      - db
    environment:
      DATABASE_URL: postgresql://app:app@db/app
      ADMIN_PASSWORD: supersecret
  db:
    image: postgres:14
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app

volumes:
  node_modules:
As far as docker-compose.yml files go, it's extremely straightforward. None of the "secrets" here are truly secret because it's just for development. By the way, the node_modules volume trick is a nifty way to mount everything except node_modules (Stack Overflow).
I really like that I can just run docker-compose up -d to run the development process in the background, and then start developing anytime by changing a file.
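A typical session looks something like this (the service name web matches the compose file above):

# Start the whole stack in the background
docker-compose up -d

# Tail the dev server's output while working
docker-compose logs -f web

# Tear everything down when finished
docker-compose down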
For CI/CD I've enjoyed using Terraform. Terraform is an Infrastructure as Code tool that has broad support for multiple cloud providers, and also support for Kubernetes. Terraform isn't strictly necessary for use with Kubernetes since Kubernetes has a lot of CI/CD tooling options, but I use it because I like HCL much better than YAML, I like having easily reproducible infrastructure, and I like that I can make changes to Kubernetes deployments through code changes and let CI/CD take care of the rest. Most of my projects also have a Makefile to glue together docker, terraform, and application CLIs.
I like Terraform because it integrates with almost everything. For example, YAML manifests are fine until your app needs an S3 bucket for some reason. You could create the S3 bucket manually, but with Terraform, you don't need to! Just connect your AWS account and let Terraform coerce the bucket into existence for you. Plus, you never just need an S3 bucket. At minimum, you need an IAM user to access that bucket, an IAM policy granting that user access to the bucket, and finally the bucket itself. That's three resources needed just for simple cloud storage!
Terraform easily soaks up that complexity by allowing you to define what you need just once and then have it be forever repeatable. Plus, if you set up Terraform to run in CI/CD, all you need to do is change your Terraform code, push a new commit, and Terraform will change the infrastructure for you in CI.
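As a sketch of what those three resources look like in HCL (the resource labels and bucket name here are hypothetical, not from my actual config):

resource "aws_s3_bucket" "uploads" {
  bucket = "my-app-uploads" # hypothetical bucket name
}

resource "aws_iam_user" "app" {
  name = "my-app"
}

# An inline policy granting the user access to the bucket and its objects
resource "aws_iam_user_policy" "uploads_access" {
  name = "my-app-uploads-access"
  user = aws_iam_user.app.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
      Resource = [
        aws_s3_bucket.uploads.arn,
        "${aws_s3_bucket.uploads.arn}/*",
      ]
    }]
  })
}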
Terraform is also nice because you can modularize code. There are so many repetitive patterns in cloud native apps, like the S3 bucket example from before. Most of my personal projects have two components: a web application container in some language or framework, and a database at the backend to store the data. Personally, I always use PostgreSQL, which is another point of repetition ripe for automation. I was able to effectively factor all this repetition into Terraform modules that I published on the Terraform registry. This simplified my apps to the extent that the docker-compose files for development are longer than the Terraform configuration used in production. For example, if you remove all the initialization boilerplate from the IaC for this website, this is the whole IaC file:
resource "random_password" "contact_inquiry_secret" {
length = 48
special = false
}
data "external" "git_describe" {
program = ["sh", "-c", "echo '{\"output\": \"'\"$(git describe --tags)\"'\"}'"]
}
module "basic-deployment" {
source = "jdevries3133/basic-deployment/kubernetes"
version = "0.2.0"
app_name = "jdv"
container = "jdevries3133/jackdevries.com:${data.external.git_describe.result.output}"
domain = "jackdevries.com"
extra_env = {
ADMIN_PASSWORD = random_password.contact_inquiry_secret.result
}
}
As you see, I'm using my own "basic-deployment" module. This module is suitable for any Kubernetes cluster, so check it out on GitHub or the Terraform registry. I also have a module for deploying static web app containers without a database (GitHub or Terraform registry).
You will also notice a reference to an external script to get a description of the current commit from git. If you tag commits to track versions (you should), git describe --tags becomes an excellent way to derive identifiers for each commit. It's more descriptive than just a commit hash because it includes the most recent tag, the number of commits since then, and a short hash. If you run the command on a tagged commit, it just outputs the tag. However, note that it won't work if there aren't any tags in version history, like if you do a shallow clone.
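For example (the tag and hash here are illustrative):

# Three commits after the v1.4.0 tag:
$ git describe --tags
v1.4.0-3-ga1b2c3d

# On the tagged commit itself:
$ git describe --tags
v1.4.0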
Terraform's external provider can run any script as long as it outputs JSON.
A short Makefile is the cherry on top that brings all of this together. In my projects, I usually have three rules: check, push, and deploy.
This website is open source, so you can take a look at its Makefile as an example.
Notice the pattern of changing behavior based on whether the CI environment variable is defined. Several CI/CD solutions set this environment variable when code is run in the CI system, so you can hook into that to do initialization or run programs in a non-interactive mode.
.PHONY: deploy
deploy:
ifdef CI
	terraform init -input=false
endif
	terraform apply -auto-approve
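The push rule isn't shown in this post, but based on the container naming in the Terraform config above, a minimal sketch would look something like this (the rule in the real Makefile may differ):

# Derive the image tag from git, matching the external data source in
# the Terraform config above.
TAG := $(shell git describe --tags)
CONTAINER := jdevries3133/jackdevries.com:$(TAG)

.PHONY: push
push:
	docker build -t $(CONTAINER) .
	docker push $(CONTAINER)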
The check Rule

It is nice to have a check rule that is higher-level than "test" or "lint" specifically. The only distinction that might be useful is to be able to switch between fast checks and slow checks, since integration tests can become very slow as they grow. Either way, I find a big point of frustration with CI is when the CI server's behavior drifts away from the way I check my code locally. For example, maybe it checks formatting but I forget to do that. A Makefile can create a common interface through which the CI server can check code, and I can also check code locally. Keeping the actual CI-specific script as minimal as possible and putting all logic into the Makefile means that you never have weird CI server behavior that doesn't match what you see locally.
This is the check rule for this website:
.PHONY: check
check:
ifdef CI
	yarn install
	terraform init -backend=false -input=false
endif
	terraform fmt -check
	terraform validate
	yarn prettiercheck
	yarn typecheck
	yarn test run
ifdef CI
	# delaying this for CI means we'll get quick feedback if a quicker check
	# fails
	docker-compose up -d
endif
	make wait # this just waits for a response on `localhost:8000`
	yarn cypress
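The wait rule itself isn't shown here; a minimal sketch of what it might look like, assuming curl is available, is just a polling loop:

.PHONY: wait
wait:
	until curl --silent --fail --output /dev/null http://localhost:8000; do sleep 1; done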
Any CI/CD solution that lets you run these commands in the cloud will make your development experience very pleasant. I feel like the moment that my deployment process went from a 5-minute local process to a process I didn't need to think about (just push the code), the whole Kubernetes learning journey became worthwhile.
Here is an example of a GitHub Actions file for this project. Again, the goal is for it to be a super minimal wrapper around the Makefile, just providing environment variables where needed and defining the order in which jobs should run:
name: CI/CD

on:
  push:
    branches: ['main']
  pull_request:

jobs:
  ci:
    name: continuous integration checks
    runs-on: ubuntu-latest
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
      - name: make check
        run: make check

  push_container:
    name: build and push Docker container
    # this means that this job only runs for commits that land on the main
    # branch
    if: ${{ github.ref == 'refs/heads/main' }}
    runs-on: ubuntu-latest
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: login to docker hub
        uses: docker/login-action@v1
        with:
          username: jdevries3133
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: push container to Docker Hub
        run: make push

  deploy:
    name: deploy to Kubernetes with Terraform
    # this means that this job only runs for commits that land on the main
    # branch
    if: ${{ github.ref == 'refs/heads/main' }}
    runs-on: ubuntu-latest
    needs: [ci, push_container]
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: setup kubectl
        run: |
          mkdir ~/.kube
          echo "${{ secrets.KUBE_CONFIG }}" > ~/.kube/config
      - name: perform deployment
        env:
          # credentials are used for terraform state bucket
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: make deploy
fetch-depth: 0

This argument passed to the actions/checkout@v2 step causes the whole git history to be cloned. You want to ensure that the most recent tag is in history for CI, since the Docker container is going to be tagged according to the output of git describe --tags. If you only pull the latest commit, there will be no tags in history, and git describe --tags will fail, causing your whole pipeline to fail.
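You can reproduce this locally with a shallow clone (the repository URL is a placeholder); it fails with an error along the lines of:

$ git clone --depth 1 https://github.com/example/repo && cd repo
$ git describe --tags
fatal: No names found, cannot describe anything.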
"${{ secrets.KUBE_CONFIG }}"
For your Kubernetes config or any YAML file passed as a secret, you need the quotes around the secret, otherwise GitHub actions will collapse the whitespace and it won't be valid YAML anymore.
Make is ancient and carries a lot of weird quirks with it. In the future, I'm interested in trying out just instead of make.
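For example, a rough justfile equivalent of the deploy rule above might look like this (I haven't actually adopted just yet, so treat this as a sketch):

# justfile: recipes have the same command-per-line shape as make,
# but without .PHONY declarations or tab-sensitivity
deploy:
    terraform init -input=false
    terraform apply -auto-approve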