
Created: 5/19/2022; Last Updated: 9/14/2022

My Kubernetes-at-Home Workflow

[Photo: my cluster, three desktop computers side by side. Caption: my humble Kubernetes cluster at home]

Building a Kubernetes cluster for hobby projects and web apps has been a great learning experience. When things are working well, the development experience is fantastic. You can build for yourself an experience that you'd otherwise need a Platform as a Service provider like Heroku for, while staying in full control of the stack and the workloads you run.

I already wrote another post about how I built my cluster, so look there for details on how to get a cluster up and running. This post is about how I develop hobby apps which are deployed to my cluster.

Motivations

The Dream

The goal here is for your source code to swiftly and easily go from:

  1. Your personal device
  2. Through a CI/CD pipeline you understand, control, and can change
  3. Into a Kubernetes cluster that you understand, control, and can change

Having top-to-bottom ownership of your web apps is a hugely educational experience, and speaking as someone who is now over the hump of moving my projects and workflow to Kubernetes, it was worthwhile. I love not having to do manual deployments anymore while controlling the whole system down to the hardware level.

The Reality

Don't dig into this unless you already have a great understanding of containers (Docker), are ready to learn a new Infrastructure as Code tool (Terraform), and have a motivating reason to put in the work towards building something similar to Heroku. Overall, it took me months to go from zero knowledge of Kubernetes or cloud native application development in general to the point of having a reliable system that I enjoy using. Plus, it's likely that you'll hit some significant pain points along the way.

Local Development Workflow

Hopefully this is all review, but you can't have Kubernetes without containers. Remember that Docker and containers are not the same thing: Docker is a tool for creating containers. containerd, the underlying container runtime, is used by both Docker and Kubernetes (although Kubernetes supports other container runtimes as well). This is why you can build containers with Docker and run them in your Kubernetes cluster, even though Kubernetes knows nothing of Docker.
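
For instance, nothing Kubernetes-specific happens at image build time; the same image that Docker builds can be deployed straight to the cluster. A quick sketch (the image name and registry here are hypothetical):

# build and publish an image with Docker
docker build -t registry.example.com/myapp:v1.0.0 .
docker push registry.example.com/myapp:v1.0.0

# run it in Kubernetes; the kubelet pulls and runs the image via containerd
kubectl create deployment myapp --image=registry.example.com/myapp:v1.0.0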

The local development process, therefore, should be familiar to you from working with Docker and docker-compose. For example, this is a simplified version of the docker-compose.yml for this site:

services:
  web:
    build: .
    ports:
      - "8000:8000"
      - "8002:8002"
      - "5555:5555"
    entrypoint:
      - yarn
      - dev
    volumes:
      - .:/app
      - node_modules:/app/node_modules  # see https://stackoverflow.com/a/38601156
    links:
      - db
    environment:
      DATABASE_URL: postgresql://app:app@db/app
      ADMIN_PASSWORD: supersecret
  db:
    image: postgres:14
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app

volumes:
  node_modules:

As far as docker-compose.yml files go, it's extremely straightforward. None of the "secrets" here are truly secret, because it's just for development. By the way, the node_modules volume trick is a nifty way to mount everything except node_modules (Stack Overflow).

I really like that I can just run docker-compose up -d to start the development environment in the background, and then start developing anytime just by changing a file.

CI/CD

For CI/CD I've enjoyed using Terraform. Terraform is an Infrastructure as Code tool with broad support for multiple cloud providers, and it supports Kubernetes as well. Terraform isn't strictly necessary for Kubernetes, since Kubernetes has a lot of CI/CD tooling options, but I use it because I like HCL much better than YAML, I like having easily reproducible infrastructure, and I like being able to change Kubernetes deployments through code changes and let CI/CD take care of the rest. Most of my projects also have a Makefile to glue together Docker, Terraform, and application CLIs.

Terraform

I like Terraform because it integrates with almost everything. For example, YAML manifests are fine until your app needs an S3 bucket for some reason. You could create the S3 bucket manually, but with Terraform, you don't need to! Just connect your AWS account and let Terraform coerce the bucket into existence for you. Plus, you never need just an S3 bucket. At minimum, you need an IAM user to access that bucket, an IAM policy granting that user access to the bucket's ARN, and finally the bucket itself. That's three resources just for simple cloud storage!
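
As a rough sketch of what those three resources look like in Terraform (all the names here are hypothetical, and a real bucket would likely need more configuration):

resource "aws_s3_bucket" "app" {
  bucket = "my-app-storage"
}

resource "aws_iam_user" "app" {
  name = "my-app"
}

# grant the user access to the bucket by referencing its ARN
resource "aws_iam_user_policy" "app_bucket_access" {
  name = "my-app-bucket-access"
  user = aws_iam_user.app.name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
      Resource = [aws_s3_bucket.app.arn, "${aws_s3_bucket.app.arn}/*"]
    }]
  })
}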

Terraform easily soaks up that complexity by allowing you to define what you need just once and have it be forever repeatable. Plus, if you set up Terraform to run in CI/CD, all you need to do is change your Terraform code and push a new commit, and Terraform will change the infrastructure for you in CI.

Terraform is also nice because you can modularize code. There are so many repetitive patterns in cloud native apps, like the S3 bucket example from before. Most of my personal projects have two components: a web application container in some language or framework, and a database backend to store the data. Personally, I always use PostgreSQL, which is another point of repetition ripe for automation. I was able to factor all this repetition into Terraform modules that I published on the Terraform registry. This simplified my apps to the extent that the Terraform configuration used in production is shorter than the docker-compose files for development. For example, if you remove all the initialization boilerplate from the IaC for this website, this is the whole IaC file:

resource "random_password" "contact_inquiry_secret" {
  length  = 48
  special = false
}

data "external" "git_describe" {
  program = ["sh", "-c", "echo '{\"output\": \"'\"$(git describe --tags)\"'\"}'"]
}

module "basic-deployment" {
  source  = "jdevries3133/basic-deployment/kubernetes"
  version = "0.2.0"

  app_name  = "jdv"
  container = "jdevries3133/jackdevries.com:${data.external.git_describe.result.output}"
  domain    = "jackdevries.com"

  extra_env = {
    ADMIN_PASSWORD = random_password.contact_inquiry_secret.result
  }
}

As you can see, I'm using my own "basic-deployment" module. This module is suitable for any Kubernetes cluster, so check it out on GitHub or the Terraform registry. I also have a module for deploying static web app containers without a database (GitHub or Terraform registry).

You will also notice a reference to an external script that gets a description of the current commit from git. If you tag commits to track versions (you should), git describe --tags becomes an excellent place to derive identifiers for each commit from. It's more descriptive than a bare commit hash because it includes the most recent tag, the number of commits since that tag, and a short hash. If you run the command on a tagged commit, it just outputs the tag. However, note that it won't work if there aren't any tags in the version history, as is the case after a shallow clone.
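
For example (the tag and hash here are made up):

$ git describe --tags
v1.4.0-6-ga1b2c3d

$ git checkout v1.4.0
$ git describe --tags
v1.4.0

Here, v1.4.0-6-ga1b2c3d reads as: the most recent tag (v1.4.0), the number of commits since that tag (6), and "g" plus the short hash (a1b2c3d).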

Terraform's external provider can run any script, as long as the script writes a JSON object (with string keys and string values) to stdout.

Makefile

A short Makefile is the cherry on top that brings all of this together. In my projects, I usually have three rules: check, push, and deploy.

This website is open source, so you can take a look at its Makefile as an example.

Notice the pattern of changing behavior based on whether the CI environment variable is defined. Several CI/CD solutions set this environment variable when code is run in the CI system, so you can hook into that to do initialization or run programs in a non-interactive mode.

.PHONY: deploy
deploy:
ifdef CI
	terraform init -input=false
endif
	terraform apply -auto-approve
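
The push rule isn't shown here, but a minimal sketch, assuming the image is tagged with the output of git describe --tags as described earlier, might look like this:

TAG := $(shell git describe --tags)
CONTAINER := jdevries3133/jackdevries.com:$(TAG)

.PHONY: push
push:
	docker build -t $(CONTAINER) .
	docker push $(CONTAINER)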

check Rule

It is nice to have a check rule that is higher-level than "test" or "lint" specifically. The only distinction that might be useful is the ability to switch between fast checks and slow checks, since integration tests can become very slow as they grow. Either way, a big point of frustration with CI is when the CI server's behavior drifts away from the way I check my code locally. For example, maybe it checks formatting, but I forget to run the formatter before pushing. A Makefile creates a common interface through which both the CI server and I can check the code. Keeping the actual CI-specific script as minimal as possible and putting all the logic into the Makefile means that you never get weird CI server behavior that doesn't match what you see locally.

This is the check rule for this website:

.PHONY: check
check:
ifdef CI
	yarn install
	terraform init -backend=false -input=false
endif
	terraform fmt -check
	terraform validate
	yarn prettiercheck
	yarn typecheck
	yarn test run
ifdef CI
	# delaying this for CI means we'll get quick feedback if a quicker check
	# fails
	docker-compose up -d
endif
	make wait  # this just waits for a response on `localhost:8000`
	yarn cypress

GitHub Actions

Any CI/CD solution that lets you run these commands in the cloud will make your development experience very pleasant. I feel like the moment that my deployment process went from a 5-minute local process to a process I didn't need to think about (just push the code), the whole Kubernetes learning journey became worthwhile.

Here is an example of a GitHub Actions workflow file for this project. Again, the goal is for it to be a super minimal wrapper around the Makefile, just providing environment variables where needed and defining the order in which jobs should run:

name: CI/CD

on:
  push:
    branches: ['main']
  pull_request:

jobs:

  ci:
    name: continuous integration checks
    runs-on: ubuntu-latest
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
      - name: make check
        run: make check

  push_container:
    name: build and push Docker container
    # this means that this job only runs for commits that land on the main
    # branch
    if: ${{ github.ref == 'refs/heads/main' }}
    runs-on: ubuntu-latest
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: login to docker hub
        uses: docker/login-action@v1
        with:
          username: jdevries3133
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: push container to Docker Hub
        run: make push

  deploy:
    name: deploy to Kubernetes with Terraform
    # this means that this job only runs for commits that land on the main
    # branch
    if: ${{ github.ref == 'refs/heads/main' }}
    runs-on: ubuntu-latest
    needs: [ci, push_container]
    environment: Kubernetes
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: setup kubectl
        run: |
          mkdir ~/.kube
          echo "${{ secrets.KUBE_CONFIG }}" > ~/.kube/config

      - name: perform deployment
        env:
          # credentials are used for terraform state bucket
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: make deploy

Gotchas

fetch-depth: 0

This argument passed to the actions/checkout@v2 step causes the whole git history to be cloned. You want to ensure that the most recent tag is in history for CI, since the Docker container is going to be tagged according to the output of git describe --tags. If you only pull the latest commit, there will be no tags in history, and git describe --tags will fail – causing your whole pipeline to fail.
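
You can reproduce this locally; a minimal sketch, assuming no tag points at the single fetched commit (repo URL is a placeholder):

$ git clone --depth 1 <repo url> shallow && cd shallow
$ git describe --tags
fatal: No names found, cannot describe anything.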

"${{ secrets.KUBE_CONFIG }}"

For your Kubernetes config or any YAML file passed as a secret, you need the quotes around the secret; otherwise, GitHub Actions will collapse the whitespace and it won't be valid YAML anymore.

Makefile

Make is ancient and carries a lot of weird quirks with it. In the future, I'm interested in trying out just instead of make.