
Created: 5/19/2022

My Kubernetes-at-Home Workflow

[Image: my cluster, three desktop computers side by side] My humble Kubernetes cluster at home

Building a Kubernetes cluster for hobby projects and web apps has been a great learning experience, but the development workflow, once everything is running smoothly, is the best thing about Kubernetes. You can build an experience for yourself that you'd otherwise need to turn to a Platform as a Service provider like Heroku for, while also being in full control of the stack and the workloads that you run.

I already wrote another post about how I built my cluster, so look there for details on how to get a cluster up and running.

Motivations

The Dream

The goal here is for your source code to swiftly and easily go from:

  1. Your personal device
  2. Through a CI/CD pipeline you understand, control, and can change
  3. Into a Kubernetes cluster that you understand, control, and can change

Having top-to-bottom ownership of your web apps is a hugely educational experience, and now that I'm over the hump of moving my projects and workflow to Kubernetes, I can say it was worthwhile. I love not having to do manual deployments anymore, and the tools that support my cluster give me peace of mind and help me better understand how my web apps are being used.

The Reality

Don't dig into this unless you already have a great understanding of Docker, are ready to learn a new Infrastructure as Code tool (Terraform), and have a reason motivating you to put work into achieving something similar to Heroku.

Local Development Workflow

Hopefully this is all review, but you can't have Kubernetes without containers. Remember that Docker and containers are not the same thing: Docker is a tool for creating containers. containerd, the underlying container runtime, is used by both Docker and Kubernetes (although Kubernetes supports other container runtimes as well). This is why you can build containers with Docker and run them in your Kubernetes cluster, even though Kubernetes knows nothing about Docker.
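To make this concrete, here's a minimal sketch (the image name is hypothetical): the Docker CLI builds and pushes the image, and Kubernetes later pulls and runs it through containerd without ever talking to Docker.

# build and push an image with the Docker CLI
docker build -t jdevries3133/example:v1 .
docker push jdevries3133/example:v1

# run the same image on the cluster; the kubelet pulls it via containerd,
# and the Docker daemon is never involved
kubectl run example --image=jdevries3133/example:v1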

The local development process, therefore, should be familiar to you from working with Docker and docker-compose. For example, this is a simplified version of the docker-compose.yml for this site:

services:
  web:
    build: .
    ports:
      # http server
      - "8000:8000"
      - "8002:8002"
      - "5555:5555"
    entrypoint:
      - yarn
      - dev
    volumes:
      - .:/app
      - node_modules:/app/node_modules
    links:
      - db
    environment:
      DATABASE_URL: postgresql://app:app@db/app
      CONTACT_INQUIRY_PASSWORD: pass
  db:
    image: postgres:14
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
volumes:
  node_modules:

As far as docker-compose.yml files go, it's extremely straightforward. None of the "secrets" here are truly secret because it's just for development. By the way, the node_modules volume trick is a nifty way to mount everything except node_modules: the named volume shadows that directory within the bind mount, so the container keeps its own installed dependencies (see Stack Overflow).

Obviously, if you're doing microservices, using other services like Redis or RabbitMQ, or doing any number of other things with your app, docker-compose can easily scale to support that use case. Best of all, I really like that I can just run docker-compose up -d to run the development process in the background, and start developing anytime by changing a file. That's a much better workflow than needing to keep track of a terminal running my development script.
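For instance, adding Redis is just one more block under the existing services key. A minimal sketch (the service name and image tag are my own choices here, not from my actual projects):

  redis:
    image: redis:7

Your web service can then reach it at the hostname redis on the default port 6379.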

CI/CD

For CI/CD I've enjoyed using Terraform. Terraform is an Infrastructure as Code tool that has broad support for multiple cloud providers, and also supports Kubernetes. Terraform isn't strictly necessary, since Kubernetes has a lot of other CI/CD tooling options, but it's what I use. I also use GitHub Actions for my pipelines, and each of my projects usually has a Makefile to glue everything together.

Terraform

I like Terraform because it integrates with so many providers. For example, YAML manifests are fine until your app needs an S3 bucket for some reason. You could create the S3 bucket manually, but with Terraform, you don't need to! Just connect your AWS account and let Terraform coerce the bucket into existence for you, and you can programmatically create the links between your app and the bucket, like injecting credentials into your application pods.
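Here's a minimal sketch of that idea, assuming the AWS and Kubernetes providers are already configured; the resource names, image, and env wiring are illustrative, not taken from my actual modules:

resource "aws_s3_bucket" "app_data" {
  bucket = "my-app-data-bucket"
}

resource "kubernetes_deployment" "app" {
  metadata {
    name = "app"
  }

  spec {
    selector {
      match_labels = {
        app = "app"
      }
    }

    template {
      metadata {
        labels = {
          app = "app"
        }
      }

      spec {
        container {
          name  = "app"
          image = "jdevries3133/example:v1"

          # the link between the app and the bucket is expressed in code
          env {
            name  = "S3_BUCKET"
            value = aws_s3_bucket.app_data.bucket
          }
        }
      }
    }
  }
}

Because the env value references the bucket resource, Terraform creates the bucket before the deployment and keeps the two in sync.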

Terraform is also nice because you can modularize code. Most of my personal projects have two components: a web application container in some language or framework, and a database on the backend to store the data. Personally, that database is always PostgreSQL, which is another point of repetition ripe for automation. With Terraform, I was able to create modules for the use cases that are common for me. This allowed me to factor tons of repetitive code out of my projects and into these modules, and it also means that most of my projects have really, really short Terraform files.

For example, if you remove all the initialization preamble from the IaC for this website, this is the whole IaC file:

resource "random_password" "contact_inquiry_secret" {
  length  = 48
  special = false
}

data "external" "git_describe" {
  program = ["sh", "scripts/git_describe.sh"]
}

module "basic-deployment" {
  source  = "jdevries3133/basic-deployment/kubernetes"
  version = "0.2.0"

  app_name  = "jdv"
  container = "jdevries3133/jackdevries.com:${data.external.git_describe.result.output}"
  domain    = "jackdevries.com"

  extra_env = {
    CONTACT_INQUIRY_PASSWORD = random_password.contact_inquiry_secret.result
  }
}

As you can see, I'm using my own "basic-deployment" module. Check it out on GitHub or the Terraform Registry. I also have a module for deploying static web app containers without a database (GitHub or Terraform Registry).

Overall, the Terraform Kubernetes provider's resources are all structured exactly the same as native Kubernetes manifests, just using the syntax of the HashiCorp Configuration Language (HCL), which is what you see above. I feel that this language is amazing. Per the name, it's obviously purpose-built for configuration. The fact that HCL can be dynamic and reference variables while also being declarative makes it much better than YAML in my opinion. There are other YAML solutions that try to solve this through templating, namely Helm and Kustomize, but I think that is straight-up madness.

There are no variables in the config above, but one of my favorite little quirks of Terraform is that you declare variables but don't assign them a value until runtime. You can pass the values through a tfvars file, environment variables, or the command line, and if you run terraform from a terminal, it'll even kindly prompt you for any values you haven't yet provided. This part of Terraform's design nicely guides you towards doing the right thing with secrets, and it fits whatever workflow you want to use Terraform with.
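As a minimal sketch, a hypothetical sensitive variable (my actual site generates its secret with random_password instead) would be declared like this:

# declared with no value; Terraform resolves it at runtime
variable "smtp_password" {
  type      = string
  sensitive = true
}

You could then supply it with -var 'smtp_password=...', a terraform.tfvars file, or the TF_VAR_smtp_password environment variable; with none of those set, terraform apply will prompt for it interactively.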

You will also notice a reference to an external script that gets a description of the current commit from git. If you tag commits to track versions (you should), git describe --tags becomes an excellent place to derive identifiers for each commit. It's more descriptive than just a commit hash because it includes the most recent tag, the number of commits since then, and a short hash; for example, v1.2.0-4-g3f9d2ab means four commits past the v1.2.0 tag, ending at commit 3f9d2ab. If you run the command on a tagged commit, it just outputs the tag. The git_describe.sh script referenced before is a simple one-liner:

echo '{"output": "'"$(git describe --tags)"'"}'

Terraform's external provider, which can run scripts, will play nice with any script that outputs valid JSON.
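For example, with a hypothetical tag in history, the script produces exactly the JSON shape that the provider expects:

$ sh scripts/git_describe.sh
{"output": "v1.2.0-4-g3f9d2ab"}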

Makefile

A short Makefile is the cherry on top that brings all of this together. In my projects, I usually have three rules: test, push, and deploy. push ships the container, test is hopefully self-explanatory, and I like deploy to be the default rule, which first runs the tests, then ships the container, then deploys it with Terraform.

Again, I'll use this website's Makefile as an example:

DOCKER_ACCOUNT=jdevries3133
CONTAINER_NAME=jackdevries.com

TAG=$(shell git describe --tags)

CONTAINER=$(DOCKER_ACCOUNT)/$(CONTAINER_NAME):$(TAG)

.PHONY: deploy
deploy: push
ifdef CI
	terraform init -input=false
endif
	terraform apply -auto-approve

.PHONY: push
push:
	docker buildx build --platform linux/amd64 --push -t $(CONTAINER) .

Notice the pattern of changing behavior based on whether the CI environment variable is defined. Several CI/CD solutions set this environment variable when code is run in the CI system, so you can hook into that to do initialization or run programs in a non-interactive mode.

A version of this Makefile is in most of my projects. Using a Makefile saves you from needing to remember these commands, and it also allows you to adjust the exact procedure for different projects while continuing to use the same generic rule names. Plus, it makes the YAML file for your GitHub Action or other CI/CD solution shorter and simpler, which I'll discuss next.

Test Rule

This site doesn't have any tests yet, but here is an example of a test rule from my fast grader project.

.PHONY: test
test:
ifdef CI
	docker-compose up -d
	docker exec django_web_1 pytest
else
	@# outside CI, assume the system is already running. Also, attaching an
	@# interactive terminal causes (1) pytest to give colored output, and (2)
	@# pdb to pause at breakpoints
	docker exec -it django_web_1 pytest
endif

In a Django project, for example, tests run against a live database, and you can accomplish this easily in CI with Docker Compose.

Once again, we hook into the CI environment variable to do things slightly differently for the remote environment versus local development.

GitHub Actions

Any CI/CD solution that lets you run these commands in the cloud will make your development experience very pleasant. The moment my deployment process went from a 5-minute process to a zero-second process (just push the code), my whole Kubernetes journey came together, the fog cleared, and I could see it was all worth it.

Here is an example of a GitHub Actions workflow file for this project:

name: CI/CD

on:
  push:
    branches: ['main']

jobs:
  deploy:

    name: deploy code
    runs-on: ubuntu-latest
    environment: Kubernetes
    steps:
    - uses: actions/[email protected]
      with:
        fetch-depth: 0

    - name: login to docker hub
      uses: docker/[email protected]
      with:
        username: jdevries3133
        password: ${{ secrets.DOCKERHUB_TOKEN }}

    - name: setup kubectl
      run: |
        mkdir ~/.kube
        echo "${{ secrets.KUBE_CONFIG }}" > ~/.kube/config

    - name: run `make` to push container and deploy via terraform
      env:
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      run: make

Gotchas

fetch-depth: 0

This argument passed to the actions/checkout step causes the whole git history to be cloned. You want to ensure that the most recent tag is in history for CI, since the Docker container is going to be tagged according to the output of git describe --tags. If you only pull the latest commit, there will be no tags in history, and git describe --tags will fail, causing your whole pipeline to fail.

"${{ secrets.KUBE_CONFIG }}"

For your Kubernetes config or any YAML file passed as a secret, you need the quotes around the secret; otherwise, the whitespace will be collapsed and it won't be valid YAML anymore.

Makefile

The Makefile really helps with the CI stuff. Remember, you can run the Makefile on your machine (you can even set the CI environment variable to make it behave differently), but you can't run the GitHub Action locally. Avoid doing anything other than setup and calling make rules from the GitHub Action file. This also makes your CI/CD pipeline more portable, since all the pipeline really does is setup plus a make rule.
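For example, to exercise the CI code path on your own machine (the Makefile only checks whether CI is defined, so any value works):

# behaves as it would in the pipeline, running `terraform init` first
CI=1 make deploy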

Create & Squash Merge feat/ci-cd Branch

I've never gotten one of these pipelines working without between 5 and 30 garbage commits. Always check out a new branch, set your action to run on pushes to that branch, get it working, then squash merge it back into your main branch for a single clean commit.

# make a new branch
git checkout -b feat/ci-cd
# ... do your work, make careless commits
git checkout main  # (or master)
git merge --squash feat/ci-cd
# --squash stages the changes without committing; finish with one clean commit
git commit -m "add CI/CD pipeline"
