How I switched from classic hosting to Kubernetes

If you're just interested in a quick howto on setting up a Kubernetes + nginx + certbot + load balancer, then skip to making the switch.

In switching to Kubernetes, I read a lot of different guides and howtos. Many of them only work partially, and all of them solve only part of the problem. Here, I have tried to show all the steps needed to deploy a stack.

I've done sys admin stuff for my own deployed solutions for many years. I found the cloud provider I liked and set up my cli to make all the common tasks relatively simple. I taught myself apache configuration and eventually nginx configuration when that came around. I read up on all the best practices for securing your servers. I had lengthy iptables rule sets for all kinds of scenarios, which were eventually replaced by ufw.

Each of these things themselves had some level of complexity, but they were relatively simple in the sense that they were built to do one thing and do it well. There was one way to provision server instances and another way to configure web servers.

At some point, containerizing services became a lot easier with Docker, and that was just one more thing to learn to deploy my services. With docker-compose, things were actually relatively easy. I could describe all the services that were needed for one solution and easily provision them both locally for development and testing, and remotely for staging and production.

Then, at $work someone started using Kubernetes and started pushing for everyone to make the switch, so I felt it prudent to move all my own hosted services too just to keep up to date.

Kubernetes wraps all of the things mentioned above into one single type of configuration. Everything is now described in one all-encompassing way. That obviously means that there is a lot to learn in one go, and no relatively flat learning curve as was the case with each of the individual technologies that I used to use.

Obviously, that is going to be more complex than each of the individual parts, and probably also more complex than the sum of the parts.

To get started, let's have a look at what really went into configuring all the individual parts.

The details of a classic hosting setup

First off: I host all my own stuff with DigitalOcean. It was the most inexpensive solution when I started out. It is super quick and simple to get started. It takes less than a minute to get a server up and running and I don't need a search button to find my way around their dashboard. So, this is based on the assumption that you are using DigitalOcean too.

I used to have a few instances running docker swarm where I could start services as needed. I had them set up so I could remotely deploy services described in a docker-compose.yml file.

Installation was relatively easy. Create an instance with Docker from the marketplace.

In reality, I used docker-machine to set everything up from the command line. Firstly, install docker-machine on your local host:

curl -L https://github.com/docker/machine/releases/download/v0.16.0/docker-machine-$(uname -s)-$(uname -m) \
  >/usr/local/bin/docker-machine
chmod +x /usr/local/bin/docker-machine

Then create the instance:

docker-machine create --driver digitalocean \
  --digitalocean-image ubuntu-18-04-x64 \
  --digitalocean-access-token <digitalocean-access-token> \
  --digitalocean-region <region> <name>

Then I would ssh into the machine and initialize it as a Swarm instance with:

docker swarm init

I had two small helpers to allow me to switch between local and remote deployment:

# Activate a named docker machine
docker-machine-set-active() {
  eval $(docker-machine env "$1")
}

# Switch to local docker machine
alias docker-machine-local="unset DOCKER_TLS_VERIFY;unset DOCKER_HOST;unset DOCKER_CERT_PATH;unset DOCKER_MACHINE_NAME"

Assuming that I had a project folder with a Dockerfile, I could create a docker-compose.yml file to deploy both locally and remotely:

version: '3.7'
services:
  my-service:
    image: my-service
    build: ./service
    ports:
      - 8080:8080

Because the server instance was persistent, storing persistent data from one deployment to the next was done simply by adding a volumes: entry to the compose file:

version: '3.7'
services:
  my-service:
    image: my-service
    build: ./service
    ports:
      - 8080:8080
    volumes:
      - "/var/my-files:/var/my-files"

With this, files stored in /var/my-files would be stored in the host file system and be mounted into the container on deployment.

Deploying locally was easy:

docker-compose up

Deploying remotely was easy too:

docker-machine-set-active my-server
docker-compose build
docker stack deploy -c docker-compose.yml my-service

This would set the active docker host to be the remote machine, copy the build files (build context) to the remote machine, build there, and deploy the built images. I could have used an external docker registry to store the built images instead of building remotely, but this way I had one less external dependency.

Of course, then I would need to set up my nginx to terminate TLS connections and reverse proxy requests. Firstly, to install nginx, ssh into the server and run a few commands:

apt-get update
apt-get install -y nginx

Next, I added a configuration in /etc/nginx/sites-enabled/my-site.conf:

server {
    listen 80;
    server_name myserver.example.com;
    location / {
        proxy_pass http://localhost:8080;
    }
}
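In practice, a reverse proxy block usually also forwards the original client details to the service; a sketch of the same server block with the commonly used proxy headers (these headers are an assumption, not part of the original setup):

```nginx
server {
    listen 80;
    server_name myserver.example.com;
    location / {
        proxy_pass http://localhost:8080;
        # Forward the original host and client ip to the service
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```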

Next, install certbot and configure nginx:

apt-get install software-properties-common
add-apt-repository universe
add-apt-repository ppa:certbot/certbot
apt-get update
apt-get install -y certbot python3-certbot-nginx
certbot --nginx

It turns out that deploying a service has a lot of moving parts.

Making the switch

One of the reasons why each of the steps above didn't seem particularly daunting was that each step was self-contained and used one piece of software which did one thing and did it well.

Kubernetes takes all of the steps mentioned above and combines them into one.

First things first, create a cluster:

To use the DigitalOcean cli, an access token is needed. Installation instructions can be found on the DigitalOcean API Guide.

To get started on the client side, install and configure the DigitalOcean and Kubernetes CLIs on the local machine:

brew install doctl kubectl
# Authenticate with DigitalOcean
doctl auth init

You will be prompted to enter the DigitalOcean access token that you generated in the DigitalOcean control panel.

DigitalOcean access token: your_DO_token
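With authentication in place, the cluster itself can be created from the command line; a minimal sketch, where the cluster name, region, and node pool settings are assumptions:

```shell
# Create a small cluster; name, region and node pool are assumptions
doctl kubernetes cluster create my-cluster \
  --region fra1 \
  --node-pool "name=default;size=s-2vcpu-4gb;count=2"
# Point kubectl at the new cluster
doctl kubernetes cluster kubeconfig save my-cluster
```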

Before we can get started with anything on the cluster side we need a new package manager, called helm, installed on our local machine:

brew install helm

Unlike npm, apt, or pretty much any other package manager out there, helm starts out with no repositories configured, so trying to install anything will give you an error along the lines of:

Error: failed to download "stable/some-package" (hint: running helm repo update may help)

Running helm repo update, annoyingly, does nothing at all to help. You have to guess (or google) a url to an official repo to install packages from, and run:

helm repo add stable https://kubernetes-charts.storage.googleapis.com/
helm repo update

Now we are ready to start installing common things.

To install nginx, run:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/do-loadbalancer-enable-proxy-protocol"=true \
  --set-string controller.config.use-proxy-protocol=true

This will install nginx and enable proxy protocol, which is needed to ensure that the true client ip is available to the services, and not just the load balancer ip.

To enable TLS, we need to install cert-manager, which is basically the equivalent of installing certbot. Installing it has a few more steps:

# Create a name space
kubectl create namespace cert-manager
# Add the cert-manager repository
helm repo add jetstack https://charts.jetstack.io
helm repo update
# Create resource definitions
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.0/cert-manager.crds.yaml
# Install cert-manager
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.15.0

The cert-manager needs an issuer which will issue the actual TLS certificates. Create a file called production_issuer.yaml with the following contents:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Email address used for ACME registration
    email: <your email address here>
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Name of a secret used to store the ACME account private key
      name: letsencrypt-prod-private-key
    # Add a single challenge solver, HTTP01 using nginx
    solvers:
    - http01:
        ingress:
          class: nginx

The cert-manager (and other internal services which need to access the ingress directly) will fail unless the ingress is annotated with a hostname that points to the load balancer's external ip. To fix this run:

kubectl annotate svc ingress-nginx-controller service.beta.kubernetes.io/do-loadbalancer-hostname=<my load balancer hostname>
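The external ip itself can be looked up on the ingress controller's service (the service name, ingress-nginx-controller, follows from the ingress-nginx release name used above):

```shell
kubectl get svc ingress-nginx-controller \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```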

At this point, we are finally ready to start deploying our service.

Switching from docker-compose to helm charts

This deserves a chapter all on its own, because this is going to have nginx, certbot, and docker-compose all wrapped into one.

Just to remind ourselves, this is what the docker-compose.yml file looked like:

version: '3.7'
services:
  my-service:
    image: my-service
    build: ./service
    ports:
      - 8080:8080

To convert this to helm, I tried running helm create my-project to get started, but that yields a bunch of gigantic, bloated files with a ton of options, which can be information overload when you are just starting out. Instead, I took some inspiration from a minimal helm starter project.

Firstly, we need a few boilerplate files.

Chart.yaml:

apiVersion: v1
description: My project
name: my-project
version: 1.0.0

values.yaml:

# Default values

And now to the actual contents of my service to be deployed.

templates/deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-project
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-project
  template:
    metadata:
      labels:
        app: my-project
    spec:
      containers:
      - name: my-project
        image: my-user/my-project
        ports:
        - containerPort: 8080

This is the equivalent of the service name and the image: line in the classic hosting setup.

Notice the image: my-user/my-project line? This assumes that you are willing to push your images to hub.docker.com. This can be done by creating an account on docker.com and running:

docker login

There is a separate guide to setting up a private docker registry if you need that.
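Since the cluster pulls the image from a registry rather than building it on the server, the image has to be built and pushed before each deployment; a sketch, assuming the image name from the deployment above and the ./service build context from the compose file:

```shell
docker build -t my-user/my-project ./service
docker push my-user/my-project
```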

templates/service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: my-project

This maps the port (8080) that the server listens on inside the container to a cluster service listening on a port (80). This is the equivalent of the ports: line in the docker-compose.yml file in the classic hosting setup.

templates/ingress.yaml:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-service-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - my-service.example.com
    secretName: my-service-tls
  rules:
  - host: my-service.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: my-service
          servicePort: 80

This maps the cluster internal service to a public service with TLS configuration. This is the equivalent of the nginx configuration for the reverse proxy in the classic hosting setup.

If I want persistent storage, it needs to be created as a volume in DigitalOcean, either through the dashboard or through the command line. I need the command line anyway to get the volume id:

doctl compute volume create my-volume --fs-type ext4 \
  --region <my-region> --size 100GiB 
doctl compute volume list

Make a note of the ID of the newly created volume and use it to configure a persistent volume.

templates/pv.yaml:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: my-volume-pv
  annotations:
    # fake it by indicating this is provisioned dynamically, 
    # so the system works properly
    pv.kubernetes.io/provisioned-by: dobs.csi.digitalocean.com
spec:
  storageClassName: do-block-storage
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: dobs.csi.digitalocean.com
    fsType: ext4
    volumeHandle: <my-volume-id>
    volumeAttributes:
      # Don't format the volume and destroy all data if the 
      # volume is unmounted
      com.digitalocean.csi/noformat: "true"

Next, claim the volume with a persistent volume claim, so that it can be mounted into the container.

templates/pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-volume-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: do-block-storage

All of this is the equivalent of the volumes: line in the classic hosting setup.
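For the container to actually see the files, the deployment also has to reference the claim; a sketch of the additions to the pod spec in templates/deployment.yaml, where the mount path is an assumption:

```yaml
    # In templates/deployment.yaml, inside the pod template
    spec:
      containers:
      - name: my-project
        image: my-user/my-project
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: my-files
          mountPath: /var/my-files
      volumes:
      - name: my-files
        persistentVolumeClaim:
          claimName: my-volume-pvc
```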

Finally, you can go and deploy services on Kubernetes, just like you used to on your classic hosting setup.
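The chart itself is deployed with helm from the chart directory; a minimal sketch, assuming the release is named my-project:

```shell
# First deployment, run from the directory containing Chart.yaml
helm install my-project .
# Subsequent deployments
helm upgrade my-project .
```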

Retrospective

Switching from classic hosting to Kubernetes based hosting made me realize how many skills I had to acquire to configure all the moving parts in a service setup. It also made me realize that something capable of handling all of those moving parts will necessarily be very complex, and learning how to use it involves a learning curve which is probably as steep as it would have been to learn all the individual parts of the classic setup in a single sitting, and then some.

Along the way, I tested tools like Kompose, which could potentially transform my docker-compose.yml files into helm charts, but they fell short because they only support a small subset of compose keywords.

I also installed compose on kubernetes on a cluster, but that ended up being relatively expensive for my use case because it implicitly creates a load balancer for each individual service that exposes a port.

I even evaluated Fargate which ends up buying into all the hassles of relearning everything (including DNS) the AWS way.

In our team at $work, we are probably going to go the compose on kubernetes route for a while, since that seems to be the better middle ground: it keeps the simplicity of having identical docker-compose.yml files for local development and remote deployment, and keeps developer productivity high.

In my private setup, I appreciate that I can maintain a single load balancer and a cluster where I can deploy multiple services where certbot and reverse proxy configuration is now a part of the service configuration.