From "It Works" To "It Scales"

Well... it scales slightly better, at least 😅

TL;DR: Each worker now handles a subset of the timelines of the users who have signed up.

I recently got to a point where combine.social started running out of memory. What that meant for users was that, during peak hours, they might not get updates for remote replies in their timelines.

This was not ideal, obviously, so something had to happen. Luckily, I had anticipated from the outset of the project that this moment would come. My chosen solution was a mono-repo structure in which I could split some of the functionality out of the API service into shared packages, and some of it into separate apps.

Mono-Repo Folder Structure

What this means is that now the folder structure looks like this:

apps
├── api
├── web
└── worker
packages
├── repository
└── types
tree -d -L 1 apps packages

In the past, the API service also handled the actual processing of users' timelines; that part has now been split out into a separate microservice.

There used to be a services folder inside apps/api/src:

apps
├── api
│   └── src
│       ├── lib
│       ├── routes
│       └── services
...
tree -d -L 3 apps

This folder contained all the logic for looping through timelines and fetching all the remote reply URLs.

This, of course, also means that there is now some shared logic (for loading the auth tokens), which has been moved into a package.
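
To give a feel for what moved where, here is a minimal sketch of the worker's main loop. Every name in it (loadTokens, fetchHomeTimeline, findRemoteReplyUrls, resolveRemoteUrl) is a hypothetical placeholder I made up for illustration, not the actual combine.social code:

// A minimal sketch of the loop that moved out of the API service. Every name
// below is a hypothetical placeholder, not the real implementation.
type Status = { url: string };
type Token = { accessToken: string; instanceUrl: string };

// Placeholder: in the real project the tokens come from the shared repository package.
async function loadTokens(): Promise<Token[]> {
  return [];
}

// Placeholder: fetch the user's home timeline from their instance.
async function fetchHomeTimeline(token: Token): Promise<Status[]> {
  return [];
}

// Placeholder: find replies to a status that live on other instances.
async function findRemoteReplyUrls(token: Token, status: Status): Promise<string[]> {
  return [];
}

// Placeholder: ask the user's instance to resolve (and thereby import) a remote status.
async function resolveRemoteUrl(token: Token, url: string): Promise<void> {}

// The overall shape of the worker: walk every timeline and pull in the remote replies.
async function processTimelines(): Promise<void> {
  for (const token of await loadTokens()) {
    for (const status of await fetchHomeTimeline(token)) {
      for (const url of await findRemoteReplyUrls(token, status)) {
        await resolveRemoteUrl(token, url);
      }
    }
  }
}

The real loop also has to deal with rate limits and errors, but this is the shape of what now lives in apps/worker instead of apps/api.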

Sharding Tokens

Sharding the auth tokens between several workers was done as naively as possible. There are three parts to it:

  1. Adding a worker_id column to the tokens table.
  2. Ensuring that all new signups get evenly distributed.
  3. Evenly distributing existing tokens.

Step one involved creating a new migration:

alter table tokens add column
  worker_id int not null
  default 1;
migration.sql

Evenly distributing new signups is simple enough:

// WORKER_COUNT comes from the environment; fall back to a single worker if it is unset.
const workerCount = parseInt(process.env.WORKER_COUNT ?? '1', 10);

async function getWorkerId(): Promise<number> {
  const tokenCount = await getTokenCount();
  return (tokenCount % workerCount) + 1;
}
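
getTokenCount is not shown above. As a rough illustration, assuming the tokens live in the Postgres table from the migration and using node-postgres, it could look something like the sketch below; the saveToken function and its token column are likewise assumptions of mine, showing where getWorkerId would be called at signup time:

import { Pool } from 'pg';

// Connection settings come from the standard PG* environment variables.
const pool = new Pool();

// Hypothetical: count all stored tokens so that new signups round-robin across workers.
async function getTokenCount(): Promise<number> {
  const result = await pool.query('select count(*)::int as count from tokens');
  return result.rows[0].count;
}

// Hypothetical signup path: persist the token together with its assigned worker.
// Only worker_id is known from the migration; the token column is an assumption,
// and getWorkerId is the function from the snippet above.
async function saveToken(accessToken: string): Promise<void> {
  await pool.query(
    'insert into tokens (token, worker_id) values ($1, $2)',
    [accessToken, await getWorkerId()],
  );
}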

Splitting the existing tokens between the two workers was easy enough too:

update tokens 
  set worker_id = 2
  where id % 2 = 1;
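
That statement only covers the two-worker case. If more workers are added later, the same modulo scheme from getWorkerId generalizes; a hypothetical one-off rebalancing script (node-postgres again, names mine) might look like this:

import { Pool } from 'pg';

const pool = new Pool();
const workerCount = parseInt(process.env.WORKER_COUNT ?? '1', 10);

// Hypothetical: spread every existing token across all workers using the
// same (id % workerCount) + 1 scheme as the two-worker update above.
async function rebalanceTokens(): Promise<void> {
  await pool.query('update tokens set worker_id = (id % $1::int) + 1', [
    workerCount,
  ]);
}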

Building Images

The next part is building images. The three images are built in nearly identical ways, so to avoid repetitive Dockerfiles I used Earthly to share the common build logic (such as multi-stage builds). Building a service becomes (nearly) as simple as:

VERSION 0.7
IMPORT ../.. AS root

all:
  BUILD root+prune --app='web'
  BUILD root+build --app='web'
  FROM root+final --app='web'
  CMD [ "node", "/app/apps/web/dist/index.js" ]
  SAVE IMAGE cyborch/toottail-web:latest

For the API service, I replaced 'web' with 'api' above, and for the worker service I (surprise) replaced it with 'worker'.

Deploying

I am running all of this in a Kubernetes cluster (on DigitalOcean - I highly recommend them). There are some details of Mastodon's rate limiting that mean I cannot run multiple copies of the same worker service from the same source IP address. What this means is that I have to ensure that the worker pods get deployed on separate nodes. This involves using podAntiAffinity (which you can read a lot more about in the Kubernetes docs). For me, it looks like this:

{{- define "worker" -}}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .workerName }}
spec:
  selector:
    matchLabels:
      app: toottail-worker
  template:
    metadata:
      labels:
        app: toottail-worker
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - toottail-worker
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: worker
        image: cyborch/toottail-worker
        env:
        - name: WORKER_ID
          value: {{ .workerId }}
        ...
{{- end -}}
---
{{- include "worker" (dict "workerName" "toottail-worker-1" "workerId" "'1'") }}
---
{{- include "worker" (dict "workerName" "toottail-worker-2" "workerId" "'2'") }}
worker.yaml
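
The WORKER_ID environment variable is what ties a deployment back to the sharded tokens table. I am only guessing at the exact query, but the worker presumably selects just its own shard, roughly along these lines (node-postgres again, names hypothetical):

import { Pool } from 'pg';

const pool = new Pool();
const workerId = parseInt(process.env.WORKER_ID ?? '1', 10);

// Hypothetical: each worker loads only the tokens assigned to its shard.
async function loadTokensForThisWorker(): Promise<{ id: number }[]> {
  const result = await pool.query(
    'select * from tokens where worker_id = $1',
    [workerId],
  );
  return result.rows;
}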

Right now, this works with two workers. Should this scale to a lot of nodes, the two template invocations at the bottom would probably turn into a loop using Helm's range.

Future Perspectives

Right now, each user who signs up costs me about 50¢/month, ignoring the fixed overhead of running the cluster and the cost of the database, both of which I would be running anyway.

That isn't enough to warrant asking users to pay for the service, but if it starts scaling to thousands of users, then I will need some sort of monetization just to stay afloat.

I have slowly started to look into open-source funding options, which I would much prefer over asking users for money.

But then again, maybe this functionality will make it into the Mastodon core distribution one day, and then I won't even need to keep running this 🤞
