From AWS Lambda to KEDA

Part III of the war story of how we slashed 80% of our cloud costs: replacing AWS Lambda with a Kubernetes-native solution


Why we replaced AWS Lambda

As explained roughly in part 1, letting AWS Lambda go is quite the challenge. The combination with SQS and almost‑instant scaling is hard to beat. Also, since we spent some years using this solution, we had fine‑tuned it to our use case quite a bit. The new solution should at least be at the same level as the one we already had.

This was hard, partly because we had to rethink the whole thing and let go of the SQS + Lambda paradigm we were so set on, and partly because we already had a great solution. We were half‑hearted about looking for another one after spending so much of the last few years fine‑tuning this one.

AWS Lambda is probably one of the best solutions for serverless out there. But… we had to let it go.

Our database already lived in DigitalOcean (DO), and we were planning to move our compute resources into DO Managed Kubernetes as well. There was no point in having Lambda generate useless ingress/egress traffic costs.

In numbers: using SQS + Lambda, we processed about 4 million messages daily across 120 queues with different priorities, using Laravel + RoadRunner. Click here for the full write‑up.

This actually marks our full circle back to a more traditional infrastructure. So buckle up — there are a lot of things to go through.

The journey to replace AWS Lambda

AWS Lambda truly lives up to the hype (well, mostly). It scales fast, effortlessly, and (almost) limitlessly. Combined with SQS, it is tough to beat with a more traditional infrastructure — but definitely not impossible.

So off I went to find some replacements. The first one I came across was OpenFaaS, which looked very promising. The more I read about it, the better it looked. But I couldn’t understand their pricing, their features, their licensing… Crap. So messy.

Then came OpenWhisk, what a cool project. So. Close. Sadly, it seemed pretty tough to configure for our own custom setup. Lots of Java bits. No out‑of‑the‑box message broker bridges; I found some archived ones, which did not inspire much confidence.

And then… nothing else. Searching for AWS Lambda replacements seemed fruitless.

I kept coming up empty, so I had to really break down what AWS Lambda + SQS does for us.

Quite simply:

Take message from queue and send to worker.
If lots of messages, create lots of workers.

Ok. So I need to scale workers based on queue size. Seems easy enough. A plan was forming:

  • Replace SQS with some other message broker.
  • Make some custom Kubernetes scaling controller for worker deployments that reads queue sizes and scales workers up or down.
  • Use DO node pools to scale based on pod demand, i.e., if pods have no room on any node, just scale the nodes.

For the message broker, I chose RabbitMQ because we used it before and it’s mature. I also like Erlang — yeah.

I also knew that RabbitMQ comes with a pretty great set of metrics and a good API, so connecting our custom Kubernetes controller would be easy enough.

RabbitMQ — Helm chart

What a treat this part was. RabbitMQ now has a Helm chart that you can use to very easily spin up and manage your RabbitMQ cluster inside Kubernetes.

We went for a 3‑node cluster. It should handle our workload just fine. After configuring the chart to pick the nodes we chose for our cluster, it was off to the races.
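
For the curious, the values we had to touch fit in a few lines. Here is a minimal sketch, assuming the Bitnami rabbitmq chart; the node label and storage size are illustrative rather than our exact settings:

```yaml
# values.yaml for the RabbitMQ Helm chart (Bitnami), a sketch, not our production config
replicaCount: 3                  # three-node cluster
clustering:
  enabled: true                  # form a single RabbitMQ cluster from the three pods
nodeSelector:
  workload: rabbitmq             # hypothetical label on the nodes we set aside for the cluster
persistence:
  enabled: true
  size: 8Gi                      # illustrative volume size
metrics:
  enabled: true                  # Prometheus metrics, handy alongside the management API
```

Installed with something like helm install rabbitmq bitnami/rabbitmq -f values.yaml, and the cluster was up.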

Love when things are this easy.

Initial Kubernetes custom scaler

Using Go, the Kubernetes SDK, and a RabbitMQ library, I was able to make a very basic RabbitMQ‑based custom scaler. It would also poll a custom ConfigMap for each worker, which held information such as:

  • worker name (i.e., the Deployment name)
  • queues to listen to
  • min/max scale of the worker
  • how many messages trigger the scale‑up
  • the delay for the scale‑down
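
In practice, each worker’s ConfigMap looked something like the sketch below. The worker name, queue names, and thresholds are made up for illustration, and the keys are hypothetical; they just map the fields listed above:

```yaml
# One ConfigMap per worker, read by the custom scaler (keys and values are illustrative)
apiVersion: v1
kind: ConfigMap
metadata:
  name: scaler-email-worker          # hypothetical worker
  labels:
    custom-scaler/managed: "true"    # label the controller could watch for
data:
  deployment: email-worker           # the worker Deployment to scale
  queues: emails.high,emails.low     # queues whose backlog is summed
  minReplicas: "1"
  maxReplicas: "30"
  messagesPerScaleUp: "100"          # backlog size that triggers a scale-up
  scaleDownDelaySeconds: "120"       # how long queues must stay quiet before scaling down
```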

It only took one evening to build this. It was pretty good, but I did feel like something was off — like I was reinventing the wheel.

It was all ready to go. I spent some time writing scripts to validate that we have a ConfigMap defined for every queue we use.

In the past, I would sometimes forget to define queues that the devs added, so we added these checks to the pre‑check phase of our deployment script.

KEDA — a dream come true

KEDA (Kubernetes Event‑Driven Autoscaling) is a Kubernetes component that lets you scale applications based on external event sources, like message queues or metrics, instead of just CPU and memory. It works alongside the Kubernetes Horizontal Pod Autoscaler (HPA).

I found this by browsing the CNCF projects page. I was showing a friend how he could gauge what he can or cannot do with Kubernetes, which was ironic, because I had never checked that page that thoroughly myself.

Anyway, my jaw dropped to the floor. It was EXACTLY what I was looking for. It could very easily support RabbitMQ or any other source of metrics.

Each worker would need a Deployment plus a scaling config (a ScaledObject with its trigger configuration). The ScaledObject YAML for each worker was quite similar to the custom ConfigMap from our initial Kubernetes custom scaler, so we adapted that ConfigMap format to act as the source for a small generator script that produced the required files for every worker we needed.
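
For reference, the per‑worker output of that generator looks roughly like the following. This is a sketch rather than our exact files: the worker name, queue, and thresholds are illustrative, and the RabbitMQ connection string is assumed to be provided through a Secret and a KEDA TriggerAuthentication.

```yaml
# Connection details for the KEDA RabbitMQ trigger (illustrative values)
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-conn
stringData:
  host: amqp://user:password@rabbitmq.default.svc.cluster.local:5672/
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-conn
      key: host
---
# One ScaledObject per worker Deployment (names and numbers are illustrative)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: email-worker
spec:
  scaleTargetRef:
    name: email-worker               # the worker Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 30
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 120   # scale-down delay, like in our old ConfigMap
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: emails.high
        mode: QueueLength
        value: "100"                 # target backlog per replica
      authenticationRef:
        name: rabbitmq-auth
```

KEDA turns each ScaledObject into an HPA under the hood, which is why it plays so nicely with the rest of the cluster.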

Mind you, we were one week away from releasing our initial Kubernetes custom scaler when we decided to replace it with KEDA. Even though we only had a week, the setup was so straightforward and our tests looked so good that we moved forward very fast.

But this is not the end of the story. We could now scale workers based on the number of messages on the queues, which is great. But how well do our workers perform?

Moving from SQS to RabbitMQ also brings some advantages, and we fully intended to use them.

New worker architecture

I’m not going to go into much detail about why the basic way of running Laravel workers is so bad, since I already wrote about it, and about our better worker architecture on AWS Lambda, here.

We now had an opportunity to change our worker architecture once more. The main reason: RabbitMQ gave us a distinct advantage in that we could stop polling entirely.

In fact, not polling RabbitMQ for each new message is the recommended way. You basically have your consumer “subscribe” to a queue (in RabbitMQ land this is called basic_consume) and then the broker pushes new messages to you as they come. This is quite efficient, and it also helps with the broker’s resource use. Having real‑time data on which consumers are active and which are not is pretty great, too.

We could have used RoadRunner for this, but there was one particular nagging reason not to: the plugin’s build system was always so weird (also explained in the link above). Every time I rebuilt the plugin, even with the same tags, something would fail.

So we decided to use Swoole going forward. Since first trying it out a couple of years ago, I had become more acquainted with it. Confusing documentation aside, the source code was pretty readable, and I could soon build a prototype of the new worker architecture.

The way we envisioned it was like this:

Architecture Diagram
  1. We would have a main PHP process that gets messages from the queues via basic_consume in an async manner.
  2. The messages would get pushed onto a SysV message queue that the PHP workers would listen to. Inside each message we would also include a result queue name unique to that message.
  3. The PHP processes would concurrently fetch their next message from the SysV message queue, and each result would be pushed to that message’s unique result queue.
  4. Whenever the main process got a response on one of the result queues, it would take control of the main “thread” (a green thread, basically) and respond to RabbitMQ.

This way, the RabbitMQ operations would be run on the same thread in the same process. Quite important since most PHP libraries expect this and have undefined behavior otherwise.

All this, using Swoole.

Works amazingly well and entirely from PHP. Just a pecl install swoole command away. Not bad.

But is it good?

I would say so, yes. After some experimentation, we wound up configuring each worker to have three PHP worker processes; we gave them limits of 256 MB RAM and 0.1 CPU. Rarely do we get an OOM.
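
In Deployment terms, that translates into a resources block roughly like this; the limits are the values mentioned above, while the requests are an assumption for illustration:

```yaml
# Per-container resources for a worker pod (requests are illustrative; limits as described above)
resources:
  requests:
    memory: 128Mi
    cpu: 50m
  limits:
    memory: 256Mi     # the 256 MB RAM limit
    cpu: 100m         # 0.1 CPU
```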

We are also able to sustain all of the previous Lambda message processing (about 4 million messages daily) with only about five nodes (2 vCPU and 4 GB RAM each). This is quite impressive.

It did take a bit of tinkering to get to these numbers, but now we have an elastic system that can handle our current and future workloads.

Another advantage is that our deploy script is now basically just kubectl apply. No more deploying the Lambdas separately.

Final results

So to recap:

  1. We moved the database from AWS RDS Aurora to DO Managed DB.
  2. We re‑envisioned our worker architecture to work inside a Kubernetes cluster.

Before, we had a cost of $10k USD monthly on AWS.

Now, we have a cost of $2k USD monthly on DigitalOcean while handling the same workloads.

Out of the $2k we spend:

  • $1k is the HA DigitalOcean MySQL cluster.
  • $1k is the Kubernetes + compute (about 22 nodes in total) + volumes.

Talk about a reduction! We are very happy with this. Of course, DO has some quirks, but it is simple, effective, and it works for us.