Hacker101 CTF Architecture

Introduction #

The Hacker101 CTF has been up for about two and a half months now, with thousands of users finding tens of thousands of flags. It has been working beautifully – after a few rough days at the beginning – with hundreds of simultaneous instances running in parallel. Getting there, though, was an adventure.

Structure #

The CTF is built from five notable pieces: the messaging protocol, the web frontend, the runners, the manager, and the levels themselves.

One overall note: every part of this system is built in Python, with the exception of some levels. Much <3 for Python.

Messaging protocol #

I built a custom messaging protocol for this, as I had really specific goals in mind for it. In all likelihood, there is something off-the-shelf that would do what I wanted, but this was simple enough that I figured I would just build it out.

The communication is one-to-one between a client and server, with each message being a pair of JSON-serialized objects: a request and a response. This is strictly in-order, so there’s no need for sequencing or anything of that nature. Connections can be long-running (runner->manager) or short-lived (web->manager). This is probably the simplest part of the whole system, but it is really powerful.
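To make that concrete, here is a minimal sketch of what one exchange could look like. The newline-delimited framing and the message fields are assumptions for illustration, not the actual wire format.

```python
# Minimal sketch of the request/response exchange described above.
# The framing (newline-delimited JSON over TCP) and the message shape
# are assumptions; the real protocol may differ.
import json
import socket


def send_request(sock: socket.socket, request: dict) -> dict:
    """Send one JSON request and block until its JSON response arrives.

    Because the protocol is strictly in-order, no sequence numbers are
    needed: the next message received is the response to this request.
    """
    sock.sendall((json.dumps(request) + "\n").encode())
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return json.loads(buf)


# Example usage (hypothetical manager address and message shape):
# with socket.create_connection(("manager.internal", 9000)) as sock:
#     resp = send_request(sock, {"op": "start", "user": "alice", "level": "postbook"})
#     print(resp["url"])
```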

Web frontend #

The web frontend for the CTF is built using a custom framework on top of Flask, to allow for super rapid development and security as a default. It’s a framework I’ve been using since 2012, just carrying it from project to project and adding features. The earliest rev is public and available on GitHub; eventually I’ll actually release it as a standalone framework, though my personal projects are all moving over to Serac, my new-ish web server and app framework for .NET Core.

The frontend speaks to the manager when a user attempts to start or terminate a level instance. When you click the “Go” button on a given level, the frontend sends a message to the manager requesting an instance, the manager sends back a URL, then the frontend redirects the browser to that new instance. Termination works by the same underlying premise, minus the redirect at the end.
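Stripped of the custom framework, that flow boils down to something like the plain-Flask sketch below. The route names, manager address, and message fields are hypothetical; the real frontend is built on its own framework layered over Flask.

```python
# Illustrative plain-Flask version of the "Go" flow described above.
import json
import socket

from flask import Flask, redirect, session

app = Flask(__name__)
MANAGER_ADDR = ("manager.internal", 9000)  # assumed manager host/port


def ask_manager(message: dict) -> dict:
    """Short-lived connection to the manager: one JSON request, one reply."""
    with socket.create_connection(MANAGER_ADDR) as sock:
        sock.sendall((json.dumps(message) + "\n").encode())
        return json.loads(sock.makefile().readline())


@app.route("/level/<name>/go", methods=["POST"])
def start_level(name):
    # Ask the manager for an instance; it either returns an existing URL or
    # has a runner spin one up, then the browser is redirected to it.
    resp = ask_manager({"op": "start", "user": session["username"], "level": name})
    return redirect(resp["url"])


@app.route("/level/<name>/terminate", methods=["POST"])
def terminate_level(name):
    # Same premise as starting, minus the redirect to the instance.
    ask_manager({"op": "terminate", "user": session["username"], "level": name})
    return redirect("/")
```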

Outside of the communication with the manager, the web frontend basically works like any other app.

Runners #

Runners create and maintain the actual Docker containers where our levels run. Each runner box has an nginx instance, a runner daemon, and some number of levels running at any given time.

To start a level instance, the runner receives a message from the manager that contains: the name of the Docker image for the level, the username requesting it, and the list of flag values. It spins up a container with the relevant Docker image, reconfigures nginx to forward from a given URL (e.g. http://35.196.135.216:5001/deadbeef01/) to the webserver running inside the container, and then sends that URL back to the manager.
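A rough sketch of that startup path, using the Docker SDK for Python, is below. The port allocation, nginx config layout, and flag hand-off via an environment variable are assumptions for illustration, not the runner's actual implementation.

```python
# Sketch of the runner's instance-startup path: run the level's container,
# wire an nginx location to it, and return the public URL.
import secrets
import subprocess

import docker  # Docker SDK for Python (pip install docker)

NGINX_CONF_DIR = "/etc/nginx/ctf.d"          # assumed nginx include directory
PUBLIC_BASE = "http://35.196.135.216:5001"   # public address from the example above

client = docker.from_env()


def start_instance(image: str, username: str, flags: list[str]) -> str:
    token = secrets.token_hex(5)  # e.g. "deadbeef01"
    container = client.containers.run(
        image,
        detach=True,
        environment={"FLAGS": ",".join(flags)},  # assumed flag hand-off
        ports={"80/tcp": None},                  # let Docker pick a free host port
        labels={"ctf-user": username},
    )
    container.reload()
    host_port = container.attrs["NetworkSettings"]["Ports"]["80/tcp"][0]["HostPort"]

    # Forward /<token>/ on the public port to the webserver inside the container.
    with open(f"{NGINX_CONF_DIR}/{token}.conf", "w") as f:
        f.write(
            f"location /{token}/ {{\n"
            f"    proxy_pass http://127.0.0.1:{host_port}/;\n"
            f"}}\n"
        )
    subprocess.run(["nginx", "-s", "reload"], check=True)

    return f"{PUBLIC_BASE}/{token}/"
```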

The runner also receives termination requests from the manager, and watches the nginx logs to see if a given container has been idle for a certain period of time; once it crosses that threshold, the instance gets killed and the manager is notified.
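The idle-reaping side of that might look roughly like this; the log format, timeout value, and helper names are assumptions for illustration.

```python
# Sketch of the idle reaper described above: if a container's URL prefix has
# not appeared in the nginx access log for IDLE_TIMEOUT seconds, kill the
# instance and notify the manager.
import re
import time

IDLE_TIMEOUT = 30 * 60  # assumed: 30 minutes of inactivity
last_seen: dict[str, float] = {}  # instance token -> timestamp of last request


def note_request(log_line: str) -> None:
    """Update the last-seen time for the instance a log line refers to."""
    m = re.search(r'"[A-Z]+ /([0-9a-f]{10})/', log_line)
    if m:
        last_seen[m.group(1)] = time.time()


def reap_idle_instances(kill_instance, notify_manager) -> None:
    """Kill any instance idle past the threshold and report it to the manager."""
    now = time.time()
    for token, ts in list(last_seen.items()):
        if now - ts > IDLE_TIMEOUT:
            kill_instance(token)  # stop and remove the Docker container
            notify_manager({"op": "expired", "instance": token})
            del last_seen[token]
```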

Manager #

This is the most critical component of the infrastructure, by far. All the runners keep open connections to the manager at all times, and the web frontend connects any time the user tries to start or terminate an instance.

The manager keeps track of which users are running which levels and where. That way, a user can click the ‘Go’ button as many times as they want: if the instance already exists, the manager simply sends back the URL it already has on record. If it doesn’t, the manager looks for the runner with the fewest active instances and sends a message to kick off the instance startup.
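In essence, that bookkeeping reduces to a couple of dictionaries and a least-loaded pick, along these lines. The data structures and the start_on_runner hook are illustrative assumptions, not the manager's real internals.

```python
# Sketch of the manager's start-instance bookkeeping described above.
active: dict[tuple[str, str], str] = {}  # (user, level) -> instance URL
runner_load: dict[str, int] = {}         # runner id -> count of active instances


def handle_start(user: str, level: str, start_on_runner) -> str:
    """Return the URL for a user's level, starting an instance only if needed."""
    key = (user, level)
    if key in active:
        # User clicked "Go" again: just hand back the URL we already know.
        return active[key]

    # Otherwise, pick the least-loaded runner and ask it to spin one up.
    runner = min(runner_load, key=runner_load.get)
    url = start_on_runner(runner, user, level)  # blocks until the runner replies
    runner_load[runner] += 1
    active[key] = url
    return url
```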

Summary #

The Hacker101 CTF isn’t the most complex software in the world – the entire thing is less than 3000 lines of Python, and it’s not the densest code out there – but it had to be engineered from the ground up to be solid, scalable, and secure. I don’t generally consider myself a programmer anymore, but I’m proud of this codebase.

Happy hacking,

 