July 13, 2019

SupercellNX #0

For the past few years, I’ve been working on an intermittent research project. My hypothesis is this: it’s possible to create a CPU description from which you can generate disassemblers, decompilers, interpreters, recompilers, and more. A single CPU description could be used for any number of independent projects, without all the bullshit that typically comes with working with machine code; you just get to write the part that makes your project different.

I started off with extremely high-level, generic code (something usable by many different architectures), and eventually decided to specialize things. The first fruit of this was a fork of the Beetle/Mednafen PSX core from RetroArch, which autogenerated an interpreter and recompiler (using libjit) from a single-file description of the MIPS core. This used LLVM TableGen with a custom language embedded within it, and a giant (very hideous) Python script did all the generation from there. I later reused this in my SharpStation project, which is a from-scratch PlayStation emulator for .NET Core; instead of using libjit to recompile the code, I recompiled straight to CIL and let the .NET JIT handle all the details. Unfortunately, PlayStation’s retro nature makes a highly-compatible recompiler extremely difficult, so this project hit a bit of a wall after getting through the BIOS and into Crash Bandicoot (which was awesome!).

SupercellNX is an emulator for the Nintendo Switch that uses my latest approach: the entire AArch64 architecture is defined in my own LISP-1 language. Each instruction definition is split into several separate pieces: a bitmap that defines static bit values and variable fields, a disassembly template, names for each of the instruction decoding fields, any decoding logic, and then the actual steps to execute the instruction. Here you can see the definition for ADRP, which generates an address relative to the current PC:
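
To give a flavor of what those five pieces amount to, here is a minimal Python sketch for ADRP. It is not the LISP-style definition itself and not SupercellNX code: the bitmap syntax, the single-letter field names (`l`, `h`, `d`), and every function name are assumptions made up for illustration. Only the ADRP encoding and semantics come from the architecture: the page-aligned PC plus a sign-extended 21-bit immediate shifted left by 12.

```python
MASK64 = (1 << 64) - 1

def parse_bitmap(pattern):
    """Turn a 32-character pattern into (mask, value, fields).

    '0'/'1' are static bits; any other letter names a variable field.
    Assumes each field's bits are contiguous. Hypothetical syntax.
    """
    bits = pattern.replace(" ", "")
    assert len(bits) == 32
    mask = value = 0
    fields = {}  # letter -> (shift, width)
    for i, ch in enumerate(bits):
        pos = 31 - i  # walk from the most significant bit down
        if ch in "01":
            mask |= 1 << pos
            value |= int(ch) << pos
        else:
            _, width = fields.get(ch, (pos, 0))
            fields[ch] = (pos, width + 1)  # last position seen is the lowest bit
    return mask, value, fields

# ADRP: op=1, immlo (2 bits), 10000, immhi (19 bits), Rd (5 bits)
ADRP_MASK, ADRP_VALUE, ADRP_FIELDS = parse_bitmap(
    "1 ll 10000 hhhhhhhhhhhhhhhhhhh ddddd")

def sign_extend(value, bits):
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def decode_adrp(insn):
    """Decoding logic: match the static bits, then pull out the fields."""
    if (insn & ADRP_MASK) != ADRP_VALUE:
        return None
    f = {name: (insn >> shift) & ((1 << width) - 1)
         for name, (shift, width) in ADRP_FIELDS.items()}
    imm = sign_extend((f["h"] << 2) | f["l"], 21) << 12
    return {"rd": f["d"], "imm": imm}

def adrp_target(pc, imm):
    return ((pc & ~0xFFF) + imm) & MASK64

def disassemble_adrp(insn, pc):
    """Disassembly template, filled in from the decoded fields."""
    d = decode_adrp(insn)
    return f"adrp x{d['rd']}, #0x{adrp_target(pc, d['imm']):x}"

def execute_adrp(insn, pc, regs):
    """Execution step for an interpreter (register 31 handling omitted)."""
    d = decode_adrp(insn)
    regs[d["rd"]] = adrp_target(pc, d["imm"])
```

The point of the sketch is that the disassembler and the interpreter both hang off the same decoded fields, so a recompiler backend would be one more function over the same description rather than a second hand-written decoder.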

This is currently being converted into four different things within SupercellNX: a disassembler, an interpreter, a recompiler to CIL, and a recompiler to LLVM IR. Because Switch software can't generate or modify code at runtime, recompiled code never goes stale, so we can cache it heavily and optimize hot spots in the background. Adding the (still very buggy) LLVM backend took about 16 hours total, which was only possible due to the flexibility of this system.
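
As a rough illustration of why that matters, here is a minimal Python sketch of a block cache with hot-spot promotion. It is not how SupercellNX does it: `BlockCache`, `translate`, `optimize`, and `hot_threshold` are hypothetical names, and the optimization runs inline here rather than on a background thread. The thing to notice is what is missing: there is no invalidation path, because guest code never changes once it is loaded.

```python
class BlockCache:
    """Cache translated blocks by guest address and re-optimize hot ones.

    `translate(addr)` produces a quick first-pass block and
    `optimize(addr, block)` produces a better one; both are supplied
    by the caller. All names here are illustrative assumptions.
    """

    def __init__(self, translate, optimize, hot_threshold=3):
        self.translate = translate
        self.optimize = optimize
        self.hot_threshold = hot_threshold
        self.blocks = {}        # guest address -> translated block
        self.hits = {}          # guest address -> lookup count
        self.optimized = set()  # addresses already promoted

    def get(self, addr):
        block = self.blocks.get(addr)
        if block is None:
            # First visit: translate once and keep the result forever.
            block = self.blocks[addr] = self.translate(addr)
        self.hits[addr] = self.hits.get(addr, 0) + 1
        if (self.hits[addr] >= self.hot_threshold
                and addr not in self.optimized):
            # Hot spot: swap in the optimized version exactly once.
            block = self.blocks[addr] = self.optimize(addr, block)
            self.optimized.add(addr)
        return block
```

In a real emulator the promotion step would be queued to a worker thread and the cache entry swapped when the worker finishes, so the guest keeps running on the unoptimized block in the meantime.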

I feel really good about what I’ve accomplished with this so far and I know that this is only the beginning. Currently a pretty substantial part of Super Mario Odyssey runs – it gets through all the standard startup code, into main, and even gets through GPU and input setup now. Hopefully in future updates I’ll be able to show it actually getting in-game! For now, though, the next big steps are:

1) Get the LLVM backend up to parity with the CIL recompiler
2) Hotspot optimize code with LLVM
3) Cache recompiled code to disk
4) Properly implement synchronization primitives in the HLE Switch OS

Happy hacking,

- Cody Brocious (Daeken)
