How GameCube/Wii emulator Dolphin got a turbocharge

Something remarkable is happening with Dolphin. The GameCube and Wii emulator has been around for more than a decade now, which is a long time for an emulator to be in active development. It was born as a rough, limited GameCube emulator before growing into a bustling open source project in 2008. Over the past few years, Dolphin has become one of the easiest-to-use emulators ever made, and it’s also one of the only emulators to make many games better. You haven’t played Super Mario Galaxy until you’ve played it at 1440p.

Today, Dolphin is doing something I’ve never seen another emulator project do. This many years into development, most emulator projects have been abandoned, but Dolphin keeps making major, sometimes huge improvements to compatibility and performance. In August, Dolphin’s CPU core emulation saw a 26% boost in performance. In September, another update brought a 16% performance boost to all games, with some specific games seeing boosts of over 100%.

Those are almost unbelievable improvements, and most of them are due to one contributor. She goes by the handle Fiora Aeterna online, and she’s only been contributing to Dolphin for two months.

“Getting involved in Dolphin was a bit nerve-wracking at first; I'd never really contributed to an open source project before,” Fiora wrote in an email when I reached out to talk about the emulator’s recent improvements. “It was an internal conflict for me for many years; on the one hand, there was so much cool open source stuff I wanted to work on, but on the other hand it could be really intimidating (with the 50:1 gender ratio certainly not helping). The inspiration to try out Dolphin actually came from the realization they already had a female team member (Rachel Bryk)—I figured if she found it okay, maybe I should try too? My hope ended up being justified: Dolphin's team was really unusually helpful and friendly, and never seemed like the sort to mock me for having seemingly dumb questions.”

A note on emulation and legality

If you don’t closely follow the emulation scene, you may wonder why Nintendo hasn’t shut down the Dolphin project. The code of the emulator itself is completely legal. It’s written by programmers like Fiora, and none of that code belongs to Nintendo. For its most accurate audio emulation, Dolphin does require a DSP (digital signal processor) dumped from a Wii; downloading that is illegal, but dumping it from your own modded Wii is perfectly legal.
Ripping your own Wii/GameCube discs is legal, but downloading them is definitely not. That's piracy. Don't do it.

Fiora is a programmer by day, and contributes to Dolphin on-and-off in her spare time. Fittingly, she owes her career to another emulator called NO$GBA she discovered as a 10-year-old. She wanted to play Pokemon, but her parents wouldn’t buy her a Game Boy. So she found a way to play it on her computer. And, in the process, grew curious about how emulators work. She started to learn more about programming. Fourteen years later, it’s her day job.

On Dolphin, Fiora has primarily contributed to the emulator’s CPU core. The GameCube and Wii both run on IBM’s PowerPC architecture, which uses a different instruction set than the x86 processors that virtually all PCs run on. Emulating those consoles means converting PowerPC instructions into x86 instructions. This is why emulation can be so demanding, as Dolphin’s FAQ concisely explains: “when emulating, every basic instruction a game runs needs to be translated to something a PC can execute. Depending on the instruction, this can take from 2x to 100x clock cycles, which explains why you need more than a 486MHz CPU to emulate a GameCube.”

Fiora broke down the process of CPU emulation in more detail.

“The most basic sort of CPU emulator is an interpreter; it one by one steps through the instructions, parses each one, and calls the appropriate function for that instruction,” she wrote. “Interpreters are also very commonly used for scripting languages (like Python, Ruby, Lua, etc).

“A just-in-time (JIT) compiler takes blocks of code and transforms them into x86 code (recompiling), then executes that. This is way faster—by orders of magnitude! This shows up in web browsers, for example: at first they use an interpreter to run Javascript, then they recompile the most often-used parts with a basic recompiler, and sometimes if a section of code is used a whole lot, they recompile it with a slower, optimizing compiler that generates more efficient code.

"Dolphin isn't quite that sophisticated—it has a single recompiler that runs on all blocks of code. But the recompiler can only transform instructions that it knows how to recompile; otherwise, it has to stop and drop back to the interpreter for that one instruction, and this is really slow. It's totally expected the recompiler has to drop back to the interpreter for some instructions, but ideally those instructions should be very few and far between. This happens in other recompilers too; for example, browser makers advise people not to use certain Javascript constructs because they force a fallback to interpreter mode, and slow things down massively.”

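To make the difference concrete, here is a minimal sketch of both approaches in Python. It is a toy, not Dolphin's code: the three-instruction guest "CPU", the opcode names, and the `TEMPLATES` table are all invented for illustration, and the "recompiler" emits Python source rather than x86 machine code. The shape matches what Fiora describes, though: the interpreter decodes every instruction each time it runs, while the recompiler translates a block once and only calls back into the interpreter's handler for opcodes it doesn't know.

```python
# Toy guest instruction set; opcode names and operands are hypothetical, not PowerPC.
def op_addi(regs, dst, src, imm):
    regs[dst] = regs[src] + imm

def op_mul(regs, dst, a, b):
    regs[dst] = regs[a] * regs[b]

def op_neg(regs, dst, src):
    regs[dst] = -regs[src]

HANDLERS = {"addi": op_addi, "mul": op_mul, "neg": op_neg}

def interpret(program, regs):
    """Interpreter: look up and dispatch every instruction, every time it runs."""
    for opcode, *args in program:
        HANDLERS[opcode](regs, *args)
    return regs

# The toy recompiler only knows how to translate these two opcodes.
TEMPLATES = {
    "addi": "regs[{0}] = regs[{1}] + {2}",
    "mul": "regs[{0}] = regs[{1}] * regs[{2}]",
}

def recompile(program):
    """Translate a whole block into one host function, once.

    Returns the compiled function and the number of instructions that
    had to fall back to the interpreter's handler.
    """
    lines = ["def block(regs, HANDLERS):"]
    fallbacks = 0
    for opcode, *args in program:
        if opcode in TEMPLATES:
            lines.append("    " + TEMPLATES[opcode].format(*args))
        else:
            # Unknown to the recompiler: emit a call back into the interpreter.
            lines.append(f"    HANDLERS[{opcode!r}](regs, *{tuple(args)!r})")
            fallbacks += 1
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["block"], fallbacks

program = [("addi", 1, 0, 5), ("mul", 2, 1, 1), ("neg", 3, 2)]
block, fallbacks = recompile(program)
interpreted = interpret(program, [0, 0, 0, 0])
recompiled = block([0, 0, 0, 0], HANDLERS)
```

Both paths leave the registers in the same state, and `fallbacks` counts the one instruction (`neg`) the toy recompiler could not translate. Adding a template for `neg` would bring that count to zero, which is the same kind of work as implementing a missing instruction in a real recompiler.
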
Following all that so far? Dolphin’s August update states that Fiora’s work on the JIT compiler sped it up by 26%. In one month! She elaborated that these were general recompiler improvements, which means “better ways of optimizing blocks of code (moving instructions around, combining instructions, and so on) and better ways of implementing individual PowerPC instructions with fewer x86 instructions than before.”
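
"Combining instructions" can be pictured with a small peephole pass. The sketch below is an illustration only, written against an invented two-opcode instruction format (`addi` and `mul` as 4-tuples); it is not taken from Dolphin, and it ignores details a real recompiler must handle, such as 32-bit wraparound and limits on immediate size.

```python
def combine_addi(program):
    """Fold 'addi r, r, imm' into a directly preceding addi that wrote r."""
    out = []
    for ins in program:
        op, dst, src, imm = ins
        prev = out[-1] if out else None
        if (op == "addi" and src == dst and prev is not None
                and prev[0] == "addi" and prev[1] == dst):
            # r = prev_src + a; r = r + b   becomes   r = prev_src + (a + b)
            out[-1] = ("addi", dst, prev[2], prev[3] + imm)
        else:
            out.append(ins)
    return out

def run(program, regs):
    """Tiny reference executor used to check the pass preserves behavior."""
    for op, dst, a, b in program:
        regs[dst] = regs[a] + b if op == "addi" else regs[a] * regs[b]
    return regs

program = [
    ("addi", 1, 0, 5),
    ("addi", 1, 1, 3),
    ("mul", 2, 1, 1),
    ("addi", 2, 2, 1),
    ("addi", 2, 2, 1),
]
optimized = combine_addi(program)
before = run(program, [0, 0, 0])
after = run(optimized, [0, 0, 0])
```

The optimized block is three instructions instead of five and leaves the registers in the same state. Fewer instructions emitted per block means less work every time that block runs.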

General recompiler improvements make the emulator's baseline performance better, but her other big job, implementing missing instructions, helps make individual games run more efficiently. Remember that if the recompiler doesn’t recognize an instruction, it has to fall back on the much slower CPU interpreter. Because games consist of millions of CPU instructions, there are many, many opportunities for slowdown. “I've implemented a lot more [missing instructions] in the past two months, putting the x86 recompiler up around 90% of instructions implemented,” Fiora wrote. “The rest are mostly system instructions that are rarely used.”

Fiora’s work on Dolphin has also helped correct some longstanding issues with specific game performance. Sega’s Super Monkey Ball and F-Zero GX, for example, both used an unusual bit of code that almost no other GameCube games use, and as a result that code wasn’t built into the recompiler. She corrected that. She also built on top of the work of another contributor, magumagu, to bring his fixes for physics and collision emulation to the hardware JIT compiler and interpreter.

“magumagu discovered that floating-point multiply operations had slightly odd rounding behavior in certain cases,” Fiora wrote. “Fixing the recompiler to match this behavior cost a small amount of speed, but the effect was tremendous: it fixed ghosts in Mario Kart, replays in Brawl and F-Zero, physics in Zelda, and a whole lot of other things.” As a result, it’s now possible to save replays of games like Mario Kart Wii on a console, copy the file to a PC, and play it back 100% accurately. In HD.
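
Why would a slightly different rounding rule break a replay? A replay is typically just recorded inputs fed back into the same simulation, so every multiply has to produce the same bits as the original run. The sketch below does not reproduce the specific hardware behavior magumagu found. It only assumes a toy update loop and compares rounding each product to 32-bit precision against leaving it at 64-bit precision, to show how a difference in the last few bits of one multiply changes the result.

```python
import struct

def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def replay(steps, round_each_multiply):
    """Toy update loop; chaotic on purpose, so small differences grow quickly."""
    x = 0.5
    for _ in range(steps):
        product = 3.9 * x
        if round_each_multiply:
            product = to_f32(product)  # one rounding rule
        x = product * (1.0 - x)        # the other path keeps full precision
    return x

one_step_rounded = replay(1, round_each_multiply=True)
one_step_exact = replay(1, round_each_multiply=False)
console_like = replay(200, round_each_multiply=True)
emulator_like = replay(200, round_each_multiply=False)
drift = abs(console_like - emulator_like)
```

After a single step the two runs already disagree in their low bits, and `drift` shows how far apart they end up after 200 steps. In a game, that gap is the difference between a ghost that follows the track and one that drives into a wall.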

This video explains those bugs and shows how the games perform properly when they're fixed.

Another contributor, comex, added two separate optimizations to Dolphin’s code that increased performance in almost all games by 8%. Each. Between his work and Fiora’s, a number of games have seen huge performance jumps in Dolphin in just the past month:

  • Sonic Colors: 39% faster
  • Star Wars Rogue Squadron II: Rogue Leader: 103% faster
  • F-Zero GX: 110% faster
  • The Last Story: 38% faster
  • Xenoblade Chronicles: 40% faster

Those are incredible jumps, and in the case of an extremely demanding game like The Last Story, could make the difference between a steady framerate and a hitchfest. The performance gains are relative, of course; Rogue Leader, for example, is still problematic because of an issue with Memory Mapping Unit (MMU) code. Its 103% boost is largely due to a new way of dealing with MMU code that Fiora helped program.
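
To put those percentages in perspective, here is a quick back-of-the-envelope calculation. The baseline frame rates are made up for illustration, since the update doesn't give absolute numbers. Treating the two 8% gains as compounding is also an assumption, though it lands close to the 16% figure quoted for September.

```python
def speedup_factor(percent_faster):
    """Convert 'N% faster' into a multiplier on emulation speed."""
    return 1.0 + percent_faster / 100.0

# Two separate 8% gains, assumed here to compound multiplicatively.
combined = speedup_factor(8) * speedup_factor(8)
combined_percent = (combined - 1.0) * 100.0

def fps_after(fps_before, percent_faster):
    """Frame rate after a speedup, assuming the game is limited by CPU emulation."""
    return fps_before * speedup_factor(percent_faster)

# Hypothetical baselines, chosen only to illustrate the scale of the change.
rogue_leader = fps_after(20.0, 103)
last_story = fps_after(22.0, 38)
```

Under those assumptions, "103% faster" means a little more than double the frame rate, and a 38% gain is enough to lift a game from the low 20s to around 30fps.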

Performance improvements from comex and Fiora's contributions. Image credit: September update.

“Most games on the Wii and Gamecube use the default memory management software, which is easily emulated, but a few do their own custom stuff, which requires implementing (potentially) the full features of this aspect of the hardware,” Fiora explains. “This is painfully slow; up until recently, most MMU games had trouble running at 20fps on a fast CPU!

“Rogue Leader and Rebel Strike in particular have long been emblematic of Dolphin's failures: they're by Factor 5, a developer legendary for taking hardware to its limits and beyond. In the N64 days they rewrote the firmware on the GPU to push 5 times more polygons than it was supposed to; in the Gamecube era they made Star Wars games that used basically every single feature in the book that Dolphin finds difficult to emulate. Making them playable won't just involve lots of continuing CPU optimizations, but also implementing graphics features that no modern card supports a direct equivalent of, like ZFreeze (if it's not implemented, the skybox covers nearly everything in the game and you can't see more than a few feet in front of you!).”
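
The gap between "default" and "custom" memory management that Fiora describes can be sketched as two translation paths. Everything here is a simplified assumption for illustration: the tiny `RAM`, the page size, and the dictionary-based page table are stand-ins, not how Dolphin or the real hardware lays things out.

```python
PAGE_SIZE = 4096
RAM = bytearray(64 * 1024)  # toy "physical" memory, far smaller than a console's

class PageFault(Exception):
    """Raised when the toy page table has no entry for an address."""

def translate_fast(vaddr):
    """Default layout: a fixed mapping, so translation is a single mask."""
    return vaddr & (len(RAM) - 1)

def translate_full(vaddr, page_table):
    """Custom layout: consult a game-controlled page table on every access."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page not in page_table:
        # On real hardware the game's own exception handler would run here.
        raise PageFault(hex(vaddr))
    return page_table[page] * PAGE_SIZE + offset

def read_byte(vaddr, page_table=None):
    if page_table is None:
        paddr = translate_fast(vaddr)
    else:
        paddr = translate_full(vaddr, page_table)
    return RAM[paddr]

RAM[0x1234] = 0xAB
fast_value = read_byte(0x80001234)

# Hypothetical mapping: virtual page 0x7E000 -> physical page 1.
page_table = {0x7E000: 1}
slow_value = read_byte(0x7E000234, page_table)
```

Both reads return the same byte, but the second one pays for a lookup and a possible fault on every single memory access. Multiply that by the millions of loads and stores a game issues each second and the slowdown on MMU titles starts to make sense.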

Dolphin still has a long way to go before it flawlessly supports the entire GameCube and Wii libraries—I was disappointed to hear that Metroid Prime’s stuttering issues haven’t been solved with Fiora’s improvements. (Good news, though: other contributors are hacking away at that problem.) But at this point, more performance boosts are hitting Dolphin in a month than most emulators see in a year or two of development.

If you've always seen amazing screenshots from Dolphin and wanted to try it out, now's the time to do it.

Wes Fenlon
Senior Editor
