What 'optimization' really means in games

Aspects of optimization

Now that we have an idea of some of the most expensive effects in modern games, we can dig into what it takes to create a game that will, hopefully, end up being considered optimized.

Determining presets and system requirements

Graphics presets—the common ‘low’, ‘medium’, ‘high’, ‘very high’, ‘ultra’—are almost never directly comparable across games, but they are very important: they guide gamers who do not want to dive deep and adjust individual settings.

At Croteam, a basic 'medium' target is set early in a game’s development, based on hardware constraints and expectations, and all design and artwork adheres to that standard. Closer to release, the technical team then derives settings for each preset, trying to balance graphical splendor and performance at each level.

What I personally love about their approach is that they categorize each performance option as either CPU-bound, GPU-bound, or memory capacity-bound, rather than leaving it to the user to figure that out. While they try to design presets to accommodate a balanced PC, this approach makes it easier for gamers with a particularly strong or weak GPU or CPU to make effective use of their hardware.

Categorized performance settings from The Talos Principle.
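To make the idea of categorized settings concrete, here is a minimal sketch of how options could be tagged by the resource that limits them. The setting names and their categories are hypothetical and chosen for illustration only; they are not taken from The Talos Principle or any other Croteam game.

```python
from dataclasses import dataclass
from enum import Enum


class Bottleneck(Enum):
    CPU = "CPU"
    GPU = "GPU"
    MEMORY = "memory"


@dataclass(frozen=True)
class Setting:
    name: str
    bound_by: Bottleneck


# Hypothetical settings and categories, for illustration only.
SETTINGS = [
    Setting("crowd_density", Bottleneck.CPU),
    Setting("physics_detail", Bottleneck.CPU),
    Setting("shadow_quality", Bottleneck.GPU),
    Setting("ambient_occlusion", Bottleneck.GPU),
    Setting("texture_resolution", Bottleneck.MEMORY),
]


def settings_to_lower(weak_component: Bottleneck) -> list[str]:
    """List the settings limited by the given component, in declaration order."""
    return [s.name for s in SETTINGS if s.bound_by is weak_component]


# A player with a weak GPU would look at these options first.
print(settings_to_lower(Bottleneck.GPU))
```

With labels like these, a player who knows their GPU is the weak link can go straight to the GPU-bound options instead of experimenting with every slider.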

For QLOC, who mostly deal with porting existing console games to PC, the standard console settings usually translate to the 'medium' preset of the PC version, though some aspects might be tweaked to accommodate essential platform differences. Scalability options are then provided to whatever extent is feasible. Both presets and requirements are continuously evaluated throughout the optimization process, starting as soon as the renderer and other core features are tested and working.

How presets and hardware requirements relate is often a bit of an unknown to gamers. While every developer has their own standards, for Croteam 'minimum' specifications mean that the game will run well at low settings, and if those requirements aren’t met, full technical support will not be provided. 'Recommended', on the other hand, means that the game can be played as intended ("Not medium: high", as Dean told me) at 1080p resolution.

How we judge optimized games

Armed with more knowledge of what goes into optimization and some of the most expensive effects, we can now try to reconsider some hotly debated examples of 'unoptimized' games.

One relatively recent subject of this debate was Dying Light, and in my opinion it is one of the most damning cases—not due to the developer, but due to the misguided reception it received. Dying Light is an open world game with lots of moving actors and parts as well as a dynamic day/night cycle, all ingredients that make up a technically demanding game. It shipped with a huge range of settings for draw distance in particular, which in this type of experience greatly affects both CPU and GPU load.

Dying Light's impressive draw distance. Screenshot by James Snook.
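A rough back-of-the-envelope sketch shows why draw distance weighs so heavily in an open world. It rests on a simplifying assumption that is mine, not the developer's: objects are spread evenly over flat ground, so the number in view grows with the visible area, that is, with the square of the distance. Real engines use level-of-detail systems and culling that soften this, so treat it as an upper-bound intuition rather than a measurement.

```python
def relative_object_count(distance_fraction: float) -> float:
    """Objects in view relative to the count at maximum draw distance.

    Assumes an even spread of objects on flat ground, so the count
    scales with visible area (distance squared). Ignores LOD and culling.
    """
    if distance_fraction < 0:
        raise ValueError("distance_fraction must be non-negative")
    return distance_fraction ** 2


for fraction in (0.25, 0.5, 1.0):
    share = relative_object_count(fraction)
    print(f"{fraction:.0%} of max distance -> {share:.0%} of the objects")
```

Under this assumption, halving the draw distance leaves only about a quarter of the objects to simulate and render, which is why a wide slider range can move performance so dramatically.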

There was an outcry about the 'terrible unoptimized PC port' when Dying Light would not perform up to (arbitrary) standards at maximum settings. As it turned out, the draw distance slider in the initial version of the game was already above console settings at its lowest position, and went incomparably higher. People were so agitated, in fact, that the developer felt like they had to reduce the range of the slider to 55% of its former maximum in an early patch.

Would the game have been perceived as much more 'optimized' if this trivial step had been taken before release? I definitely think so. Would it actually have been ‘better optimized’? No, absolutely not. Dying Light is a great example of just how difficult it can be to judge optimization, and also of the concerns developers might be limited by when implementing game options.

A similar issue occurred very recently with Deus Ex: Mankind Divided, which prominently featured an MSAA setting (up to 8x). As the game uses deferred shading (a common rendering technique that makes a straightforward hardware-accelerated implementation of MSAA much less viable), this setting came with the extreme performance impact one would expect in such a renderer. Again, the game might have been seen as far more 'optimized' had it not included that option, and again it would obviously not have been any better optimized without it.
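One reason MSAA is so costly in a deferred renderer is that the intermediate buffers holding per-pixel surface data (the G-buffer) have to be stored once per sample. The sketch below estimates that memory cost. The buffer layout is an assumption made for illustration (four 32-bit render targets plus a 32-bit depth buffer); it is not the actual layout used by Deus Ex: Mankind Divided, and it ignores hardware compression and any other buffers.

```python
def gbuffer_bytes(width: int, height: int, bytes_per_sample: int,
                  msaa_samples: int = 1) -> int:
    """Memory for a G-buffer that stores every target once per MSAA sample."""
    return width * height * bytes_per_sample * msaa_samples


# Assumed layout (hypothetical): four 32-bit render targets + 32-bit depth.
BYTES_PER_SAMPLE = 4 * 4 + 4

for samples in (1, 2, 4, 8):
    mib = gbuffer_bytes(1920, 1080, BYTES_PER_SAMPLE, samples) / 2**20
    print(f"{samples}x MSAA at 1080p: about {mib:.0f} MiB")
```

Under these assumptions the G-buffer grows from roughly 40 MiB without MSAA to over 300 MiB at 8x, and the bandwidth needed to write and read it grows along with it, before any per-sample lighting work is counted.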

Games are rarely a single monolithic entity that is either optimized or unoptimized. There might be individual effects that are not very well optimized while the majority of the engine is. This is particularly common with novel features: when the original Crysis first included ambient occlusion, the effect was not particularly optimized compared to modern implementations. Similarly, the first implementations of voxel-based AO in Rise of the Tomb Raider, a well-optimized and beautiful game overall, might well be outperformed significantly in the future.

Deus Ex: Mankind Divided's deferred renderer makes AA much more demanding. Screenshot by Mary K.

Metro 2033 was one of the first games to broadly implement volumetric lighting, and was considered a spiritual successor to Crysis in its performance impact at higher settings. The same goes for recent and future implementations of contact-hardening shadows. However, I consider this experimentation with new features essential, even if their initial implementations might not be broadly usable due to their performance impact. This is how games progress toward more advanced, more optimized effects in the future. The beauty of the PC platform is that what was an 'unoptimized' effect in 2010 can be rendered at great framerates and make a game look better even on midrange hardware in 2016.

Of course, games that are well and truly unoptimized as a whole do exist. There’s usually a backstory: limited resources, a small developer biting off more than they can chew, or a lack of technical skill. Still, when a game stutters even at its default preset on powerful hardware, or a relatively simple 2D game drops down below 20 FPS on a modern console, then even the most well-meaning analysis leaves no room for other conclusions. Most cases, however, are more difficult to judge, and I hope that this section illustrated that point clearly.

Optimization challenges

The DirectX 12 question

Low-level APIs like DirectX 12 are sometimes treated like a silver bullet in optimization discussions, but many games with DirectX 12 renderers have not shown significant benefits in performance.

For QLOC, unless the engine of a game already fully supports these new APIs, the effort of implementing low-level API support from scratch is not justifiable "for a mild improvement in performance that might anyway turn out to be non-existent once the port is completed".

Dean Sekulic agrees that you "really need to change the paradigm of the rendering engine" in order to reap significant rewards, but he sees enormous potential in Vulkan for the future. His half-joking advice to developers is, "If you use Vulkan, put more objects on screen"—implying that in the future, it could enable scenarios that would be hard or impossible to implement smoothly with current APIs.

The actual process of true code optimization (in the practical computer science meaning outlined earlier) is challenging. For Dean Sekulic, the 'worst optimization nightmare' is "looking at a profiler output and seeing the top function take 3% of the time." To give you a high-level idea of what this means, profilers are tools that tell a programmer how much time is spent in each function of a program, usually sorting those functions by time spent.

When the top function takes 3% of the time, this means that even if you manage to improve it to be twice as fast—which might require herculean efforts—the complete program will only speed up by 1.5%. Since these situations are common in large, competently written code bases, optimizing them further becomes a gradual, laborious task.
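The arithmetic behind that 1.5% figure is an instance of what is commonly called Amdahl's law: the overall gain is limited by the share of time the improved part takes. The sketch below works through it; the function name and the 50% comparison case are mine, added for contrast.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Whole-program speedup when a part taking `fraction` of the runtime
    is made `local_speedup` times faster (Amdahl's law)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# The 'nightmare' profile: the top function takes 3% and is made twice as fast.
flat = overall_speedup(0.03, 2.0)
print(f"flat profile: {1 - 1 / flat:.1%} of the runtime saved")

# For contrast, a hypothetical hot spot taking half of the runtime.
hot = overall_speedup(0.50, 2.0)
print(f"hot spot:     {1 - 1 / hot:.1%} of the runtime saved")
```

Doubling the speed of the 3% function saves 1.5% of the total runtime, while the same effort spent on a function that takes half the runtime would save 25%. A flat profile simply offers no such target.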

QLOC, who port games based on a wide variety of technologies, see their challenges vary a lot from project to project. "Things can be really hard in one game, and far less troublesome in another." One common, difficult aspect they identify is making sure that the game performs well across a very wide range of hardware.

Overall, "there are no silver bullets" in program optimization, as Dean put it.

Optimization is not just about graphics!

Given the general focus of online discussions and reviews, as well as this article so far, this is a point that should be reiterated: optimization is a topic that concerns more than just graphics, although they obviously make up a very significant chunk of the processing time a game spends each frame.

As the development team at QLOC put it: "For us, optimization is also a lot about improving poor decisions with the controls, the gameflow and UI, tweaking the save/load system, improving netcode, and even fixing old bugs from the original title." I can imagine that this point of view is one appreciated by many who have had to fight with mouse acceleration, input lag or byzantine UI decisions in other ports, and I’ve reported on a fair share of those myself.

Rise of the Tomb Raider. Screenshot by Mary K.

Wrapping up

Despite all this detail, this article only very lightly scratches the surface of what goes into optimizing a modern PC game. I hope it provided some insight into the development process, as well as perhaps some hints on which settings to disable if you ever find yourself in need of a few extra FPS.

It wouldn’t have been possible in its current form without insight from the good people at QLOC and Croteam. Croteam offered the perspective of a long-term PC-first developer known for the technical quality of their games. QLOC is known primarily for their porting efforts, and their involvement usually makes PC gamers anticipating a game breathe a sigh of relief. Any errors you might find are mine, not theirs.

I’d like to close this article with two appeals:

When you compare the relative performance of games, try to take into account what they are actually accomplishing. As discussed above, realtime lighting and interactive objects are incomparably harder to present at the same fidelity as static scenes, and there are some graphical phenomena reality throws at us—even minor ones—that are performance intensive to replicate regardless of how optimized their implementation is.

In a similar vein, consider the idea that additional high-end graphics settings, even if they are not fully usable at the time of a release, are never a bad thing compared to not having those settings available in the first place. Their presence doesn't make a game unoptimized. I’ve always believed that coming back to a high-end game many years later and seeing it in even more splendor is one of the many major perks of the PC platform, and it would be sad to see this diminished due to shortsighted judgments about optimization.