Ashes: single-GPU DX12 performance
A quick note on the graphs: We're including both average and 97 percentile frame rates, as usual. We've colored Nvidia cards in blue and AMD cards in red, but sorting presents a problem. Do we sort on the average, the 97 percentile, or a combination of the two? We've elected to sort based on the geometric mean of the average counted twice and the 97 percentile counted once, or Geomean(avg, avg, 97p). Basically, we're weighting the average more than the 97 percentile, but the 97 percentile, which stands in for minimum frame rates, still counts.
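For anyone who wants to reproduce that sort order, here's a minimal Python sketch of Geomean(avg, avg, 97p). The function name, the card labels, and the frame rates are all made up for illustration; they are not measured results.

```python
from statistics import geometric_mean


def sort_score(avg_fps, p97_fps):
    """Geomean(avg, avg, 97p): the average counts twice, the 97 percentile once."""
    return geometric_mean([avg_fps, avg_fps, p97_fps])


# Hypothetical placeholder numbers (NOT benchmark data): label -> (avg, 97p)
results = {
    "card_a": (50.0, 30.0),
    "card_b": (48.0, 40.0),
    "card_c": (52.0, 25.0),
}

# Highest score first, the way the charts are ordered.
ranked = sorted(results, key=lambda name: sort_score(*results[name]), reverse=True)
print(ranked)
```

With these placeholder numbers, card_c has the highest average but lands last because its 97 percentile is so low, which is exactly the kind of result the weighting is meant to catch.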
Starting with the 1080p Crazy performance, how's this for a sobering thought: Not a single GPU is able to break 60FPS. Cranking up the resolution will only serve to drop performance even more, though the name of the setting says it all. There's also an Extreme setting that's probably a better target for most high-end configurations, but we wanted to see what would happen with Ashes at its most demanding. What's interesting is how big the gap is between the Fury X and the Titan X, with AMD leading by a solid 20 percent—and 25 percent over the 980 Ti. If this happened in more games, the Fury X launch would have been an entirely different story. The R9 390 is also ahead of the GTX 980 by nearly 10 percent, and that pattern continues down all levels of GPUs, though 970 and below all drop under the 30FPS mark.
Of course the question has to be asked: Is this a case of AMD's hardware being superior, and their DirectX 11 drivers basically sucking it up? Or is Oxide's game engine simply better tuned for AMD's hardware? Most likely it's a little of column A, and a little of column B. I've often wondered what AMD's performance would look like if they had the driver team resources of Nvidia. Even now, there are times when AMD's GPUs seem to stutter more than they should, and frankly their drivers often feel a step or two behind Nvidia's. There are other tests out there that seem to confirm AMD's implementation of asynchronous compute in GCN has more potential than what's in Nvidia's Maxwell architecture, but the reality is most games don't show any benefits.
Our two other charts and settings don't generally show a big difference in performance, but we'll see some differences when we get to the multi-adapter testing. Basically, the High setting still needs more than 2GB of VRAM to really shine. And remember what we just said about drivers? Look at the difference between the R9 285 and the GTX 950/960. What's going on that AMD's 2GB card doesn't scale as well as their 4GB cards? There's a massive gap between the R9 380 4GB and the 2GB cards (the R9 285 and the GTX 960/950), but the gap between the 380 and the 960/950 narrows quite a bit at the High setting compared to Crazy, while the 285 drops to the bottom of the pack. Everything else stays reasonably consistent with the Crazy numbers.
The big factor with Ashes appears to be having at least 4GB of VRAM, unless you're running at low quality settings. 4GB vs. 8GB of VRAM, on the other hand, doesn't show much of a separation, judging by the R9 290X and R9 390. The R9 390 is always the faster card, but the lead is a steady 5-10 percent, regardless of settings. Things might change if we pushed for a higher resolution, but it would make sense to target 4GB of VRAM, considering that's where a lot of these higher-end cards sit.
With the single GPU stage complete, we now have a better idea of how to pair up cards for our multi-GPU testing. This is barely the beginning, with only twelve configurations tested. We're going to break things up into two categories for multi-GPU testing: homogeneous, where we use two AMD or two Nvidia cards, and heterogeneous, where we mix AMD and Nvidia cards. Just how many more configurations did we manage to test? Fifty. Nine. We do the hard work so that hopefully, you won't have to!
Homogeneous dual-GPU DX12 performance
Even though we're pairing up AMD cards with AMD cards, and Nvidia cards with Nvidia cards, note that the vast majority of configurations are not at all like what you would normally see with an SLI or CrossFire setup. In fact, outside of R9 390 + 290X and R9 390X + 390, we don't even have any pairs of AMD cards that would normally run in CrossFire. We also didn't have our Nano or vanilla Fury cards on hand, which further limits some of our combinations, but we've tested what we currently have.
Before we get to the charts, one more item of information is worth pointing out. Normally when we use a game to test performance, we see some variation between benchmark runs, but it's pretty small. Ashes follows that same pattern when testing single GPUs, most of the time, but when we switch to multiple GPUs things can get a little messy. There are times when adding a second GPU actually drops performance compared to using just the faster of the two GPUs, typically when the two GPUs aren't very closely matched. There are other times when a configuration that by all appearances should perform better doesn't work as expected.
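One simple way to spot those cases is to compare each pairing against the faster of its two cards running alone. Here's a rough Python sketch; the function name and the frame rates are hypothetical and only there to show the arithmetic.

```python
def dual_gpu_scaling(single_a_fps, single_b_fps, dual_fps):
    """Dual-GPU frame rate relative to the faster of the two single cards.

    A value below 1.0 means the second GPU made things slower.
    """
    best_single = max(single_a_fps, single_b_fps)
    return dual_fps / best_single


# Hypothetical placeholder numbers (NOT benchmark data).
ratio = dual_gpu_scaling(single_a_fps=40.0, single_b_fps=25.0, dual_fps=36.0)
is_regression = ratio < 1.0
print(f"scaling: {ratio:.2f}, regression: {is_regression}")
```

Comparing against the faster card rather than the sum of both is the stricter test: a mismatched pair only has to beat what you'd get by leaving the slower card out of the system.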
With DX12, or any other low-level API, the developers are responsible for far more of the rendering process. Algorithms that work great on one combination of hardware may behave erratically with only a seemingly small change in hardware. We've also noticed quite a few cases where rendering would totally break, and usually we'd expect something that doesn't work right once to always be broken. That isn't the case either; the breakage comes and goes. Most of the rendering issues we encountered were on the Crazy preset, but there were a few anomalies on High and Standard as well. We did try to verify any unusual results, though one or two may have slipped through the cracks. Below, we'll comment on some of the more consistent issues as well as unexpectedly high performance combinations.
With 1080p Crazy, AMD's Fury X paired with the 390 takes the lead. Average frame rates are slightly lower than the Titan X + 980 Ti, as well as the dual 980 Ti setup, but minimum frame rates are much better. If that's all there was to the story, we'd move on, but check out the Nvidia pairings where the second card is a 970 and you'll notice something unusual. The 980 Ti, 980, and 970 all work quite well when the second GPU is a 970; this seems like a pattern, but then we find the Titan X + 970 sitting way down the charts. It doesn't really make much sense, but our best guess is that whatever algorithms Oxide is using don't do so well when the primary GPU has substantially more RAM than the secondary GPU.
There are other clear performance issues as well. A dual GTX 980 Ti setup has good average frame rates, but the 97 percentiles are comparatively poor. Dual GTX 980 is slightly slower on both average and 97 percentiles, but dual GTX 970 does really well. Given the GTX 970 is arguably one of the most popular GPUs right now, it could be that Oxide has simply optimized better for that Nvidia card. But what you really need to understand is that there are lots of cases where things just aren't consistent. On one run, a particular pairing could run smoothly, but on the next four runs it might stutter horribly. Rather than trying to run each particular set of cards enough times to get a statistically meaningful distribution of performance, we're using the current results to show that there is plenty of work still remaining.
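If you do want to put a number on that run-to-run inconsistency, the coefficient of variation (standard deviation divided by the mean) is a reasonable starting point. This is only a sketch: the function names, the 5 percent threshold, and the sample runs below are assumptions for illustration, not something we measured or a rule we applied.

```python
from statistics import mean, stdev


def run_to_run_variation(fps_runs):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return stdev(fps_runs) / mean(fps_runs)


def is_consistent(fps_runs, threshold=0.05):
    # The 5 percent cutoff is an arbitrary assumption, not an established rule.
    return run_to_run_variation(fps_runs) <= threshold


# Hypothetical placeholder runs (NOT benchmark data).
steady_runs = [60.0, 61.0, 59.0]
erratic_runs = [60.0, 30.0, 31.0, 29.0, 30.0]  # one smooth run, then four stuttering ones

print(is_consistent(steady_runs), is_consistent(erratic_runs))
```

A pairing that flips between smooth and stuttering runs shows up with a large variation, which tells you its average alone shouldn't be trusted.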
All of the oddities from testing the Crazy settings largely disappear at the High preset. We're also pretty clearly hitting some system bottlenecks at this point, where many configurations have average frame rates in the mid-90s, with 97 percentiles in the low-to-mid 60s. Perhaps using the Crazy preset was a bad idea, or at least it tends to break down more than the High preset. Matched or nearly matched GPUs now take the top spots, with Nvidia just barely edging out AMD (at a much higher price point). In terms of bang for the buck, the two best options we tested are the nearly matched Hawaii cards, 390 and 290X, and the matched GTX 970 cards.
At the Standard preset (without AA), performance is mostly in line with the High preset. The major difference is that cards with 2GB VRAM don't struggle nearly as much. That makes sense, considering we saw the same thing in testing with single GPUs. Our CPU also continues to struggle to break the 100FPS average, and for the faster GPUs we actually see several cases where performance is worse here than at the High preset. Yeah, how do you get hardware to do less work more slowly?
If we get a chance, we may benchmark a few of the better pairings on one or two other CPUs to see what happens. Running a 6-core i7-5930K at 4.2GHz should normally ensure the CPU isn't much of a bottleneck, but DX12 might be changing the status quo. Could a quad-core Skylake system end up delivering better overall performance, or will it be worse? What about AMD's FX series of CPUs, and their Kaveri APUs? We had hoped to check those out, but time has not been on our side.