After years of waiting, AMD is set to launch its first post-GCN architecture graphics cards next month. This is supposed to be a fundamental reset of AMD's graphics processors, tweaking and tuning every element of the design to improve performance and efficiency—or at least, that's the official word.
There are many changes with Navi, the first new GPU architecture for AMD graphics cards since its RX Vega line in 2017, but after the official reveal of specs and pricing at E3 2019, I have to wonder if this is truly new or simply the next iteration on the existing product line. After many years of playing second string to Nvidia's leading parts, I wanted a clear win from AMD. Navi, sadly, doesn't look to be it. AMD claims a 25 percent IPC improvement per CU, and says overall performance per watt will be 50 percent better than its previous generation Vega and Polaris architectures.
That sounds great, but when you dig into the details, it looks like at best AMD might match the performance of Nvidia's RTX 2060 and 2070 GPUs, except AMD will use more power, potentially cost a bit more, and Navi doesn't include any ray tracing or deep learning features. Considering AMD is using TSMC's latest 7nm manufacturing process, it's a bit of a letdown—a lot like the Fiji, Polaris, and Vega GPUs of the past several years.
Perhaps the RX 5700 XT will perform better than I'm expecting, and certainly a price drop would go a long way to make the new cards more attractive. There's a popular saying with computer hardware: There are no bad products, only bad prices. The initial cards could end up being more of an early adopter Founders Edition launch, with a price drop after the first wave. But whatever happens, there's a ton of information to discuss, so let's get to it.
AMD Navi specs, release date, and pricing
Officially, AMD will launch three RX 5700 series cards on July 7, 2019. There will be a limited RX 5700 XT 50th anniversary edition with slightly higher clocks and gold trim, but the two main models are the Radeon RX 5700 XT and the Radeon RX 5700. Here's the short overview of the specs and pricing of the three models, with comparisons to Vega 64 and RX 590/580:
The RX 5700 XT comes with 40 CUs, each with 64 stream processors, for a grand total of 2,560 graphics cores. It will include 8GB of GDDR6 memory clocked at 14GT/s, good for 448GB/s of bandwidth. Each core can execute an FMA (fused multiply-add) every clock cycle, which counts as two operations per core per clock, and with a boost clock of 1905MHz that works out to a peak computational rating of 9754 GFLOPS (billions of floating point operations per second). The base clock is 1605MHz, while the new Game Clock is 1755MHz (more on that in a moment). The RX 5700 XT also has 160 texture units and 64 ROPs. AMD rates the 5700 XT at a total board power (TBP) of 225W. It will have a launch price of $449 for the reference model, with custom designs from AMD's graphics card partners coming after the initial launch.
The RX 5700 XT anniversary edition is identical to the 5700 XT in most specs, but it comes with a shroud signed by AMD CEO Lisa Su, gold accents to replace the typical AMD red, and slightly higher clockspeeds. With a 1980MHz boost clock, that gives a peak performance of 10138 GFLOPS. TBP is slightly higher at 235W, and the price for the limited edition 50th anniversary card is $499.
Stepping down to the RX 5700, AMD disables four CUs on the GPU, giving 2,304 streaming processor cores in total. That also reduces the texture unit count to 144 (some initial slides from AMD had the wrong figure for TMUs). It still comes with 8GB of 14GT/s GDDR6 memory and 64 ROPs, but clockspeeds are quite a bit lower at 1725MHz boost, 1465MHz base, and 1625MHz for the Game Clock. Using the boost clock, that gives the RX 5700 a peak performance of 7949 GFLOPS, with a substantially lower TBP of 180W. The initial launch price is set at $379.
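The peak compute ratings above follow directly from the published core counts and boost clocks. As a quick sanity check, here's the arithmetic (cores × 2 ops per FMA × clock), using only the numbers from AMD's spec sheets:

```python
# Peak compute = stream processors * 2 ops (one FMA) * clock in GHz -> GFLOPS
def peak_gflops(cores, boost_mhz):
    return cores * 2 * boost_mhz / 1000

cards = {
    "RX 5700 XT":      (2560, 1905),  # 40 CUs x 64 cores, 1905MHz boost
    "RX 5700 XT 50th": (2560, 1980),  # anniversary edition, higher boost
    "RX 5700":         (2304, 1725),  # 36 CUs x 64 cores
}

for name, (cores, mhz) in cards.items():
    print(f"{name}: {peak_gflops(cores, mhz):.0f} GFLOPS")
# RX 5700 XT: 9754, anniversary edition: 10138, RX 5700: 7949
```

The results match AMD's quoted 9754, 10138, and 7949 GFLOPS figures.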
AMD is the first to manufacture a consumer graphics card that uses PCIe 4.0. That doubles the theoretical bandwidth compared to PCIe 3.0, though it's important to note that PCIe speed often isn't a major factor in gaming performance. Graphics cards have a bunch of high-speed VRAM to avoid transferring data over the PCIe bus as much as possible. That's because even an x16 PCIe 4.0 link can only transmit up to 31.51GB/s—a fraction of the bandwidth of the GDDR6 memory. There may be a few edge cases where PCIe 4.0 is useful (CrossFire, or GPU compute workloads), but they're more likely to be in the professional space than in consumer graphics.
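The 31.51GB/s figure comes straight from the PCIe 4.0 spec: 16GT/s per lane with 128b/130b line encoding, across 16 lanes. A short sketch of the math, and how it compares to Navi's GDDR6 bandwidth:

```python
# PCIe 4.0: 16 GT/s per lane, 128b/130b encoding (payload fraction)
lanes = 16
transfer_rate = 16e9          # transfers per second, per lane
encoding = 128 / 130          # usable bits per transferred bit

bytes_per_sec = lanes * transfer_rate * encoding / 8
print(f"x16 PCIe 4.0: {bytes_per_sec / 1e9:.2f} GB/s")          # ~31.51 GB/s
print(f"GDDR6 advantage: {448 / (bytes_per_sec / 1e9):.1f}x")   # ~14x more
```

That roughly 14x gap is why on-card VRAM, not the PCIe link, is the bandwidth that matters for most games.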
All three 5700 models use the same 'Navi 10' GPU, which has a maximum of 40 CUs. It's manufactured using TSMC's latest and greatest 7nm FinFET process, the same process that's used on AMD's upcoming Ryzen 3000 CPUs—as well as Apple's A12 SoC that launched in September 2018. AMD lists the transistor count as 10.3 billion, with a die size of 251mm2. That's substantially smaller than Vega 10 (12.5 billion and 495mm2), and compared to Nvidia's TU106 in the RTX 2060/2070 (10.8 billion and 445mm2) AMD has only slightly fewer transistors packed into a far smaller area.
Those are the core specs, but there are a few extra pieces of information that aren't really covered. First, even though the theoretical performance per streaming core is the same—one FMA per clock—AMD has reworked the architecture relative to previous GPUs and claims 25 percent better IPC, and 50 percent better performance per watt.
You can't really compare raw GFLOPS or TFLOPS numbers (billions or trillions of floating point ops per second) among different GPUs anyway, because architectures behave differently. However, if AMD's 25 percent claim is accurate, that means the 9.75 TFLOPS RX 5700 XT should perform close to a 12.19 TFLOPS Vega GPU. The actual Vega 64 is rated at up to 12.66 TFLOPS, and AMD says the 5700 XT is typically slightly faster (though I haven't been able to verify those numbers yet). Keep in mind that Nvidia's GTX 1080 has a theoretical compute performance of 8873 GFLOPS, but outperforms the Vega 64 on average in our gaming tests, so Nvidia still appears to extract more real-world performance from its hardware.
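With the caveat above that raw TFLOPS don't compare cleanly across architectures, AMD's IPC claim can at least be turned into a rough 'GCN-equivalent' figure. This is purely illustrative arithmetic based on AMD's own 25 percent number:

```python
# Scale RDNA's raw boost-clock throughput by AMD's claimed 25 percent
# IPC gain to estimate a GCN-equivalent rating (illustrative only)
rx5700xt_tflops = 9.754      # RX 5700 XT peak at boost clock
ipc_gain = 1.25              # AMD's claimed per-CU IPC improvement

gcn_equivalent = rx5700xt_tflops * ipc_gain
print(f"GCN-equivalent rating: {gcn_equivalent:.2f} TFLOPS")  # ~12.19
print("Vega 64 rated peak:    12.66 TFLOPS")
```

Which is why a 9.75 TFLOPS Navi card plausibly trades blows with a 12.66 TFLOPS Vega 64.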
The second item is the new "Game Clock" specification. This is a conservative estimate of the typical clockspeeds users will get from the GPU while playing games. This is basically the same as Nvidia's boost clock—Nvidia's cards routinely run at speeds well above the rated boost clock in my experience—and the philosophy is that it's better to under-promise and over-deliver than the other way around. So the Game Clock is a change of tune from previous AMD cards, where the boost clock was more of an optimistic maximum clock for the GPU. There's just one problem.
AMD is reporting computational performance using the boost clock, or at least it used that number in its E3 press briefings and on the product specs pages. We can't say for certain what clockspeeds the Navi GPUs will use during gaming sessions, but AMD's own statements suggest the Game Clock is a better estimate than the boost clock. That would give the RX 5700 XT a rating of 8986 GFLOPS and drop the RX 5700 to 7488 GFLOPS. But keep in mind what I said just a moment ago about not being able to directly compare GFLOPS to determine gaming performance. Drivers and other elements still factor into the equation, even if Navi does deliver better throughput than previous AMD GPUs.
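Those Game Clock ratings use the same cores × 2 ops × clock arithmetic as the boost-clock specs, just evaluated at the lower, more realistic clockspeed:

```python
# Same peak-compute formula, evaluated at AMD's conservative Game Clock
def peak_gflops(cores, clock_mhz):
    return cores * 2 * clock_mhz / 1000

print(f"RX 5700 XT @ 1755MHz: {peak_gflops(2560, 1755):.0f} GFLOPS")  # 8986
print(f"RX 5700    @ 1625MHz: {peak_gflops(2304, 1625):.0f} GFLOPS")  # 7488
```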
The AMD RX 5700 RDNA architecture
Note: the slides above, provided by AMD, supply much of the material for the following architectural discussion.
It's no secret that AMD GPUs have fallen behind Nvidia offerings, in performance, efficiency, and features. This has been the status quo dating back to at least 2014, when Nvidia's Maxwell architecture improved efficiency without sacrificing performance. Since then, each subsequent generation of AMD GPUs has typically used 50-60 percent more power than Nvidia GPUs of similar performance, and at the top of the product stack AMD has been unable to clearly beat Nvidia's fastest GPU since the launch of the GTX 780. AMD graphics cards could still compete on price, and DirectX 12 games typically favor AMD's architectures, but with Navi AMD is finally doing a major redesign.
The problem with graphics architectures is that often we're given very little low-level information. We know how many Compute Units, cores, texture units, and ROPS are in a particular GPU, but other statements are often a bit nebulous. Part of that might be to hide the 'secret sauce' from the competition, but regardless, much of what gets said about graphics architectures has to be taken on faith. I specifically asked AMD engineers whether RDNA is 'truly new' or more of a refinement of GCN, and was repeatedly told that it is a new architecture. But when the same people describe the architectural changes, it still sounds like a refinement and rearrangement of functional units rather than a full reset.
Let's start at the top, with the name change from GCN (Graphics Core Next) to RDNA (Radeon DNA)—it's a new name, which doesn't say much. But there are substantial updates to the underlying architecture. RDNA has a new "dual compute unit" design, with some shared resources including a new L1 cache. Having an L1 cache for a GPU is sort of new, as far as I'm aware. Previously, AMD and Nvidia have included an "L0" cache (basically immediate access) and an L2 cache, with the VRAM functioning as something of a huge but slower L3 cache. CPUs meanwhile don't have an L0 cache as such, but they do have L1/L2/L3 caches. AMD says the addition of the L1 cache helps improve latency and throughput, which in turn improves efficiency.
Perhaps the biggest change is that AMD has changed the format of its wavefront instruction dispatch for RDNA from Wave64 (64 threads) to Wave32 (32 threads)—though it can also split a Wave64 into two Wave32 groups. There's a lot more going on than just a minor change in thread grouping, however. On GCN, a single Wave64 instruction would get executed on a SIMD16 unit (Single Instruction Multiple Data) over four cycles. RDNA changes to a SIMD32 unit, which now allows a single Wave32 instruction to execute in one cycle.
Depending on the work being done, a single RDNA Wave32 can execute in about half the time of a GCN Wave64, and even when both GCN and RDNA are running Wave64, RDNA should reduce instruction execution times by 44 percent. Another change relative to GCN is that each CU now has twice the number of scalar units and schedulers. This helps keep the new Wave32 units occupied with work.
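A naive cycle-count model illustrates the SIMD change. This deliberately ignores pipelining, latency hiding, and cache effects, so it won't reproduce AMD's measured 44 percent figure, but it shows the mechanism behind the Wave32 shift:

```python
import math

# Cycles to issue one wavefront = threads / SIMD width
# (naive model: no pipelining, latency hiding, or cache effects)
def issue_cycles(wave_threads, simd_width):
    return math.ceil(wave_threads / simd_width)

print("GCN  Wave64 on SIMD16:", issue_cycles(64, 16), "cycles")  # 4
print("RDNA Wave32 on SIMD32:", issue_cycles(32, 32), "cycle")   # 1
print("RDNA Wave64 on SIMD32:", issue_cycles(64, 32), "cycles")  # 2
```

Even in Wave64 mode, the wider SIMD32 units halve the naive issue time, which is the "about half the time" claim in simplified form.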
Getting back to the dual CU design, AMD calls this a Work Group Processor. In many GPU workloads there's a lot of overlap—like shading thousands of pixels at a time in a similar fashion. Each WGP shares a shader instruction cache, scalar data cache, and local data share, and when the two CUs reuse any of the items in those caches, the GPU avoids a slower (and less efficient) trip to the L2 cache or VRAM. Load bandwidth from the L0 cache to the ALUs has also been doubled, up to 64 bytes per clock for reads and 128 bytes per clock for writes.
The last of the architectural changes includes improvements to the delta color compression (DCC) algorithm, which allows the shaders and display blocks to work directly with compressed data. That means better use of the available bandwidth for the entire GPU. AMD has also taken some of the lessons from its Zen CPU architecture and applied them to graphics, reducing the number of transistors that switch (and thus consume power) per clock through improved clock gating. That means when a portion of the GPU is idle, rather than continuing to rev the engine, it can quickly slow down and spin back up as needed.
Ultimately, these changes improve execution latency, improve efficiency, and boost performance. How much each part contributes to the whole isn't immediately clear, but AMD claims that in total the CUs in RDNA are a 25 percent improvement in IPC (Instructions Per Clock) relative to GCN. But RDNA also enables higher clockspeeds, which ends up making it 50 percent faster at the same power level as GCN (Vega). That means in theory AMD could have a 225W RDNA GPU that will perform 50 percent better than a 225W GCN GPU. Or put another way, the RX 5700 XT should be about 50 percent faster than the RX 590—or if you prefer, the 225W RX 5700 XT with 40 CUs is able to slightly outperform the 295W Vega 64 (about 25 percent less power and perhaps 10 percent better performance).
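Those claims can be roughly cross-checked against each other: if the 225W RX 5700 XT really is about 10 percent faster than the 295W Vega 64, the implied performance-per-watt gain lands in the neighborhood of AMD's 50 percent figure. A quick back-of-the-envelope calculation:

```python
# Implied perf/watt gain from AMD's RX 5700 XT vs Vega 64 comparison
vega64_power = 295     # Vega 64 TBP in watts
navi_power = 225       # RX 5700 XT TBP in watts
perf_ratio = 1.10      # ~10 percent faster, per AMD's claims

gain = perf_ratio / (navi_power / vega64_power)
print(f"Implied perf/watt improvement: {(gain - 1) * 100:.0f} percent")
```

That works out to roughly 44 percent—close enough to the 50 percent claim, given that the "perhaps 10 percent" performance figure is itself an estimate.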
There are a lot of other factors that go into RDNA and the RX 5700 series. The GPUs are manufactured on a new 7nm process, there are power and efficiency improvements, and finally there are architectural updates that improve performance per clock. AMD says that the 50 percent improvement in performance per watt ends up split among these, but with the majority of gains coming from the architecture—over 50 percent. The 7nm process node also contributes a big chunk of around 30 percent of the total gains, with the final 15 percent or so coming from the power and efficiency improvements.
All of these changes apply to RDNA in general, but at present AMD has only announced a single GPU—Navi 10. That chip is used in all three RX 5700 models, which is similar to what we saw with the previous generation Polaris 10 GPUs (RX 470/480 and RX 570/580). There are rumors of a Navi 12 GPU that will go into a lower cost and lower performance product (RX 5600). But the bigger deal will be a larger Navi 20 GPU rumored for launch in 2020.
Unlike Navi 10/12, Navi 20 will include ray tracing features. It's assumed a smaller variant of Navi 20 will be in the next generation PlayStation and Xbox consoles, with a larger model going into future extreme performance graphics cards. Basically, that will be the sequel to the Vega 64 and Radeon VII. The concern of course is that it could once again be too little, too late. Nvidia's RTX 20-series parts haven't exactly been flying off the shelves, but they still offer features AMD doesn't match, at a similar price and with better efficiency. Nvidia will likely have RTX 30-series on 7nm sometime in 2020. But that's a story for another day.
AMD Radeon RX 5700 performance expectations
I don't have my own benchmark results for RX 5700 XT or RX 5700 yet, so the only thing to go on is AMD's initial performance claims. I have a rule of thumb when it comes to manufacturer numbers: subtract five percent. It's not that AMD, Intel, and Nvidia provide false numbers, but usually the benchmarks that are shown tend to be slightly more in favor of their products than a 'typical' average. And what AMD showed at E3 is performance for the RX 5700 XT that's on average 31 percent faster than the RX Vega 56—and about six percent faster than an RTX 2070.
After subtracting five percent from AMD's result, that would be pretty much a tie with my own performance testing: across a test suite of 13 games, the RTX 2070 beats the RX Vega 56 by 29 percent at 1440p and 4K. Basically, the RX 5700 XT and RTX 2070 are going to be very close in performance, and a few percent more or less isn't really going to radically change things. RTX 2070 performance is good—better than the GTX 1080 and Vega 64—so AMD certainly has a capable card based on these early numbers.
The RX 5700 is potentially even more interesting. In AMD's own ten-game test set, its numbers have the RX 5700 beating the RTX 2060 by 11 percent. Unfortunately, AMD doesn't provide any other performance reference, but based on specs it looks like the RX 5700 will be roughly 15-20 percent slower than the RX 5700 XT. That's still faster than an RX Vega 56, and again likely faster than RTX 2060. AMD should also gain an advantage by having 8GB of GDDR6 and 448GB/s of bandwidth.
Except, Nvidia appears to have other plans in store. Official pricing and specs haven't been revealed yet, but all indications are that Nvidia will be releasing RTX 2060 Super and RTX 2070 Super graphics cards in the near future. Word is that the 2060 Super will also have 8GB of GDDR6, plus additional CUDA cores compared to the current 2060, and the same goes for the 2070 Super—though it will still be 'limited' to 8GB GDDR6. (A 2080 Super is also in the works, at a higher price and performance target than anything AMD has planned.) If Nvidia keeps TDP (TGP) and price the same and boosts performance by 15-25 percent, which seems plausible, the Radeon RX 5700 cards will suddenly fall short of the new competition.
But remember what I said earlier: there are rarely bad products, only bad prices. TSMC 7nm likely costs 20-40 percent more than TSMC 12nm, just because it's the new cutting-edge process and is more complex, but with a die size of 251mm2 AMD should be making a healthy profit on the RX 5700 with the launch prices. Don't be surprised if retail prices drop after a few months by $50 or even $100 to compensate for any perceived lack of performance and features relative to Nvidia's cards.
Commence the countdown
At this stage, we know just about everything other than independent benchmarks, overclocking, and street pricing. AMD appears to be launching the RX 5700 series as reference designs first, so on July 7 all the RX 5700 XT cards will look the same, apart from a manufacturer logo (e.g., Sapphire, Asus, Gigabyte, MSI, XFX). Custom cards will come a month or two later, and some will have factory overclocks, though manual overclocking of the reference cards will be possible.
I don't want to jump ahead without cards in hand, but the prices are definitely higher than many expected. All the rumors leading up to E3 talked about AMD going after the mainstream market where most GPUs are sold—just look at the Steam Hardware Survey and you'll see the GTX 1060, 1050 Ti, and 1050 are the top three cards, and combined they represent roughly a third of all Steam PCs. The 1070 sits in fourth place, a card from three years ago that was our recommended buy for most of that time. Launching at $379 and $449 could be a deal-breaker for many potential buyers.
(And yes, you can insert the typical disclaimers about Steam's statistics and what we do and don't know. But the charts look pretty reasonable in terms of market share on most cards. The main point is that truly mainstream cards that cost less than $250 tend to be far more popular than cards that cost $380 or more.)
Of course many of the early 'rumors' and 'leaks' felt like fanboy fantasies. An RTX 2080 competitor for $249? Yeah, that was never going to happen. But the $449 price would have been far more acceptable if we were looking at something closer to RTX 2080 rather than an RTX 2070 competitor. Plus there's still that whole business with Nvidia's Super cards.
This is the problem with Nvidia having such a controlling position in the graphics card market. Nvidia launched RTX ten months ago, at prices many outright rejected. The cards are technically faster than the outgoing models they replaced, and they have new features that you may or may not need (ray tracing and DLSS), but each level of performance increased the price. A GTX 1080 Ti cost $699, and the RTX 2080 is slightly faster (3-4 percent on average in my current test suite) at the same $699 price point. Or if you prefer, the RTX 2070 is about 10-13 percent faster than the GTX 1080 at the same $499 target, while the RTX 2060 is 12-15 percent faster than the GTX 1070 for $349. The new norm is that each model series jumped one level in pricing.
Now AMD shows up almost a year later with hardware that apparently offers similar performance but lacks some features, and targets roughly the same Nvidia price categories. That's not what we wanted, and Nvidia's response should have been anticipated: speed bumped models with extra cores, memory, and/or clockspeeds at similar prices, to ensure Nvidia stays ahead. We don't have performance numbers for RTX 2060 Super or RTX 2070 Super yet, but if they keep the $349 and $499 pricing of the non-Super cards and boost performance 15-25 percent (which seems entirely possible), AMD GPUs are back to the old status quo for high-end offerings: pay more, get less.
Sometime in the next two weeks I expect to have performance numbers for AMD's Radeon RX 5700 cards. It's possible they'll land higher than expected. I also expect I'll get some of Nvidia's Super cards, which will change the targets AMD needs to beat. We'll have to wait for the final verdict, but I'm getting a sense of déjà vu. But don't worry, there's always RDNA2—coming in 2020!