If current indications are to be believed—those indications being signs and teasers direct from Nvidia itself—the company's next graphics card is called the Nvidia GeForce RTX 2080, and it will launch (or at least be officially revealed) on August 20. But there are still a lot of questions, like what will RTX 2080 performance be like, and how much will the RTX 2080 cost? Let's dive in.
It's been a while since Nvidia introduced its last new graphics architecture for gaming GPUs. That last architecture was Pascal, and it has powered everything from the top-tier GTX 1080 and GTX 1080 Ti to the entry-level GTX 1050 and GT 1030. It's in many of the best graphics cards you can buy right now. The next generation of Nvidia graphics cards is finally approaching, using the Turing architecture. Here's what we know about Turing and the RTX 2080, what we expect in terms of price, specs, and release date, and the winding path we've traveled between Pascal and Turing.
Details emerge for the GeForce RTX 2080
What ray-tracing means for games
Why is ray-tracing such a big deal, and what does it mean for games? We wrote this primer on ray-tracing when Microsoft unveiled its DirectX Raytracing API.
Previously, there was a ton of speculation, and plenty of guesses, as to what the Turing architecture would contain. Let me just put this out there: Every single one of those guesses was wrong. Chew on that for a moment. All the supposed leaks and benchmarks? They were faked. Even the naming guesses were wrong. Nvidia's CEO Jensen Huang unveiled many core details of the Turing architecture at SIGGRAPH, finally putting to bed all the rumor-mongering. Combined with a teaser video for the next GeForce cards, those details tell us most of what will be in the GeForce RTX 2080.
Nvidia has been extremely tight-lipped about its future GPUs this round, but with an anticipated announcement at Gamescom as part of Nvidia's GeForce gaming celebration, plus the SIGGRAPH Turing architecture details, the name is now pretty clear. GTX branding is out, RTX (for real-time ray-tracing) is in; 11-series numbers are out, and 20-series numbers are in. Nvidia also recently trademarked both GeForce RTX and Quadro RTX, and while it's possible GTX parts might coexist with RTX parts, I'd be surprised if Nvidia chose to go that route. The new cards apparently start with the GeForce RTX 2080 and will trickle down to other models over the coming months.
Moving on to the Turing architecture, this is where Nvidia really kept some surprises hidden from the rumor mill. The Volta architecture has some features that we weren't sure would get ported over to the GeForce line, but Nvidia appears ready to do that and more. The Turing architecture includes the new Tensor cores that were first used in the Volta GV100, and then it adds in RT cores to assist with ray-tracing. That could be important considering Microsoft's recent creation of the DirectX Raytracing API.
The Quadro RTX professional GPUs will have both core types enabled, though it's still possible Nvidia will flip a switch to disable the Tensor cores in GeForce—the RT cores on the other hand need to stick around, or else the GeForce RTX branding wouldn't make sense. However, Nvidia also revealed a new anti-aliasing algorithm called DLAA, Deep Learning Anti-Aliasing, which implies the use of Tensor cores.
Initially, Turing GPUs will be manufactured using TSMC's 12nm FinFET process. We may see later Turing models manufactured by Samsung, as was the case with the GTX 1050/1050 Ti and GT 1030 Pascal parts, but the first parts will come from TSMC. One particularly surprising revelation that comes by way of the Quadro RTX announcement is that the top Turing design will have 18.6 billion transistors and measure 754mm². That's a huge chip, far larger than the GP102 used in the GTX 1080 Ti (471mm² and 11.8 billion transistors) and only slightly smaller than the Volta GV100. That also means the new RTX 2080 will likely remain as the top product in the 20-series stack, or else Nvidia will call it the RTX 2080 Ti.
What does the move to 12nm from 16nm mean in practice? Various sources indicate TSMC's 12nm is more of a refinement and tweak to the existing 16nm rather than a true reduction in feature sizes. In that sense, 12nm is more of a marketing term than a true die shrink, but optimizations to the process technology over the past two years should help improve clockspeeds, chip density, and power use—the holy trinity of faster, smaller, and cooler running chips. TSMC's 12nm FinFET process is also mature at this point, with good yields, allowing Nvidia to create such a large GPU design.
We also know the maximum core counts for the CUDA cores and Tensor cores, along with the target speed for the RT cores (Nvidia hasn't discussed the specifics of how many RT cores are used). The top Turing design allows for up to 4,608 CUDA cores, an increase of 20 percent relative to the full GP102, and 29 percent more than the GTX 1080 Ti. Turing can deliver 16 TFLOPS of computational performance from the CUDA cores (FP32), which indicates a clockspeed of around 1700MHz. That's also about 35 percent faster than the GTX 1080 Ti, if the same performance gets put into a GeForce card.
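The clockspeed and core-count comparisons above can be checked with a few lines of Python. This is only a sketch: it assumes each CUDA core performs one fused multiply-add per clock (counted as two floating-point operations), and the GP102 and GTX 1080 Ti core counts are the public Pascal specs rather than numbers from Nvidia's Turing announcement. The function names are just for illustration.

```python
# Figures quoted above, plus public Pascal specs (labeled below).
TURING_CUDA_CORES = 4608
TURING_FP32_TFLOPS = 16.0
GP102_CUDA_CORES = 3840        # full GP102 die (public spec)
GTX_1080_TI_CUDA_CORES = 3584  # public spec

# Assumption: one fused multiply-add per core per clock,
# conventionally counted as two floating-point operations.
FLOPS_PER_CORE_PER_CLOCK = 2


def implied_clock_mhz(tflops: float, cores: int) -> float:
    """Clock speed (MHz) needed for `cores` to reach `tflops` of FP32 throughput."""
    flops_per_clock = cores * FLOPS_PER_CORE_PER_CLOCK
    return tflops * 1e12 / flops_per_clock / 1e6


def percent_more(new: float, old: float) -> float:
    """How much larger `new` is than `old`, in percent."""
    return (new / old - 1) * 100


if __name__ == "__main__":
    clock = implied_clock_mhz(TURING_FP32_TFLOPS, TURING_CUDA_CORES)
    print(f"Implied clock: {clock:.0f} MHz")
    print(f"vs. GP102: +{percent_more(TURING_CUDA_CORES, GP102_CUDA_CORES):.0f}% cores")
    print(f"vs. GTX 1080 Ti: +{percent_more(TURING_CUDA_CORES, GTX_1080_TI_CUDA_CORES):.0f}% cores")
```

Under those assumptions the implied clock lands a little above 1700MHz, and the core-count gains round to 20 and 29 percent.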
Turing also has 576 Tensor cores capable of 125 TFLOPS of FP16 performance (576 * 64 * 2 * 1700MHz again), and the RT cores can do up to 10 GigaRays/sec of ray-tracing computation, which is 25 times faster than what could be done with the general-purpose hardware found in Pascal GPUs. Finally, the Turing architecture introduces the ability to run floating-point and integer workloads in parallel, at 16 trillion operations per second for each, which should help improve other aspects of performance.
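The Tensor core figure follows from the formula in parentheses. As a rough sketch, taking 64 multiply-accumulate operations per Tensor core per clock (the multiplier the formula uses) and the estimated 1700MHz clock, which is an inference rather than an announced spec:

```python
TENSOR_CORES = 576
FMA_PER_TENSOR_CORE_PER_CLOCK = 64  # multiplier from the formula above
OPS_PER_FMA = 2                     # one multiply plus one add
ESTIMATED_CLOCK_HZ = 1.7e9          # estimated clock, not an official figure


def tensor_tflops(cores: int, clock_hz: float) -> float:
    """Peak FP16 Tensor throughput in TFLOPS for the given core count and clock."""
    ops_per_clock = cores * FMA_PER_TENSOR_CORE_PER_CLOCK * OPS_PER_FMA
    return ops_per_clock * clock_hz / 1e12


if __name__ == "__main__":
    print(f"{tensor_tflops(TENSOR_CORES, ESTIMATED_CLOCK_HZ):.1f} TFLOPS")
```

The result comes out just over 125 TFLOPS, matching Nvidia's number.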
Nvidia appears to have a second Turing design with up to 3,072 CUDA cores, 384 Tensor cores, and 6 GigaRays/sec from the RT cores. It's possible this is just a harvested version of the larger chip, but that would mean disabling about a third of the design, and that's not usually necessary. A smaller chip would be a good candidate for RTX 2060, with RTX 2070 using a harvested version of the larger design, or the naming may shake out differently. Regardless of which names use which chips, Nvidia will have the ability to disable certain units within each design, allowing for various levels of performance.
Moving along, Turing will use GDDR6 memory. Based on the Quadro RTX models, there are two chip designs, one with a 384-bit interface and 24GB/48GB of GDDR6, and the other with a 256-bit interface and 16GB GDDR6. Nvidia is avoiding the use of HBM2, due to costs and other factors, and GDDR6 delivers higher performance than GDDR5X. While GDDR6 officially has a target speed of 14-16 GT/s, and Micron has demonstrated 18 GT/s modules, the first Turing cards appear to go with the bottom of that range and will run at 14 GT/s. Nvidia states that Turing will use Samsung 16Gb modules for the Quadro RTX cards, so it looks like it's going whole hog and doubling VRAM capacities for the upcoming generation of graphics cards (unless there will also be 8Gb GDDR6 modules).
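The capacities fall out of the bus width and module density. Here's a minimal sketch, assuming each GDDR6 chip sits on a 32-bit slice of the bus, and that the 48GB configuration doubles up two chips per slice (a "clamshell" layout, which is my assumption about how the larger Quadro reaches that capacity). The `capacity_gb` helper is purely illustrative.

```python
BITS_PER_CHIP = 32  # assumption: one GDDR6 chip per 32-bit slice of the bus


def capacity_gb(bus_width_bits: int, module_gbit: int, chips_per_slice: int = 1) -> float:
    """Total VRAM in GB for a given bus width and per-chip density (in gigabits)."""
    slices = bus_width_bits // BITS_PER_CHIP
    total_gbit = slices * chips_per_slice * module_gbit
    return total_gbit / 8  # 8 bits per byte


if __name__ == "__main__":
    print(capacity_gb(384, 16))                     # 384-bit, 16Gb modules
    print(capacity_gb(384, 16, chips_per_slice=2))  # same bus, two chips per slice
    print(capacity_gb(256, 16))                     # 256-bit, 16Gb modules
    print(capacity_gb(256, 8))                      # 256-bit with hypothetical 8Gb modules
```

With 16Gb modules that gives 24GB, 48GB, and 16GB for the three Quadro configurations, while 8Gb modules on a 256-bit bus would match the 8GB of today's GTX 1080.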
With a transfer rate of 14 GT/s, the 384-bit interface has 672GB/s of bandwidth, and the 256-bit interface provides 448GB/s. Both represent massive improvements in bandwidth relative to the 1080 Ti and 1080/1070. We may also see higher clocked GDDR6 designs in the future, potentially with a narrower bus. Nvidia hasn't done a deep dive on the ray-tracing aspects yet, but I suspect having more memory and more memory bandwidth will be necessary to reach the performance levels Nvidia has revealed.
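Memory bandwidth is simply the per-pin transfer rate multiplied by the bus width, divided by eight to convert bits to bytes. A quick sketch follows. The Pascal figures used for comparison are the public specs for those cards, not part of Nvidia's Turing announcement.

```python
def bandwidth_gb_per_s(transfer_rate_gtps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s from per-pin rate (GT/s) and bus width (bits)."""
    return transfer_rate_gtps * bus_width_bits / 8


# (transfer rate in GT/s, bus width in bits)
CONFIGS = {
    "Turing 384-bit GDDR6": (14, 384),
    "Turing 256-bit GDDR6": (14, 256),
    "GTX 1080 Ti (GDDR5X)": (11, 352),  # public spec
    "GTX 1080 (GDDR5X)": (10, 256),     # public spec
    "GTX 1070 (GDDR5)": (8, 256),       # public spec
}

if __name__ == "__main__":
    for name, (rate, width) in CONFIGS.items():
        print(f"{name}: {bandwidth_gb_per_s(rate, width):.0f} GB/s")
```

That puts the 384-bit design about 39 percent ahead of the GTX 1080 Ti's 484GB/s, and the 256-bit design 40 percent ahead of the GTX 1080's 320GB/s.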
The above image shows a die shot of Turing, for the largest design with up to 4,608 CUDA cores. From a high level, there are six large groups that are repeated around the chip. Each group in turn has 24 smaller clusters of chip logic, and within each of those clusters there appear to be 32 small blocks. 24 * 6 * 32 = 4,608, indicating the smallest rectangular shapes are CUDA cores.
If Nvidia sticks with its recent Pascal and Maxwell ratios, four blocks of 32 CUDA cores make up one streaming multiprocessor (SM) of 128 CUDA cores, giving Turing 36 SMs in total. 16 Tensor cores in each SM give the 576 total Tensor cores as well, and each of the six large groups could come equipped with two 32-bit memory interfaces to give the 384-bit GDDR6 interface.
For the smaller variant of Turing, chop the above design down to four large groups instead of six. That gives the expected 3,072 CUDA cores across 24 SMs, 384 Tensor cores, and a 256-bit interface. So far so good.
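The die-shot arithmetic is easy to tabulate for both variants. This sketch encodes the guesses above as defaults: 128 CUDA cores and 16 Tensor cores per SM, and two 32-bit memory interfaces per large group (the per-group reading is what produces 384 and 256 bits). None of it is confirmed by Nvidia, and the `TuringLayout` class is just an illustrative container.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuringLayout:
    """Speculative chip layout read off the die shot; every default is a guess."""
    groups: int
    clusters_per_group: int = 24
    cores_per_cluster: int = 32
    cores_per_sm: int = 128            # assumed Pascal/Maxwell-style ratio
    tensor_cores_per_sm: int = 16
    mem_interfaces_per_group: int = 2
    bits_per_mem_interface: int = 32

    @property
    def cuda_cores(self) -> int:
        return self.groups * self.clusters_per_group * self.cores_per_cluster

    @property
    def sms(self) -> int:
        return self.cuda_cores // self.cores_per_sm

    @property
    def tensor_cores(self) -> int:
        return self.sms * self.tensor_cores_per_sm

    @property
    def bus_width_bits(self) -> int:
        return self.groups * self.mem_interfaces_per_group * self.bits_per_mem_interface


if __name__ == "__main__":
    for name, chip in (("big", TuringLayout(groups=6)), ("small", TuringLayout(groups=4))):
        print(name, chip.cuda_cores, chip.sms, chip.tensor_cores, chip.bus_width_bits)
```

Six groups reproduce the 4,608 / 36 / 576 / 384-bit figures, and four groups reproduce 3,072 / 24 / 384 / 256-bit.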
The big question is where the RT cores reside. We don't have any figure for how many RT cores are in the architecture, just a performance number of 10 GigaRays/sec. The RT cores are likely built into the SMs, but without a better image it's difficult to say exactly where they might be. The RT cores might also reside in a separate area, like the center block. We'll have to wait for further clarification from Nvidia on this subject.
How much will the RTX 2080 cost?
With the revelation of the RTX branding and a focus on real-time ray-tracing as the hot new technology, pricing and other aspects become far less clear. Nvidia might be revamping the entire product stack to turn the RTX 2080 into a $1,000 GPU, or it might hold off on such a high-end part and instead release RTX 2080 using the smaller of the two chips discussed above. That would allow Nvidia to follow up with an RTX 2080 Ti or Titan using the larger die for a spring refresh.
The smaller Turing design seems a safer bet for a high-end $700-$800 GPU. Even with 'only' 3,072 CUDA cores, architectural enhancements could allow such a design to outperform the GTX 1080 Ti in existing gaming workloads, with ray-tracing as a new feature that can only be done with the new hardware. The ability to do both integer and floating-point workloads in parallel could be part of Nvidia's plan to do more with less. Die size would also be far more manageable, at around 500mm² compared to 754mm².
Given the various component costs, I can't imagine a 24GB GeForce RTX 2080 (or RTX 2080 Ti) costing less than $1,000, and $1,500 isn't out of the question. Nvidia calls this the biggest jump in architecture in more than a decade, since it released the first GPUs with CUDA cores. A big jump in features often means a similar jump in pricing.
Best guess is that the RTX 2080 that launches in August or early September will use the smaller Turing die with 3,072 cores. It will have 16GB GDDR6 and use clockspeed and architectural improvements to surpass the existing GTX 1080 Ti. The door remains open for a future 20-series Ti part that costs more and offers even higher performance.
When is the RTX 2080 release date?
Nvidia spilled the beans on many aspects of the Turing architecture at SIGGRAPH, indicating the launch of GeForce models is imminent. Nvidia has also planned a GeForce gaming celebration for August 20 at Gamescom, and we expect to hear final specs and pricing at that point. If it's like the 1080 Ti launch, official reviews and hardware should follow not too long after. The current information suggests a staggered rollout of the 20-series GPUs as follows:
- GeForce RTX 2080: August 30
- GeForce RTX 2070: September 27
- GeForce RTX 2060: October 25
The 2070 and 2060 dates are based on a rumor, so take everything other than the initial RTX 2080 launch date with a grain of salt. They're all likely accurate to within a couple of weeks, however, and launching all the next-generation GPUs in the holiday shopping season makes a lot of sense.
What about the GeForce RTX 2070?
Nvidia has a well-trodden path when it comes to graphics card launches. It starts with a high-end card like the 780/980/1080, and either simultaneously or shortly afterward releases the 'sensible alternative' GTX 770/970/1070. With the 900-series, both parts launched at the same time, while the 700-series had a one-week difference and the 10-series had a two-week gap between the parts. The slightly staggered rollout will probably happen with the 20-series, so 2-4 weeks between the 2080 and 2070 launches.
As for specs, again Nvidia's standard practice is to offer a trimmed down version of the same GPU core, essentially harvesting chips that can't work as the full 2080 and selling them as a 2070. The idea is to end up with performance about 20-25 percent slower and a price that's 30-35 percent lower.
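To see why that trade-off tends to favor the cheaper card, compare performance per dollar at the edges of those ranges. A small sketch, using only the percentages above (the `relative_value` helper is just for illustration):

```python
def relative_value(perf_drop_pct: float, price_drop_pct: float) -> float:
    """Performance per dollar of the cut-down card relative to the full card (1.0 = equal)."""
    relative_perf = 1 - perf_drop_pct / 100
    relative_price = 1 - price_drop_pct / 100
    return relative_perf / relative_price


if __name__ == "__main__":
    worst = relative_value(perf_drop_pct=25, price_drop_pct=30)
    best = relative_value(perf_drop_pct=20, price_drop_pct=35)
    print(f"Performance per dollar vs. the full card: {worst:.2f}x to {best:.2f}x")
```

Even at the unfavorable end of both ranges, the trimmed card delivers roughly 7 percent more performance per dollar, rising to about 23 percent at the favorable end.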
Current indications are the RTX 2070 will ship in late September, along with custom RTX 2080 cards. The current GTX 1070 Ti has 2,432 CUDA cores while the GTX 1070 has 1,920 CUDA cores. Using a harvested variant of the 3,072 CUDA core 'small Turing' die would still be a large jump in potential performance, never mind the other things like ray-tracing, Tensor cores, etc. Architectural enhancements would allow Nvidia to keep core counts similar to the 1070 Ti while still delivering substantially better performance. I'm not sure an RTX 2070 with 2,560 CUDA cores and a 256-bit memory interface can surpass the performance of the GTX 1080 Ti in current games, but anything that uses the RT cores for ray-tracing will be an easy win.
In terms of memory, the RTX 2070 will also use GDDR6, as that means the same board design can serve both the 2070 and 2080. Slightly lower clockspeeds for the RAM are also typical, or perhaps Nvidia will disable one of the memory interfaces (e.g., 12GB of GDDR6 with a 192-bit interface). The new parts shouldn't have less memory than the current generation, so either 12GB or 16GB is likely. Expect initial prices to be in the $450-$500 range.
Will there be an RTX 2080 Ti or RTX Titan card?
Almost certainly, though whether it launches sooner or later remains unknown. The Titan V exists, but it's in a different category and lacks the RT cores. Titan V will remain an option for machine learning and artificial intelligence, but a new RTX Titan should eventually displace it with similar performance plus improved ray-tracing performance, and maybe even a lower price.
Nvidia has already discussed the Quadro RTX 6000 and 8000, which use a massive die—it's nearly as large as the GV100. Building an RTX Titan that has similar specs to the Quadro RTX 6000 but without a few of the Quadro features (like professional drivers for CAD/CAM applications) makes sense. Having an RTX Titan with 24GB GDDR6 priced at $1,499 that outperforms the Titan V will make the extreme pricing seem almost justifiable. And with NVLink, SLI support can still happen, something the Titan V lacks.
The Quadro RTX cards aren't slated to begin shipping until Q4 2018, which admittedly is only six weeks away at this point. Keeping a new Titan in the wings for a spring refresh would be in line with Nvidia's past behavior, and there's not much room to go bigger than the 4,608 core Turing design without a shrink to 10nm or even 7nm. The 1080 and 1070 have been around for over two years, and it will probably be more than two years before the 20-series parts get replaced, so Nvidia needs to pace its rollout of new hardware.
Could an RTX 2080 Ti or RTX Titan launch in 2018? Yes. And the naming, RT cores, and other surprise elements of the upcoming 20-series parts prove that Nvidia is willing to diverge from past behavior. If and when the new extreme performance RTX 2080 Ti and Titan cards become available, I expect them to cost quite a bit more than the existing GTX 1080 Ti and Titan Xp.