Nvidia RTX 40-series to deliver 'up to 2–3 times increase in ray tracing' performance

Nvidia Lovelace GPU
(Image credit: Nvidia)

Nvidia has announced a massive bump in ray tracing performance with its next-gen Ada Lovelace graphics cards, and it's not all derived from the huge number of CUDA Cores stuffed into the new GPU (though it will have up to 18,000 of those).

One important part of that speed-up is a new technology that Nvidia's CEO Jensen Huang calls Shader Execution Reordering (SER). This "reschedules work on the fly", giving up to a 2–3x speed-up for ray tracing on Ada Lovelace cards. Huang says it is as significant an engineering development as out-of-order execution was for CPUs, which is a feature of pretty much every CPU today.

Then you have the obvious improvement: a new RT Core. The RT Core is the main driver of ray tracing performance in Nvidia's RTX graphics cards, and it had already seen one major overhaul with the RTX 30-series. With Ada Lovelace it will be bigger and better again, offering 200 RT TFLOPS performance and two times the ray-triangle intersection throughput. 

That's partially down to two new hardware units in the RT Core: a new opacity micromap engine, which "speeds up ray tracing of alpha test geometry by a factor of two times", and a new micromesh engine, which "increases geometric richness without the BVH build and storage cost."

Then there's the new Tensor Core. Tensor Cores accelerate the instructions used for machine learning and similar workloads, and this new 4th Gen Tensor Core offers 1,400 Tensor TFLOPS and the FP8 Transformer Engine, taken straight from the Hopper architecture Nvidia previously announced for data centres.

Taken together, these changes amount to a big shift in ray tracing performance. Huang explains that the boost was necessary because "ray tracing is notoriously hard to parallelize." In essence, ray tracing requires access to lots of different things on the GPU, at different times.

That's where SER comes in, as this block improves efficiency in ray tracing "by rescheduling shading workloads on the fly to better utilise the GPU resources".
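To get a feel for why reordering helps, here is a toy CPU-side sketch in Python. It is not Nvidia's implementation, and the details of how SER works in hardware are not covered here. It assumes that threads run in lockstep groups of 32 (a warp), and that a warp has to make one pass for every distinct shader its threads need. The function names `divergent_passes` and `reorder_by_shader`, and the figure of eight materials, are made up for illustration.

```python
import random

WARP_SIZE = 32  # assumed lockstep group size for this toy model


def divergent_passes(shader_ids, warp_size=WARP_SIZE):
    """Count shader passes if each warp runs every distinct shader in it once."""
    total = 0
    for start in range(0, len(shader_ids), warp_size):
        warp = shader_ids[start:start + warp_size]
        total += len(set(warp))  # one pass per distinct shader in this warp
    return total


def reorder_by_shader(shader_ids):
    """Group hits needing the same shader together (a stand-in for reordering)."""
    return sorted(shader_ids)


random.seed(0)
# Hypothetical workload: 1,024 ray hits scattered across 8 materials.
hits = [random.randrange(8) for _ in range(1024)]

before = divergent_passes(hits)
after = divergent_passes(reorder_by_shader(hits))
print(f"passes before reordering: {before}, after: {after}")
```

In this model, scattered hits force almost every warp to step through most of the eight shaders, while the sorted hits leave nearly every warp running just one. Real hardware has to weigh that gain against the cost of the reordering itself, which this sketch ignores.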

"We're seeing up to two to three times increase in ray tracing and 25% in overall game performance," Huang continues.

The other part of the ray tracing puzzle, though not directly related to the GPU's hardware per se, is the release of DLSS 3.0, which Nvidia expects to once again help push frame rates higher with RTX features enabled, as DLSS has done before. Of course, we have more options than DLSS for super resolution nowadays, but it has proven a useful and adept option, so it's good to see more improvements are on the way.

And of course, what use is information on the new Ada Lovelace architecture without word of the actual graphics cards you can buy built around it? First up is the RTX 4090, arriving on October 12 for $1,599. There are also two models of RTX 4080 on the way, a 12GB model and a 16GB model, starting from $899.

Jacob Ridley
Senior Hardware Editor
