In a bid to rebrand every bit of its graphics card architecture, Xe, Intel has killed off the humble Execution Unit. The EU is the fundamental building block of Intel's graphics architectures, but after years of use, Intel is putting the term out to pasture to make way for a new unit, the Xe-core.
The change in name comes as Intel says there's too much inside a single EU to really classify it as an EU nowadays.
This doesn't totally change what an EU/Xe-core actually does, though. An Intel EU currently contains multiple ALUs for floating point and integer operations, and the same goes for an Xe-core. However, the new Xe-core nomenclature does play into the significant shake-up arriving with Intel's first discrete gaming graphics card generation, codenamed Intel Alchemist.
The new Xe-core within Alchemist GPUs comes with 16 vector engines and 16 matrix engines. That's actually double what's found within the Ponte Vecchio GPU, which is built with the Xe-HPC architecture and destined for the Aurora supercomputer.
Intel says that this is a necessary step to scaling up gaming graphics cards, and that this chip was specially optimised for "gaming first".
Zoom out a touch from the Xe-core and you'll find a Render Slice, which contains four of these Xe-cores, a fixed function unit for DirectX 12 Ultimate support, and four new ray tracing acceleration units. A set of eight Render Slices will then share access to L2 cache.
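To put those numbers together, here is a rough back-of-the-envelope sketch of what a full Alchemist configuration would add up to. It uses only the figures quoted above and assumes all eight Render Slices are populated; the class and function names are made up for illustration and don't correspond to anything in Intel's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XeCore:
    """Hypothetical model of one Xe-core (names are illustrative only)."""
    vector_engines: int
    matrix_engines: int


@dataclass(frozen=True)
class RenderSlice:
    """Hypothetical model of one Render Slice."""
    xe_cores: int
    ray_tracing_units: int
    core: XeCore


# Figures as quoted in the article for Alchemist (Xe-HPG).
ALCHEMIST_CORE = XeCore(vector_engines=16, matrix_engines=16)
ALCHEMIST_SLICE = RenderSlice(xe_cores=4, ray_tracing_units=4, core=ALCHEMIST_CORE)
SLICES_SHARING_L2 = 8  # assumption: every slice in the set is enabled


def totals(render_slice: RenderSlice, slice_count: int) -> dict:
    """Multiply the per-slice figures out across a whole GPU."""
    xe_cores = render_slice.xe_cores * slice_count
    return {
        "xe_cores": xe_cores,
        "vector_engines": xe_cores * render_slice.core.vector_engines,
        "matrix_engines": xe_cores * render_slice.core.matrix_engines,
        "ray_tracing_units": render_slice.ray_tracing_units * slice_count,
    }


if __name__ == "__main__":
    for name, count in totals(ALCHEMIST_SLICE, SLICES_SHARING_L2).items():
        print(f"{name}: {count}")
```

Under those assumptions, eight slices work out to 32 Xe-cores, 512 vector engines, 512 matrix engines, and 32 ray tracing units. Actual cards may ship with fewer slices enabled.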
These new Intel Ray Tracing Units are designed to accelerate ray traversal, bounding box intersection, and triangle intersection. This sounds a lot like the RT Cores inside Nvidia's Ampere GPUs and the Ray Accelerators AMD has inside its RDNA 2 architecture, though we don't yet know how they stack up against each other in raw performance terms.
Again, that overall make-up is different to Ponte Vecchio on the Xe-HPC architecture, which has fewer engines per Xe-core but a greater number of Xe-cores per Render Slice. There are 16 Xe-cores in every HPC slice, with just as many ray tracing units. A set of four Render Slices then shares access to L2 cache, HBM2e controllers, and a media engine, and so on and so forth.
There are many levels to Ponte Vecchio that I dare not cover here.
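For a sense of how the two layouts compare at the slice level, here's a quick sketch of the arithmetic. The eight-engines-per-core figure for Ponte Vecchio isn't quoted directly; it's inferred from the statement that Alchemist's 16 is double what Ponte Vecchio has, so treat it as an assumption. The function and variable names are purely illustrative.

```python
def engines_per_slice(xe_cores_per_slice: int, engines_per_core: int) -> int:
    """Vector (or matrix) engines in one Render Slice."""
    return xe_cores_per_slice * engines_per_core


# Alchemist (Xe-HPG): four Xe-cores per slice, 16 engines of each type per core.
ALCHEMIST = {"xe_cores_per_slice": 4, "engines_per_core": 16}

# Ponte Vecchio (Xe-HPC): 16 Xe-cores per slice. The per-core engine count is
# an assumption, derived from Alchemist's figure being "double" it.
PONTE_VECCHIO = {"xe_cores_per_slice": 16, "engines_per_core": 16 // 2}

alchemist_slice = engines_per_slice(**ALCHEMIST)
hpc_slice = engines_per_slice(**PONTE_VECCHIO)

if __name__ == "__main__":
    print(f"Alchemist engines per slice:     {alchemist_slice}")
    print(f"Ponte Vecchio engines per slice: {hpc_slice}")
    print(f"HPC slice vs gaming slice:       {hpc_slice / alchemist_slice:.1f}x")
```

On those numbers, an HPC slice carries 128 engines of each type against 64 in a gaming slice, so the leaner Xe-core is more than made up for by packing four times as many of them into each slice.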
Alchemist will be built on TSMC's N6 process, while Ponte Vecchio will use a mix of TSMC's N5, N7, and Intel 7 process nodes, depending on which tile you look at on the Ponte Vecchio package. Essentially, these are altogether very different chips.
What we're seeing, though, is proof of Intel's promise to tweak the Xe architecture depending on how each chip is intended to be used.
The gaming architecture is substantially different to the data centre one, and both are estranged from the lower-spec Xe-LP chips in today's laptops. This sort of segmentation isn't an entirely new concept, as both Nvidia's and AMD's data centre cards look quite different to their gaming ones, but it's still exciting to see how another company's engineers are tackling GPU design and usage in a new fashion.
Intel also confirms that beyond even the configuration of an Xe-core or Render Slice, it has worked on optimisations and methodology to increase the power efficiency of its upcoming graphics cards—these include tweaks to the architecture itself, memory, and the card's physical design.
Considering all the optimisations Intel has made and the disparities in process nodes, the Xe-HPG architecture reportedly offers roughly 1.5x the performance/watt and runs at roughly 1.5x the frequency of Intel's Xe-LP chips.
While it felt like we had a fairly good grip on what to expect from DG2 even prior to Intel's Architecture Day, in terms of core counts and the specifics of the Xe-HPG architecture, perhaps we didn't have quite as good a grip on it as we suspected. After all, looking at what we know now, we were comparing apples to oranges between Intel Xe microarchitectures.
Intel's Alchemist graphics cards are coming early next year, and sometime before then we're likely to hear more about the specifications for the actual cards themselves and how they perform versus the competition. That's when we'll find out if all this tinkering with the Xe architecture really translates into frame rates.