If you've been hankering after a graphics card upgrade lately but wanted to see what NVIDIA's response to AMD's new Radeons is, wait no more. The green team has taken the wraps off of their GeForce GTX 680, the company's latest flagship card which replaces the GTX 580 and formally introduces a new chip architecture – called Kepler – to the world.
This is decision time. Over the last three months AMD has pushed out its line-up of Radeon HD 7000 cards, built on a design that's both faster and less power hungry than before. Now we have NVIDIA's response. Which will you buy?
Just as AMD did with its Graphics Core Next (GCN) architecture, NVIDIA has radically overhauled its base design for Kepler. It's not quite as big a break from the past as AMD's: Kepler keeps a graphics pipeline broadly recognisable from its predecessor, Fermi, while GCN was re-engineered from the ground up. The changes are still major, though, and need some explanation.
Here are the key things to note about GTX 680:
And here are the before and after shots, comparing GTX 680 with its predecessor.
The most immediately noticeable change is that the stream shaders – the simple execution cores which do the parallel processing for pixel, vertex, geometry, physics and general compute calculations – no longer run at twice the base clock speed of the GPU. That alone reduces power consumption across the whole card, even though the core clock has been raised to a massive 1.006GHz.
NVIDIA can afford the speed hit because, while the GTX 580 had 512 cores running at around 1.5GHz, the GTX 680 has an enormous 1,536 cores turning over at 1.006GHz. At its most basic level, then, the GTX 680 is capable of roughly twice as many calculations per second as the GTX 580.
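That "twice as many" figure is easy to sanity-check with back-of-the-envelope maths: multiply core count by shader clock. This is only a rough proxy (real throughput depends on how many operations each core retires per clock), but the ratio comes out where you'd expect:

```python
# Back-of-the-envelope shader throughput comparison using the figures above.
# "core-GHz" is a rough proxy, not a real FLOPS measurement.
cards = {
    "GTX 580": {"cores": 512, "shader_clock_ghz": 1.5},    # shaders at 2x base clock
    "GTX 680": {"cores": 1536, "shader_clock_ghz": 1.006}, # shaders at base clock
}

throughput = {
    name: spec["cores"] * spec["shader_clock_ghz"]
    for name, spec in cards.items()
}

ratio = throughput["GTX 680"] / throughput["GTX 580"]
print(f"GTX 680 / GTX 580: {ratio:.2f}x")  # roughly 2x
```

Note that per clock cycle the GTX 680 actually does three times the work; it's the lower shader clock that brings the per-second figure back down to around double.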
That, of course, doesn't tell the whole story. The reason NVIDIA has managed to squeeze so many cores onto one die is that Kepler is the firm's first chip produced on a smaller 28nm process. Despite tripling the number of cores, the physical die is roughly two thirds the size of Fermi's, and has just 500 million more transistors (3.5 billion compared to 3 billion).
By grouping Kepler's cores into massive arrays of 192 processors which share common scheduling and memory structures, NVIDIA has kept the amount of control logic inside the chip from ballooning. In fact, the number of 'Polymorph Engine' units, which handle operations like tessellation and vertex fetch, has actually halved – from 16 in the GTX 580 to eight in the GTX 680.
It also has a narrower memory bus – down from 384 bits to 256 bits. Theoretical memory bandwidth remains about the same, however, thanks to the use of 6GHz-effective GDDR5 RAM.
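The arithmetic behind that works out neatly: bandwidth is bus width (in bytes) multiplied by the effective data rate. A quick sketch, assuming the GTX 580's commonly quoted 4GHz-effective GDDR5 (that figure is mine, not the article's):

```python
def bandwidth_gb_s(bus_width_bits, effective_clock_ghz):
    """Theoretical memory bandwidth: bytes per transfer x transfers per second."""
    return (bus_width_bits / 8) * effective_clock_ghz

gtx_580 = bandwidth_gb_s(384, 4.0)  # 384-bit bus, ~4GHz-effective GDDR5 (assumed)
gtx_680 = bandwidth_gb_s(256, 6.0)  # 256-bit bus, 6GHz-effective GDDR5

print(gtx_580, gtx_680)  # both come out at 192.0 GB/s
```

The narrower bus saves die area and routing complexity, and the faster memory claws back exactly what was lost.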
The marketing spiel makes much of the card's efficiency, claiming 'performance per watt' was one of the leading design goals. On the evidence so far, that goal has largely been met. I've not had a card to test, but colleagues who have say it outperforms the 250W Radeon HD 7970 while drawing less than 195W at maximum.
On paper, these big 192-core groupings (Fermi arranged its shader processors in much smaller blocks of 32) look like they could make it harder to keep every core busy all of the time, since there's less scheduling flexibility for simpler workloads. NVIDIA's answer is a clever trick designed to keep the chip running as close to flat out as possible: an automatic overclocking mode, known as GPU Boost, that's similar in spirit to Intel's Turbo Boost for CPUs. Whenever the chip is below its maximum power draw and thermal tolerance, it accelerates in 13MHz increments until it hits one of those limits. So if a game is only keeping half the shaders busy (as in a Battlefield 3 demo I saw), the chip clocks up to use the spare power headroom.
Out of the box, the amount of overclocking is conservative: by default, the chip will only boost up to 1.058GHz, an increase of around 5%, which isn't going to have a massive effect on framerates. All of the variables that control GPU Boost – maximum power draw, base speed, boost speed – can be altered with driver tools, though, so you can tune your card for better performance.
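Those numbers line up tidily: the gap between the 1006MHz base clock and the 1058MHz default boost ceiling is exactly four 13MHz steps. A toy sketch of the stepping (clock figures as published; the real GPU Boost also weighs live power draw and temperature at every step):

```python
BASE_MHZ = 1006       # GTX 680 base clock
BOOST_CAP_MHZ = 1058  # default boost ceiling, ~5% over base
STEP_MHZ = 13         # GPU Boost increment

# Step the clock up until the next increment would breach the cap.
clock = BASE_MHZ
steps = []
while clock + STEP_MHZ <= BOOST_CAP_MHZ:
    clock += STEP_MHZ
    steps.append(clock)

print(steps)  # [1019, 1032, 1045, 1058] - four 13MHz steps reach the cap
```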
At the other end of the scale, the card will also downclock itself when idle, although it can't completely shut down areas of the chip to save power.
In terms of actually playing games, even the last generation of top-end hardware was considerably more advanced than the game engines running on it – performance is good, but hard to get excited about at this price. To make the GTX 680 more interesting, then, there's the power efficiency, and NVIDIA is also introducing an experimental new anti-aliasing mode called Temporal Anti-Aliasing, or TXAA. This uses dithering to change the colour of a pixel from frame to frame and thus create a blended edge, rather than the interpolated colours of the more traditional MSAA and FXAA modes.
More intriguing is the new Adaptive Vsync. Enable this in the drivers, and Vsync will turn on when the framerate is higher than the monitor can display, thus preventing tearing, but turn it off when the framerate drops below the monitor's speed. This, NVIDIA argues, prevents frame stuttering as images go out of sync for a split second.
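The decision rule behind Adaptive Vsync, as described, is simple enough to sketch. This is a minimal illustration of the logic, not NVIDIA's driver code:

```python
def adaptive_vsync(fps, refresh_hz):
    """Adaptive Vsync as the article describes it: sync when the GPU
    outpaces the monitor (prevents tearing), release the sync when it
    can't keep up (avoids the stutter of snapping down to half rate).
    Returns True when vsync should be on."""
    return fps >= refresh_hz

print(adaptive_vsync(90, 60))  # True: cap output to 60Hz, no tearing
print(adaptive_vsync(48, 60))  # False: render unsynced rather than drop to 30fps
```

The stutter it avoids comes from conventional vsync's behaviour: if a frame misses the refresh deadline, the display waits for the next one, so a 55fps game can suddenly feel like 30fps.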
NVIDIA says that today is a hard launch for the GTX 680, which means you should be able to go out and buy one straight away if you wish. I've not had a chance to review Kepler yet, but at the MSRP of £429/$499 it looks like it should have the edge over similarly priced cards from AMD.
What I'm really waiting for, however, is Kepler to hit lower prices. I'd hold off an upgrade until we see what the GTX 660 brings, personally, as that could well be the card to get.