Tale of the tape, analysis
We fully understand that the architecture is a lot to take in. We've thus far not divulged too much about the frequencies, which are what really make a card, so we've lined up the two best single-GPU cards from each of NVIDIA and AMD. We'll refer to the architecture when discussing the figures.
GPU | GeForce GTX 680 (2,048MB) | GeForce GTX 580 (1,536MB) | Radeon HD 7970 (3,072MB) | Radeon HD 7950 (3,072MB) |
---|---|---|---|---|
DX API | 11.1 | 11 | 11.1 | 11.1 |
Process | 28nm | 40nm | 28nm | 28nm |
Transistors | 3.54bn | 3.0bn | 4.3bn | 4.3bn |
Die Size | 294mm² | 520mm² | 352mm² | 352mm² |
Processors | 1,536 | 512 | 2,048 | 1,792 |
Texture Units | 128 | 64 | 128 | 112 |
ROP Units | 32 | 48 | 32 | 32 |
GPU Clock (MHz) | 1,006+* | 772 | 925 | 800 |
Shader Clock (MHz) | 1,006+* | 1,544 | 925 | 800 |
GFLOPS | 3,090 | 1,581 | 3,789 | 2,867 |
Memory Clock (MHz) | 6,008 | 4,008 | 5,500 | 5,000 |
Memory Bus (bits) | 256 | 384 | 384 | 384 |
Max bandwidth (GB/s) | 192.3 | 192.4 | 264 | 240 |
Power Connectors | 6+6 | 8+6 | 8+6 | 6+6 |
TDP (watts) | 195 | 244 | 250 | 200 |
GFLOPS per watt | 15.84 | 6.48 | 15.15 | 14.34 |
Release MSRP | $499 | $499 | $549 | $449 |
All but the GTX 580 support the very latest, incremental bump to Microsoft's DX11 API, and while it looks good on the spec sheet, there's little real-world benefit to be derived from it.
Remember how we said that NVIDIA has purposely opted for more transistors operating at a lower speed, ostensibly to reduce board power? It's therefore impressive that GeForce GTX 680, aka Kepler GK104, has only 3.54bn transistors, an 18 per cent increase over the GTX 580's 3.0bn Fermi chip. NVIDIA has clearly reworked the chip in other areas that it's not disclosing to the press, for we'd have guesstimated a 4bn-plus chip based on the architecture analysis.
28nm is the saviour
And, really, GTX 680 is only made viable by a 28nm process. It is the only way to manufacture a 3bn-plus transistor chip on a die that's economically feasible, even at the high end of the graphics-card market. In the main, transistors are transistors, and it's a testament to the die-reducing properties of 28nm fabrication that GTX 680 has a die which is 43 per cent smaller than GTX 580's, albeit with 18 per cent more transistors. GeForce GTX 680, quite simply, needed manufacturing partner TSMC to get its newest process working properly, perhaps explaining Kepler's slight delay from previous roadmaps.
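For readers who want to check the shrink for themselves, here is a minimal sketch that works out the die-size and transistor-count changes, plus transistor density, from the table's figures. The helper names (`density`, `percent_change`) are ours and purely illustrative; only the input numbers come from the spec sheet.

```python
# Figures are copied from the spec table in this article.
# Transistor counts are in billions, die sizes in square millimetres.
CHIPS = {
    "GeForce GTX 680": {"transistors_bn": 3.54, "die_mm2": 294},
    "GeForce GTX 580": {"transistors_bn": 3.0, "die_mm2": 520},
}


def density(chip):
    """Millions of transistors per square millimetre of die (hypothetical helper)."""
    return chip["transistors_bn"] * 1000 / chip["die_mm2"]


def percent_change(new, old):
    """Signed percentage change going from old to new (hypothetical helper)."""
    return (new - old) / old * 100


gtx680 = CHIPS["GeForce GTX 680"]
gtx580 = CHIPS["GeForce GTX 580"]

die_change = percent_change(gtx680["die_mm2"], gtx580["die_mm2"])
transistor_change = percent_change(
    gtx680["transistors_bn"], gtx580["transistors_bn"]
)

print(f"Die area change: {die_change:.0f}%")
print(f"Transistor count change: {transistor_change:.0f}%")
for name, chip in CHIPS.items():
    print(f"{name}: {density(chip):.1f}M transistors per mm²")
```

On these numbers the 28nm part packs roughly twice as many transistors into each square millimetre as its 40nm predecessor, which is the whole point of the shrink.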
Recall that GTX 680's 3x core count doesn't mean it's 3x faster than the GTX 580, because the cores now operate at the same speed as the general GPU clock rather than at a separate, doubled shader clock. This frequency is 1,006MHz. Think about this for a second and the GTX 680 begins to take shape. Its much-improved PolyMorph engines, though reduced in number, are actually more potent than GTX 580's because they're clocked higher. Likewise, the 2x texture units are also bolstered by a higher frequency.
GTX 680's 3x cores push out roughly 2x the GTX 580's GFLOPS. Their numerical advantage is compromised by a lower shader frequency (1,006MHz against 1,544MHz), but 3TFLOPS of single-precision throughput is hardly anything to sniff at. NVIDIA wouldn't confirm the double-precision rate, but we reckon it'll remain at Fermi's 1/8th of SP.
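The GFLOPS row can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes each core retires two single-precision operations per clock (one fused multiply-add); that factor is our assumption rather than something stated on the spec sheet, though it does reproduce the table's figures. The function name `peak_gflops` is hypothetical.

```python
def peak_gflops(cores, shader_mhz, flops_per_clock=2):
    """Theoretical single-precision peak in GFLOPS.

    flops_per_clock=2 assumes one fused multiply-add per core per clock,
    which is an assumption on our part, not a figure from the spec sheet.
    """
    return cores * flops_per_clock * shader_mhz / 1000


# Core counts and shader clocks are taken from the spec table.
gtx680 = peak_gflops(cores=1536, shader_mhz=1006)
gtx580 = peak_gflops(cores=512, shader_mhz=1544)

print(f"GTX 680: {gtx680:.0f} GFLOPS")
print(f"GTX 580: {gtx580:.0f} GFLOPS")
print(f"Core-count ratio: {1536 / 512:.1f}x")
print(f"Throughput ratio: {gtx680 / gtx580:.2f}x")
```

Three times the cores at roughly two-thirds of the shader clock lands just shy of double the throughput, which is why the core count alone flatters the newer card.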
As described on the previous page, 32 ROPs and a 256-bit memory bus don't seem like adequate bedfellows for the much-improved GPC units - heck, the GTX 580 has a 384-bit bus and 48 ROPs. Keeping both costs and power in check, NVIDIA uses 6GHz-clocked GDDR5, meaning the increased frequency over GTX 580 cancels out the bus-width deficit: 192.3GB/s plays 192.4GB/s. NVIDIA also knows that a potential 192.3GB/s of bandwidth is, well, rather low for a range-topping card, especially as competitor AMD uses a 384-bit bus on its HD 7900-series cards, but NVIDIA, in a roundabout way, reckons it is how you use the bandwidth that matters most.
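The bandwidth trade-off is simple arithmetic: effective memory clock multiplied by bus width, divided by eight to turn bits into bytes. Here is a minimal sketch using the table's clocks and bus widths; the function name `bandwidth_gbs` is ours.

```python
def bandwidth_gbs(effective_mhz, bus_bits):
    """Peak memory bandwidth in GB/s (hypothetical helper).

    effective_mhz is the effective (data-rate) memory clock from the table;
    dividing by 8 converts bits to bytes, and by 1000 converts MB/s to GB/s.
    """
    return effective_mhz * bus_bits / 8 / 1000


# (effective memory clock in MHz, bus width in bits), from the spec table.
CARDS = {
    "GeForce GTX 680": (6008, 256),
    "GeForce GTX 580": (4008, 384),
    "Radeon HD 7970": (5500, 384),
    "Radeon HD 7950": (5000, 384),
}

results = {name: bandwidth_gbs(mhz, bits) for name, (mhz, bits) in CARDS.items()}

for name, gbs in results.items():
    print(f"{name}: {gbs:.1f} GB/s")
```

A 50 per cent faster memory clock on a bus that is a third narrower leaves the GTX 680 all but level with the GTX 580, while both Radeons pull well clear thanks to their wider buses.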
Sub-200W TDP
What's undeniably impressive is the sub-200W TDP on all shipping GTX 680s, helped by the 28nm process and general energy-saving improvements that are made with each iteration of GPU, especially with the readjustment of cores and frequencies. NVIDIA actually beats AMD in the theoretical GFLOPS-per-watt metric. A low-ish TDP - well, for a top-end card, at least - also means the reference board ships with two 6-pin connectors, down from the 8+6 as used on the GTX 580 and Radeon HD 7970.
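The GFLOPS-per-watt comparison with AMD is nothing more than theoretical throughput divided by TDP. A rough sketch, using the table's GFLOPS and TDP rows (the variable names are ours, and the last decimal place may differ from the table depending on how you round):

```python
# (theoretical GFLOPS, TDP in watts), both copied from the spec table.
CARDS = {
    "GeForce GTX 680": (3090, 195),
    "Radeon HD 7970": (3789, 250),
    "Radeon HD 7950": (2867, 200),
}

# Theoretical efficiency: peak single-precision GFLOPS per watt of TDP.
per_watt = {name: gflops / tdp for name, (gflops, tdp) in CARDS.items()}

# Sort from most to least efficient for readability.
ranking = sorted(per_watt, key=per_watt.get, reverse=True)

for name in ranking:
    print(f"{name}: {per_watt[name]:.2f} GFLOPS per watt")
```

Bear in mind that this is a paper exercise: TDP is a design ceiling rather than measured power draw, and peak GFLOPS is rarely hit in games.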
The GTX 680 is due for release at $499 (£399, hopefully), thereby matching the Radeon HD 7970 (for the time being), so the proof of this particular GPU pudding is in the benchmarking.
Though we've trodden the murky waters of architecture analysis and GPU frequencies, there's one wrinkle that remains. The GTX 680's 1,006MHz core speed is the base clock, and you'll know the 1,536 shaders also run at this speed. But it's not quite as simple as that, folks, because NVIDIA brings some new mojo to the table with something rather cool called GPU Boost. Flick on over to see what it's all about.