Kepler... on a diet
Hands up if you want a GeForce GTX 680? Hands up if you can afford the £400 asking price? Thought so. NVIDIA's best gaming silicon has been out since March this year and the firm has milked the full-fat Kepler GK104 die for all it's worth... and then some. The recent introduction of the GeForce GTX 660 Ti - the third-best single-GPU Kepler card - reinforces NVIDIA's desire to make use of practically every wafer out of the fab.
Yet there inevitably comes a time when the premium silicon has to be cut to form a smaller, more-cost-effective die; there's only so much castrating one can do on a large die before subsequent snips turn it into an uneconomic venture. Bringing this Kepler goodness down to lower price points is the job of the GK106 silicon whose first retail interpretation is to be the card in for review today, the GeForce GTX 660.
GK106, GK104, gee whizz, what does it all mean? Let's line up the four premium GPUs from NVIDIA's GeForce GTX 600-series line and explain.
GPU | GeForce GTX 680 (2,048MB) | GeForce GTX 670 (2,048MB) | GeForce GTX 660 Ti (2,048MB) | GeForce GTX 660 (2,048MB) |
---|---|---|---|---|
Die codename | Kepler GK104 | Kepler GK104 | Kepler GK104 | Kepler GK106 |
DX API | 11.1 | 11.1 | 11.1 | 11.1 |
Process | 28nm | 28nm | 28nm | 28nm |
Transistors | 3.54bn | 3.54bn | 3.54bn | 2.54bn |
Die Size | 294mm² | 294mm² | 294mm² | 221mm² |
SMX units | 8 | 7 | 7 | 5 |
Processors | 1,536 | 1,344 | 1,344 | 960 |
Texture Units | 128 | 112 | 112 | 80 |
ROP Units | 32 | 32 | 24 | 24 |
GPU Clock (MHz) | 1,006 (1,058) | 915 (980) | 915 (980) | 980 (1,033) |
Shader Clock (MHz) | 1,006 (1,058) | 915 (980) | 915 (980) | 980 (1,033) |
GFLOPS | 3,090 | 2,459 | 2,459 | 1,882 |
Memory Clock (MHz) | 6,008 | 6,008 | 6,008 | 6,008 |
Memory Bus (bits) | 256 | 256 | 192 | 192 |
Max bandwidth (GB/s) | 192.3 | 192.3 | 144.2 | 144.2 |
Power Connectors | 6+6 | 6+6 | 6+6 | 6-pin |
TDP (watts) | 195 | 170 | 150 | 140 |
GFLOPS per watt | 15.84 | 14.46 | 16.39 | 13.44 |
Current MSRP | $499 | $399 | $299 | $199 |
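For the curious, the GFLOPS and GFLOPS-per-watt rows can be rebuilt from the shader counts, base clocks and TDPs in the table. The sketch below assumes each shader retires one fused multiply-add (two floating-point operations) per clock, which is the usual way such peak figures are quoted; the function and dictionary names are ours, purely for illustration.

```python
# Specs copied from the table: (shaders, base clock in MHz, TDP in watts).
CARDS = {
    "GTX 680": (1536, 1006, 195),
    "GTX 670": (1344, 915, 170),
    "GTX 660 Ti": (1344, 915, 150),
    "GTX 660": (960, 980, 140),
}


def peak_gflops(shaders, clock_mhz):
    """Peak single-precision throughput in GFLOPS.

    Assumption: one fused multiply-add (2 FLOPs) per shader per clock.
    """
    return shaders * 2 * clock_mhz / 1000.0


def gflops_per_watt(shaders, clock_mhz, tdp_watts):
    """Peak GFLOPS divided by the board's rated TDP."""
    return peak_gflops(shaders, clock_mhz) / tdp_watts


for name, (shaders, clock, tdp) in CARDS.items():
    print(f"{name}: {peak_gflops(shaders, clock):.0f} GFLOPS, "
          f"{gflops_per_watt(shaders, clock, tdp):.2f} GFLOPS/W")
```

The results should agree with the table to within rounding in the last digit. Note that these are base-clock figures, so real-world throughput with GPU Boost active will sit a little higher.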
Right-o, the new baby, GeForce GTX 660 (GK106 die), is shown on the far right. The three left-most GPUs are, you guessed it, the existing GK104 variants. Haphazardly interchanging retail and code-names, GK106 is a genuinely different piece of silicon from the one used by its GTX 600 brethren. Imbued with 2.54bn transistors on a 221mm² die, it is around 25 per cent smaller than GK104 by area. This is good news for NVIDIA and you, the gamer, because it enables cheaper, power-efficient cards to be built.
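The "around 25 per cent" figure is easy to sanity-check from the die sizes and transistor counts in the table. A minimal sketch, with a helper name of our own choosing:

```python
# Die figures copied from the spec table.
GK104 = {"transistors_bn": 3.54, "die_mm2": 294}
GK106 = {"transistors_bn": 2.54, "die_mm2": 221}


def percent_smaller(big, small):
    """How much smaller `small` is than `big`, as a percentage of `big`."""
    return (1 - small / big) * 100


area_cut = percent_smaller(GK104["die_mm2"], GK106["die_mm2"])
transistor_cut = percent_smaller(GK104["transistors_bn"], GK106["transistors_bn"])

print(f"Die area: {area_cut:.1f}% smaller")
print(f"Transistor count: {transistor_cut:.1f}% lower")
```

The transistor count falls by slightly more than the area does, which is what you would expect if the parts of the chip that are kept do not shrink in proportion.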
Understand, however, that while the underlying architecture between the two is practically the same, the GK106, which has a smaller die, has to lose out somewhere. That loss is encountered on what we call the 'top end' of the GPU, where the shaders, texture units and tessellation units are located. GK106 drops its SMX complement to five, from a possible eight, and consequently has 960 shaders and 80 texture units, compared with GTX 660 Ti's 1,344 and 112.
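The shader and texture-unit counts follow directly from the SMX count. The per-SMX figures below are not quoted by NVIDIA in this article; we infer them from the GTX 680 column of the table (1,536 shaders and 128 texture units across eight SMX units), so treat them as a derived assumption.

```python
# Inferred from the GTX 680 column: 8 SMX units, 1,536 shaders, 128 texture units.
SHADERS_PER_SMX = 1536 // 8        # 192
TEXTURE_UNITS_PER_SMX = 128 // 8   # 16


def top_end(smx_count):
    """Return (shaders, texture units) for a given number of SMX units."""
    return smx_count * SHADERS_PER_SMX, smx_count * TEXTURE_UNITS_PER_SMX


for card, smx in [("GTX 680", 8), ("GTX 660 Ti", 7), ("GTX 660", 5)]:
    shaders, tmus = top_end(smx)
    print(f"{card}: {smx} SMX -> {shaders} shaders, {tmus} texture units")
```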
NVIDIA patently realises that such a chop ain't gonna play well if games require masses of shading and texturing. Ameliorating this shading/texturing shortfall somewhat, the green team clocks the GK106 in at 980MHz core, boosting to an average of 1,033MHz, though we suspect the actual GPU Boost will be somewhat higher.
While the top section of the GPU is stunted in relation to its bigger brother, the GTX 660 Ti, the back-end is, for all intents and purposes, a copy. Memory running at 6Gbps on a 192-bit bus produces, at full gas, 144.2GB/s of juicy bandwidth. 24 ROPs take care of image-quality work and 384KB of L2 cache keeps everything chugging along nicely. We reckon that GK106 is a better-balanced architecture than GK104, as found on the GTX 660 Ti - it has less of a back-end bottleneck, for starters.
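The bandwidth number is simply bus width multiplied by the effective memory data rate. A quick sketch, using the 6,008MHz effective clock from the table and a function name of our own:

```python
def max_bandwidth_gbs(bus_bits, effective_mhz):
    """Peak memory bandwidth in GB/s.

    bus_bits / 8 gives bytes moved per transfer; effective_mhz is the
    effective (data-rate) memory clock, i.e. millions of transfers per second.
    """
    return bus_bits / 8 * effective_mhz / 1000


print(f"192-bit: {max_bandwidth_gbs(192, 6008):.1f} GB/s")
print(f"256-bit: {max_bandwidth_gbs(256, 6008):.1f} GB/s")
```

The same formula shows why the 256-bit GTX 670 and GTX 680 enjoy a third more bandwidth at an identical memory clock.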
Smaller die, lower power, right? NVIDIA assigns GK106 a 140W TDP but goes further and states average in-game power is likely to be around 115W. This should lead to retail cards with minimal cooling and, perhaps, even passively-cooled examples from some of the more adventurous add-in card (AIC) partners. 140W, too, requires cards to be furnished with a single six-pin auxiliary power connector.
NVIDIA claims that GeForce GTX 660 is a full-implementation model of the GK106 die but it is strange to have an odd number of SMX units - five in this case - as the silicon-filling model. Let's take a look at the block diagram and see if we can suss it out.
Ah, NVIDIA is using two full GPC units - each with two SMX units - and then adding a half-GPC on top. The firm realises that having a third dual-SMX GPC increases die size and, most likely, doesn't increase performance in a linear fashion due to the mismatch between the top- and back-end of the GPU: we'll find out more in our upcoming benchmarks.
Looking right at the bottom, NVIDIA's flexible memory-controller implementation enables partners to use a wide range of memory sizes. The per-specification amount is 2GB, made possible on a 192-bit bus by using mixed-density modules, but there's no reason why partners can't equip boards with, say, 1.5GB or 3GB, which would mean using same-density/width chips.
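To see why 2GB needs mixed densities while 1.5GB and 3GB do not, it helps to think of the 192-bit bus as a handful of equal-width memory partitions. The sketch below assumes three 64-bit partitions, and the 512MB/512MB/1,024MB split shown for 2GB is one hypothetical arrangement that adds up, not a layout NVIDIA confirms here.

```python
PARTITIONS = 3  # assumption: 192-bit bus treated as three 64-bit partitions


def describe(per_partition_mb):
    """Return (total capacity in GB, True if every partition holds the same amount)."""
    if len(per_partition_mb) != PARTITIONS:
        raise ValueError(f"expected {PARTITIONS} partition sizes")
    total_gb = sum(per_partition_mb) / 1024
    uniform = len(set(per_partition_mb)) == 1
    return total_gb, uniform


# Same-density layouts: every partition carries an equal share.
print(describe([512, 512, 512]))
print(describe([1024, 1024, 1024]))

# Hypothetical mixed-density layout that reaches the 2GB specification.
print(describe([512, 512, 1024]))
```

Because 2,048MB does not divide evenly by three, at least one partition has to carry more memory than the others, which is where the mixed-density modules come in.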
Architected to perform well at high-quality settings at the ubiquitous 1,920x1,080 (1080p) resolution, the basic GeForce GTX 660 2GB cards will be made available for £179, rising to £200 for special-edition, overclocked designs. Casting an eye over to AMD's Radeons, this means NVIDIA's newest card has to do well against the rival Radeon HD 7850 and HD 7870.
On paper, then, NVIDIA seems to have struck an amenable compromise between the economics of die size and performance requirements for a mid-range card.