Rollin' the numbers
Let's roll out the table and make some sense of it all.
Graphics card | Radeon HD 5970 2,048MB | Radeon HD 6970 2,048MB | Radeon HD 6950 2,048MB | Radeon HD 6870 2,048MB | Radeon HD 5870 1,024MB | NVIDIA GeForce GTX 580 1,536MB | NVIDIA GeForce GTX 570 1,280MB | NVIDIA GeForce GTX 480 1,536MB | NVIDIA GeForce GTX 470 1,280MB |
---|---|---|---|---|---|---|---|---|---|
Transistors | 4.3bn | 2.64bn | 2.64bn | 1.7bn | 2.15bn | 3.0bn | 3.0bn | 3.0bn | 3.0bn |
Die size | 2 x 334mm² | 389mm² | 389mm² | 255mm² | 334mm² | 520mm² | 520mm² | 529mm² | 529mm² |
GPU | Cypress | Cayman | Cayman | Barts | Cypress | Fermi v2 | Fermi v2 | Fermi | Fermi |
General clock | 725MHz | 880MHz | 800MHz | 900MHz | 850MHz | 772MHz | 732MHz | 700MHz | 607MHz |
Shader clock | 725MHz | 880MHz | 800MHz | 900MHz | 850MHz | 1,544MHz | 1,464MHz | 1,401MHz | 1,215MHz |
Memory clock | 4,000MHz | 5,500MHz | 5,000MHz | 4,200MHz | 4,800MHz | 4,008MHz | 3,800MHz | 3,696MHz | 3,348MHz |
Memory interface | 256-bit x2, 2,048MB GDDR5 | 256-bit, 2,048MB GDDR5 | 256-bit, 2,048MB GDDR5 | 256-bit, 1,024MB GDDR5 | 256-bit, 1,024MB GDDR5 | 384-bit, 1,536MB GDDR5 | 320-bit, 1,280MB GDDR5 | 384-bit, 1,536MB GDDR5 | 320-bit, 1,280MB GDDR5 |
Memory bandwidth | 2 x 128GB/s | 176GB/s | 160GB/s | 134.4GB/s | 153.6GB/s | 192.4GB/s | 152GB/s | 177.4GB/s | 133.9GB/s |
Geometry | 1 | 2 | 2 | 1 | 1 | 4 | 4 | 4 | 4 |
DP speed | 1/5 | 1/4 | 1/4 | NA | 1/5 | 1/4 | 1/4 | 1/4 | 1/4 |
Shaders | 3,200 | 1,536 | 1,408 | 1,120 | 1,600 | 512 | 480 | 480 | 448 |
GFLOPS | 4,640 | 2,703 | 2,253 | 2,016 | 2,720 | 1,581 | 1,405 | 1,345 | 1,089 |
Texturing | 160ppc bilinear, 80ppc FP16 | 96ppc bilinear, 48ppc FP16 | 88ppc bilinear, 44ppc FP16 | 56ppc bilinear, 28ppc FP16 | 80ppc bilinear, 40ppc FP16 | 64ppc bilinear, 64ppc FP16 | 60ppc bilinear, 60ppc FP16 | 60ppc bilinear, 30ppc FP16 | 56ppc bilinear, 28ppc FP16 |
ROPs | 2 x 32 | 32 | 32 | 32 | 32 | 48 | 40 | 48 | 40 |
ROP rate (GPixel/s) | 46.4 | 28.2 | 25.6 | 28.8 | 27.2 | 37.1 | 29.3 | 33.6 | 24.28 |
GTexel/s INT8 | 116 | 84.48 | 70.4 | 50.4 | 68 | 49.4 | 43.9 | 42 | 33.99 |
GTexel/s FP16 | 58 | 42.24 | 35.2 | 25.2 | 34 | 49.4 | 43.9 | 21 | 17 |
Board power (TDP) | 294W | 250W | 200W | 151W | 188W | 244W | 219W | 250W | 215W |
HDMI | 1.3a | 1.4a | 1.4a | 1.4a | 1.3a | 1.4a | 1.4a | 1.4a | 1.4a |
Retail price | £425 | £299 | £225 | £190 | £190 | £389 | £259 | £300 | £180 |
That's a lot of numbers to digest, so we're going to bring back the high-level overview and explain what's going on.
There's little point in having an enhanced architecture if it's not accompanied by a healthy dollop of cores, texture-units and high frequencies. We want you to look at the Radeon HD 6970/50 GPUs in relation to the Barts-based HD 6870 and Cypress-hewn HD 5870.
Analysis - Radeon HD 6970
Going through Radeon HD 6970 first, the 880MHz core/shader clock is joined by the highest speed we've yet seen from GDDR5 memory - a stonking 5.5GHz. AMD says it's achieved this by overhauling the memory controller. Still run through a 256-bit interface, total bandwidth of 176GB/s is higher than HD 5870's 153.6GB/s and within 10 per cent of the GeForce GTX 580's 192.4GB/s. Geometry throughput doubles over Cypress and Barts, as two triangles can now be rasterised per clock cycle.
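The bandwidth figures in the table follow from the effective memory clock and the bus width. Here's a minimal sketch of that arithmetic; the function name is our own, and we assume the quoted memory clock is the effective GDDR5 data rate, which is how the table appears to use it.

```python
def memory_bandwidth_gbs(effective_clock_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: effective data rate times bus width in bytes.

    Assumes the quoted clock is the effective GDDR5 data rate in MHz.
    """
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9


# (effective memory clock in MHz, bus width in bits), taken from the table
cards = {
    "Radeon HD 6970": (5500, 256),
    "Radeon HD 5870": (4800, 256),
    "GeForce GTX 580": (4008, 384),
}

for name, (clock_mhz, width_bits) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(clock_mhz, width_bits):.1f} GB/s")
```

Run against the table's clocks, this reproduces the quoted bandwidth for all three cards, which shows how much the wider 384-bit bus helps the GTX 580 despite its slower memory.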
We've described how AMD has gone for a VLIW 4D arrangement for the cores. They're now grouped into 24 SIMD engines, up from 20 on HD 5870, but the 16x4 setup means that the total number of ALU cores falls, because HD 5870 runs with a 16x5 design. The math tells us that HD 6970 has 24x16x4 ALUs, or 1,536. It's pretty surprising to see a drop in shaders from one generation to the next.
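For anyone who wants to check the shader math, here's a small sketch. The function and its parameter names are ours, purely for illustration; the SIMD counts and VLIW widths come from the paragraph above.

```python
def alu_count(simd_engines: int, units_per_simd: int, vliw_width: int) -> int:
    """Total ALUs: SIMD engines x processing units per engine x ALUs per unit."""
    return simd_engines * units_per_simd * vliw_width


cayman_alus = alu_count(simd_engines=24, units_per_simd=16, vliw_width=4)   # HD 6970, VLIW4
cypress_alus = alu_count(simd_engines=20, units_per_simd=16, vliw_width=5)  # HD 5870, VLIW5

print(f"HD 6970: {cayman_alus} ALUs")
print(f"HD 5870: {cypress_alus} ALUs")
print(f"Difference: {cypress_alus - cayman_alus} fewer on HD 6970")
```

Four extra SIMD engines don't make up for losing the fifth ALU in each unit, hence the drop of 64 shaders.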
Fewer shaders also mean a lower pure GFLOPS rating, even with a raised core speed: 2,703 plays HD 5870's 2,720. That, too, is surprising, and one begins to wonder if HD 6970 is an improvement at all. However, the SIMD engines each hold four texture units, meaning that HD 6970 has 96, compared to HD 5870's 80. Break it down and the Cayman card has a similar ROP rate but improved texture-filtering throughput. You win some, you lose some.
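The throughput rows in the table can be derived from unit counts and clock speed. The sketch below assumes two floating-point operations per ALU per clock (one multiply-add), which is the convention the table's GFLOPS figures appear to follow; the helper names are our own.

```python
def gflops(alus: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak single-precision GFLOPS, assuming one multiply-add (2 FLOPs) per ALU per clock."""
    return alus * flops_per_clock * clock_mhz / 1000


def gtexels(texture_units: int, clock_mhz: float) -> float:
    """Peak bilinear INT8 texel rate in GTexel/s, one texel per unit per clock."""
    return texture_units * clock_mhz / 1000


def gpixels(rops: int, clock_mhz: float) -> float:
    """Peak ROP rate in GPixel/s, one pixel per ROP per clock."""
    return rops * clock_mhz / 1000


# name: (ALUs, texture units, ROPs, core clock in MHz), taken from the table
cards = {
    "Radeon HD 6970": (1536, 96, 32, 880),
    "Radeon HD 5870": (1600, 80, 32, 850),
}

for name, (alus, tex_units, rops, clock_mhz) in cards.items():
    print(
        f"{name}: {gflops(alus, clock_mhz):.0f} GFLOPS, "
        f"{gtexels(tex_units, clock_mhz):.2f} GTexel/s, "
        f"{gpixels(rops, clock_mhz):.1f} GPixel/s"
    )
```

These are theoretical peaks only. They say nothing about how well the VLIW4 units are kept busy in real games, which is where AMD claims the architectural gains lie.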
Board power runs to a meaty 250W, though we know that this can be controlled by the PowerTune hardware. Honestly, we're somewhat underwhelmed by the on-paper numbers from HD 6970. We'd expected more shaders and a higher GFLOPS rating from the single-GPU champ. AMD's focused on boosting geometry setup and refining the architecture, rather than giving it more inherent muscle, and this points to Cayman being very much a transitional GPU.
The numbers reinforce the message of the very first slide in this review: AMD is going to have a tough time taming the GeForce GTX 580 and 570 duo. With that in mind, we come to pricing, knowing that NVIDIA's GTX 500-series has performance hegemony.
AMD's nervous about Cayman pricing, so much so that it's steadily dropped the SRP in the run-up to launch. Should the £299 and £225 figures hold true, the cards will be pitched quite aggressively, though we'll need to evaluate the benchmark results first.
Analysis - Radeon HD 6950
As you know, the Radeon HD 6950 is a scaled-down version of HD 6970. It loses two SIMD engines and consequently eight texture units, plus it arrives with lower clocks. The figures tell us that it will perform at a 10 per cent deficit when compared to its bigger brother, which will put it in the crosshairs of the now-defunct GeForce GTX 470.
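To put the gap between the two Cayman cards in context, here's a rough sketch that compares their on-paper figures from the table. The dictionary layout and function name are ours, and on-paper ratios are only a guide, since game performance rarely scales linearly with any single metric.

```python
# On-paper figures copied from the spec table above.
hd6970 = {"GFLOPS": 2703, "GTexel/s": 84.48, "ROP rate": 28.2, "Bandwidth GB/s": 176}
hd6950 = {"GFLOPS": 2253, "GTexel/s": 70.4, "ROP rate": 25.6, "Bandwidth GB/s": 160}


def deficit_percent(smaller: float, larger: float) -> float:
    """How far 'smaller' trails 'larger', as a percentage of 'larger'."""
    return (1 - smaller / larger) * 100


deficits = {
    metric: deficit_percent(hd6950[metric], hd6970[metric]) for metric in hd6970
}

for metric, pct in deficits.items():
    print(f"{metric}: HD 6950 trails HD 6970 by {pct:.1f}%")
```

The ROP rate and memory bandwidth gaps come out at about 9 per cent, while shader and texturing throughput trail by nearer 17 per cent, so an estimate of around 10 per cent sits towards the optimistic end of that range.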
But HD 6950 is based on the same core as HD 6970. This means it has an identical-sized die that weighs in at 389mm² and contains 2.64bn transistors. The numbers make it the largest single-die AMD card to date - some 16 per cent bigger than Cypress - but, importantly for manufacturing costs, around one-quarter smaller than NVIDIA's Fermi. Assuming rather a lot, AMD should be able to undercut Fermi's pricing without hurting the bottom line too much.
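The die-size comparisons can be checked against the table in a few lines. This is a sketch on area alone, with names of our own choosing; it says nothing about yields or wafer pricing, which are what really drive manufacturing cost.

```python
# Die areas in mm², copied from the spec table above.
die_mm2 = {
    "Cayman": 389,
    "Cypress": 334,
    "Fermi v2 (GTX 580/570)": 520,
    "Fermi (GTX 480/470)": 529,
}


def percent_difference(area: float, reference: float) -> float:
    """Signed size difference of 'area' relative to 'reference', in per cent."""
    return (area / reference - 1) * 100


comparisons = {
    name: percent_difference(die_mm2["Cayman"], area)
    for name, area in die_mm2.items()
    if name != "Cayman"
}

for name, pct in comparisons.items():
    direction = "bigger" if pct > 0 else "smaller"
    print(f"Cayman is {abs(pct):.1f}% {direction} than {name}")
```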
Notice something else? Both new Radeon cards will ship with 2,048MB frame-buffers - double that of the Radeon HD 5870 and the HD 6870/50. This is the first time that we can remember AMD launching a performance consumer GPU with 2GB of memory as standard. We believe it's a nod towards multi-monitor Eyefinity, where super-high resolutions put more pressure on local memory than playing on a single monitor, however large.
The upside of a large frame-buffer should be smooth gameplay once image quality and resolution are dialled all the way up. The downside, obviously, is cost, and we don't imagine that an additional 1GB of GDDR5 rated at 5,000MHz-plus is going to be cheap. We're intrigued to see if any add-in board partner engineers a cheaper 1GB card.
Pre-benchmark summary
The star performer from last year, Radeon HD 5870, has aged well. It stacks up nicely against the two Cayman GPUs, and while they're fundamentally better graphics processors from an architecture standpoint, they won't obliterate the HD 5870 in present-day benchmarks.
AMD's clearly found it difficult to engineer a larger-die GPU that'll stomp on its previous generation's best cards. Cayman is a solid GPU, no doubt, but we were left hankering for more, especially after the class shown by Radeon HD 5870/50 last year. AMD appreciates this fact and prices the Cayman GPUs accordingly. Moving swiftly on, let's now take a look at the two cards and throw a few benchmarks their way.