Table time
So what is the Radeon HD 4870 X2? It is, of course, based on the Radeon HD 4870 GPU, so please head on over to our architecture look to see what makes that tick along.
What's important to note is that AMD designed the Radeon HD 4800-series with a specific die-size in mind, along with an associated budget. So rather than opt for the big-die approach that NVIDIA has undertaken with its GeForce GTX 280/260, AMD's gone for something that's less than half that size, thus opening up the possibility of putting two GPUs on one board - a la Radeon HD 3870 X2.
Graphics cards | AMD Radeon HD 4870 X2 2048MB | AMD Radeon HD 4870 512MB | AMD Radeon HD 4850 512MB | AMD Radeon HD 3870 X2 1024MB | AMD Radeon HD 3870 512MB | NVIDIA GeForce GTX 280 1,024MB | NVIDIA GeForce GTX 260 896MB | NVIDIA GeForce 9800 GX2 1,024MB | NVIDIA GeForce 9800 GTX+ 512MB | NVIDIA GeForce 9800 GTX 512MB |
---|---|---|---|---|---|---|---|---|---|---|
PCIe | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 |
GPU(s) clock | 750MHz | 750MHz | 625MHz | 825MHz | 775MHz | 602MHz | 576MHz | 600MHz | 738MHz | 675MHz |
Shader clock | 750MHz | 750MHz | 625MHz | 825MHz | 775MHz | 1,296MHz | 1,242MHz | 1,500MHz | 1,836MHz | 1,688MHz |
Memory clock (effective) | 3,600MHz | 3,600MHz | 1,986MHz | 1,802MHz | 2,250MHz | 2,214MHz | 1,998MHz | 2,000MHz | 2,200MHz | 2,200MHz |
Memory interface, size and type | 512-bit (2x 256-bit), 2,048MB, GDDR5 | 256-bit, 512MB, GDDR5 | 256-bit, 512MB, GDDR3 | 512-bit (2x 256-bit), 1,024MB, GDDR3 | 256-bit, 512MB, GDDR4 | 512-bit, 1,024MB, GDDR3 | 448-bit, 896MB, GDDR3 | 512-bit (2x 256-bit), 1,024MB, GDDR3 | 256-bit, 512MB, GDDR3 | 256-bit, 512MB, GDDR3 |
Memory bandwidth | 230GB/sec | 115GB/sec | 63.5GB/sec | 115.33GB/sec | 72.8GB/sec | 141.7GB/sec | 111.9GB/sec | 128GB/sec | 70.4GB/sec | 70.4GB/sec |
Manufacturing process | TSMC, 55nm | TSMC, 55nm | TSMC, 55nm | TSMC, 55nm | TSMC, 55nm | TSMC, 65nm | TSMC, 65nm | TSMC, 65nm | TSMC, 55/65nm | TSMC, 65nm |
Transistor count | 1,930M | 965M | 965M | 1,300M | 666M | 1,408M | 1,408M | 1,508M | 754M | 754M |
Die size | 520mm² (2 x 260mm²) | 260mm² | 260mm² | 384mm² (2x 192mm²) | 192mm² | 576mm² | 576mm² | 660mm² (2 x 330mm²) | 230mm² | 330mm² |
Double-precision support | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No |
DirectX/ Shader Model | DX10.1, 4.1 | DX10.1, 4.1 | DX10.1, 4.1 | DX10.1, 4.1 | DX10.1, 4.1 | DX10, 4.0 | DX10, 4.0 | DX10, 4.0 | DX10, 4.0 | DX10, 4.0 |
Vertex, fragment, geometry shading (shared) | 1,600 FP32 scalar ALUs, MADD dual-issue (unified) | 800 FP32 scalar ALUs, MADD dual-issue (unified) | 800 FP32 scalar ALUs, MADD dual-issue (unified) | 640 FP32 scalar ALUs, MADD dual-issue (unified) | 320 FP32 scalar ALUs, MADD dual-issue (unified) | 240 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 192 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 256 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 128 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 128 FP32 scalar ALUs, MADD dual-issue + MUL (unified) |
Peak GFLOPS | 2,400 | 1,200 | 1,000 | 1,056 | 496 | 933 | 715 | 1,152 | 705 | 648 |
Data sampling and filtering | 80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF | 40ppc address and 40ppc bilinear INT8/20ppc FP16 filtering, max 16xAF | 40ppc address and 40ppc bilinear INT8/ 20ppc FP16 filtering, max 16xAF | 32ppc address and 32ppc bilinear INT8/FP16 filtering, max 16xAF | 16ppc address and 16ppc bilinear INT8/FP16 filtering, max 16xAF | 80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF | 64ppc address and 64ppc bilinear INT8/32ppc FP16 filtering, max 16xAF | 128ppc address and 64ppc bilinear INT8/64ppc FP16 filtering, max 16xAF | 64ppc address and 64ppc bilinear INT8/32ppc FP16 filtering, max 16xAF | 64ppc address and 64ppc bilinear INT8/32ppc FP16 filtering, max 16xAF |
Peak fillrate Gpixels/s | 24 | 12 | 10 | 26.4 | 12.4 | 19.264 | 16.128 | 19.2 | 11.8 | 10.8 |
Peak Gtexel/s (bilinear) | 60 | 30 | 25 | 26.4 | 12.4 | 48.16 | 36.864 | 76.8 | 47.2 | 43.2 |
Peak Gtexel/s (FP16, bilinear) | 30 | 15 | 12.5 | 26.4 | 12.4 | 24.08 | 18.432 | 38.4 | 23.6 | 21.6 |
ROPs | 32 | 16 | 16 | 32 | 16 | 32 | 28 | 32 | 16 | 16 |
Peak TDP (claimed, W) | 289 | 160 | 110 | 196 | 105 | 236 | 182 | 196 | 155 | 155 |
Power connectors (default clocked) | 8-pin + 6-pin | 6-pin + 6-pin | 6-pin | 8-pin + 6-pin | 6-pin | 8-pin + 6-pin | 6-pin + 6-pin | 8-pin + 6-pin | 6-pin + 6-pin | 6-pin + 6-pin |
Multi-GPU | CrossFire - two-board | CrossFire - four-board | CrossFire - four-board | CrossFire - two-board | CrossFire - four-board | SLI - three-board | SLI - three-board | SLI - two-board | SLI - three-board | SLI - three-board |
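Outputs | 2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, HDMI 5.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, HDMI 5.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) |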
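Hardware-assisted video-decoding engine | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD - full H.264 and VC-1 decode | AMD UVD - full H.264 and VC-1 decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode |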
Reference cooler | dual-slot | dual-slot | single-slot | dual-slot | dual-slot | dual-slot | dual-slot | dual-slot | dual-slot | dual-slot |
Retail price (default-clocked model) | £329 (estimated) | £170 (£199, 1GB version) | £115 | £199 | £89 | £279 | £189 | £199 | £139 | £129 |
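
The peak-throughput rows in the table follow from unit counts and clocks, and a short Python sketch makes the arithmetic easy to check. The function names are our own, and we are assuming, as the table's figures imply, that a MADD counts as two FLOPs per ALU per clock and MADD + MUL as three.

```python
# Sketch: rebuild the table's peak-throughput rows from unit counts and clocks.
# Assumption: MADD = 2 FLOPs per ALU per clock, MADD + MUL = 3 FLOPs.


def peak_gflops(alus, shader_mhz, flops_per_clock):
    """Peak GFLOPS = ALUs x FLOPs per clock x shader clock (MHz) / 1,000."""
    return alus * flops_per_clock * shader_mhz / 1000.0


def peak_rate_g(units, core_mhz):
    """Peak Gpixels/s or Gtexels/s = units (ROPs or ppc) x core clock (MHz) / 1,000."""
    return units * core_mhz / 1000.0


# Inputs taken from the table: Radeon HD 4870 and GeForce GTX 280.
hd4870_gflops = peak_gflops(800, 750, 2)    # MADD only
gtx280_gflops = peak_gflops(240, 1296, 3)   # MADD + MUL
hd4870_fill = peak_rate_g(16, 750)          # 16 ROPs
hd4870_texel = peak_rate_g(40, 750)         # 40ppc bilinear INT8

print(hd4870_gflops, gtx280_gflops, hd4870_fill, hd4870_texel)
```

Rounded to whole numbers, the results line up with the table's 1,200 and 933 GFLOPS, 12 Gpixels/s and 30 Gtexels/s. For the X2, double the single-GPU Radeon HD 4870 figures.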
Discussion
It really is worth repeating that Radeon HD 4870 X2 (R700) takes in two Radeon HD 4870s and puts them on one board that plugs into your motherboard via the usual x16 PCIe (2.0) connector. The card is chipset agnostic, as the magic is contained inside, so it will work on any board with an x16 PCIe slot.
The full-production model of the AMD Radeon HD 4870 X2 ships with a 2GB framebuffer, as opposed to the engineering sample's 1GB. AMD's representatives indicated that doubling the framebuffer provides tangible performance gains at ultra-high resolutions and image-quality settings, because the combined workload is such that it would be hampered by having just 512MB per GPU.
The two RV770XT GPUs (Radeon HD 4870) each have access to their own, non-shared 1GB framebuffer through a 256-bit-wide bus. That's a total of 230GB/s of bandwidth, although, naturally, each GPU is only able to tap its own half of that.
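
As a rough check on that bandwidth figure, here is a minimal Python sketch that derives it from the 3,600MHz effective memory clock and the 256-bit bus per GPU. The function name is our own, and we are assuming decimal gigabytes (1GB = 10^9 bytes), which is what the table's numbers imply.

```python
# Sketch: memory bandwidth from effective memory clock and bus width.
# Assumption: decimal units, so 1GB/s = 1e9 bytes per second.


def memory_bandwidth_gb_s(effective_mhz, bus_width_bits):
    """Bandwidth in GB/s = transfers per second x bytes moved per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


per_gpu = memory_bandwidth_gb_s(3600, 256)  # one RV770XT and its own 1GB
total = 2 * per_gpu                         # both GPUs on the X2 board

print(per_gpu, total)
```

That works out to roughly 115GB/s per GPU and 230GB/s for the board, in line with the table.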
The HD 4870 X2's two GPUs each hook up to a PCIe 2.0-supporting PLX bridge chip via a 10GB/s link, and both connect to the system through the bridge and another bi-directional link totalling 10GB/s. There's the standard 0.9GB/s GPU-to-GPU link for combined framebuffer info, too.
Frequencies, however, remain unchanged compared to the single-GPU design, so 750MHz for the core and shaders and an electrifying 3.6GHz for the GDDR5 memory.
Multi-GPU on multi-GPU
Under CrossFireX, AMD's multi-GPU technology, one can pair up two X2s for four-GPU rendering. The option of teaming an X2 with a standard HD 4870 exists, as well as adding an HD 4850 for a little extra grunt. We'd stick with the same base models - HD 4870 - and you should see a small gain in performance.