Table time
So what exactly is the GeForce GTX 295? It's essentially two GeForce GTX 200-series GPUs side by side in a single-card solution, connected via internal SLI. Oddly enough, though, the GPUs aren't replicas of either the GeForce GTX 260 or the GeForce GTX 280. Instead, we have a hybrid GPU built on a 55nm process with a few useful upgrades.
Let's take a look at the table to see how both teams' range-topping GTX 200 and HD 4000-series cards compare.
Graphics cards | NVIDIA GeForce GTX 295 | NVIDIA GeForce GTX 280 1,024MB | NVIDIA GeForce GTX 260 896MB (new) | AMD Radeon HD 4870 X2 2,048MB | AMD Radeon HD 4870 512MB | AMD Radeon HD 4850 X2 2,048MB | AMD Radeon HD 4850 512MB |
---|---|---|---|---|---|---|---|
PCIe | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 | PCIe 2.0 |
GPU(s) clock | 576MHz | 602MHz | 576MHz | 750MHz | 750MHz | 625MHz | 625MHz |
Shader clock | 1,242MHz | 1,296MHz | 1,242MHz | 750MHz | 750MHz | 625MHz | 625MHz |
Memory clock (effective) | 1,998MHz | 2,214MHz | 1,998MHz | 3,600MHz | 3,600MHz | 1,986MHz | 1,986MHz |
Memory interface and size | 448-bit (per GPU), 1,792MB, GDDR3 | 512-bit, 1,024MB, GDDR3 | 448-bit, 896MB, GDDR3 | 512-bit (2x 256-bit), 2,048MB, GDDR5 | 256-bit, 512MB, GDDR5 | 512-bit (2x 256-bit), 2,048MB, GDDR3 | 256-bit, 512MB, GDDR3 |
Memory bandwidth | 223.8GB/sec | 141.7GB/sec | 111.9GB/sec | 230GB/sec | 115GB/sec | 127GB/sec | 63.5GB/sec |
Manufacturing process | TSMC, 55nm | TSMC, 65nm | TSMC, 65nm | TSMC, 55nm | TSMC, 55nm | TSMC, 55nm | TSMC, 55nm |
Transistor count | TBC | 1,408M | 1,408M | 1,930M | 965M | 1,930M | 965M |
Die size | TBC | 576mm² | 576mm² | 520mm² (2 x 260mm²) | 260mm² | 520mm² (2 x 260mm²) | 260mm² |
Double-precision support | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
DirectX/ Shader Model | DX10, 4.0 | DX10, 4.0 | DX10, 4.0 | DX10.1, 4.1 | DX10.1, 4.1 | DX10.1, 4.1 | DX10.1, 4.1 |
Vertex, fragment, geometry shading (shared) | 480 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 240 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 216 FP32 scalar ALUs, MADD dual-issue + MUL (unified) | 1,600 FP32 scalar ALUs, MADD dual-issue (unified) | 800 FP32 scalar ALUs, MADD dual-issue (unified) | 1,600 FP32 scalar ALUs, MADD dual-issue (unified) | 800 FP32 scalar ALUs, MADD dual-issue (unified) |
Peak GFLOPS | 1,788 | 933 | 805 | 2,400 | 1,200 | 2,000 | 1,000 |
Data sampling and filtering | 160ppc address and 160ppc bilinear INT8/80ppc FP16 filtering, max 16xAF | 80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF | 72ppc address and 72ppc bilinear INT8/36ppc FP16 filtering, max 16xAF | 80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF | 40ppc address and 40ppc bilinear INT8/20ppc FP16 filtering, max 16xAF | 80ppc address and 80ppc bilinear INT8/40ppc FP16 filtering, max 16xAF | 40ppc address and 40ppc bilinear INT8/ 20ppc FP16 filtering, max 16xAF |
Peak fillrate Gpixels/s | 32.256 | 19.264 | 16.128 | 24 | 12 | 20 | 10 |
Peak Gtexel/s (bilinear) | 92.2 | 48.16 | 41.472 | 60 | 30 | 50 | 25 |
Peak Gtexel/s (FP16, bilinear) | 46.1 | 24.09 | 20.736 | 30 | 15 | 25 | 12.5 |
ROPs | 56 | 32 | 28 | 32 | 16 | 32 | 16 |
Peak TDP (claimed), watts | 289 | 236 | 182 | 289 | 160 | - | 110 |
Power connectors (default clocked) | 8-pin + 6-pin | 8-pin + 6-pin | 6-pin + 6-pin | 8-pin + 6-pin | 6-pin + 6-pin | 8-pin + 6-pin | 6-pin |
Multi-GPU | SLI - quad | SLI - three-board | SLI - three-board | CrossFire - two-board | CrossFire - four-board | CrossFire - two-board | CrossFire - four-board |
Outputs | 2 x dual-link DVI w/HDCP, 1 x HDMI | 2 x dual-link DVI w/HDCP, native HDMI 5.1 (via S/PDIF) | 2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | 4 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | 2 x dual-link DVI w/HDCP, HDMI 7.1 (native, on GPU) | ||
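Hardware-assisted video-decoding engine | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode | AMD UVD 2 - full H.264 and VC-1 decode, plus dual-stream decode |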
Reference cooler | dual-slot | dual-slot | dual-slot | dual-slot | dual-slot | dual-slot | single-slot |
The transition to half-node 55nm clearly has its benefits, including a maximum power draw of 289W - a feat we believe to be unachievable with two 65nm GeForce GTX 260s.
Armed with GTX 260-matching frequencies - that's 576MHz core, 1,242MHz shader and 1,998MHz memory - you'd be forgiven for believing nothing has changed. However, NVIDIA has raised the total shader count to 480, or 240 per GPU, matching the shader count of the GeForce GTX 280.
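For readers who want to see where the table's peak GFLOPS figures come from, here's a rough sketch of the arithmetic. It assumes the usual way of counting the "MADD dual-issue + MUL" arrangement as three floating-point operations per ALU per clock; the function and variable names are ours, purely for illustration.

```python
def peak_gflops(alus: int, shader_clock_mhz: float, flops_per_alu_per_clock: int) -> float:
    """Theoretical peak: ALUs x shader clock x FLOPs each ALU can retire per clock."""
    return alus * shader_clock_mhz * flops_per_alu_per_clock / 1000.0


# Assumption: a MADD counts as 2 FLOPs and the co-issued MUL as 1, giving 3 per clock.
NVIDIA_FLOPS_PER_CLOCK = 3

# (shader ALUs, shader clock in MHz), taken from the table above.
cards = {
    "GeForce GTX 295": (480, 1242),
    "GeForce GTX 280": (240, 1296),
    "GeForce GTX 260 (new)": (216, 1242),
}

peaks = {
    name: peak_gflops(alus, clock, NVIDIA_FLOPS_PER_CLOCK)
    for name, (alus, clock) in cards.items()
}

for name, gflops in peaks.items():
    print(f"{name}: {gflops:.0f} GFLOPS")
```

Rounded to whole numbers, the results line up with the 1,788, 933 and 805 figures in the table. These are theoretical peaks only, and say nothing about how well SLI scales in a real game.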
Each GPU has access to its own 896MB of GDDR3 memory via a 448-bit interface, giving the card a combined 223.8GB/s of memory bandwidth - though, of course, each GPU can lay claim to only half.
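The bandwidth figure is easy enough to check by hand. As a minimal sketch, assuming the standard bus-width-times-effective-clock calculation and decimal gigabytes (the helper name is hypothetical):

```python
def bandwidth_gb_s(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak memory bandwidth: bytes moved per transfer x transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz / 1000.0


# One GTX 295 GPU: 448-bit interface, 1,998MHz effective GDDR3.
per_gpu = bandwidth_gb_s(448, 1998)

# The headline number simply adds the two GPUs together.
card_total = 2 * per_gpu

# GeForce GTX 280 for comparison: 512-bit, 2,214MHz effective.
gtx280 = bandwidth_gb_s(512, 2214)

print(f"GTX 295 per GPU: {per_gpu:.1f}GB/s, card total: {card_total:.1f}GB/s")
print(f"GTX 280: {gtx280:.1f}GB/s")
```

The per-GPU value comes out the same as the GeForce GTX 260's, which is what you'd expect given the identical bus width and memory clock. The card total is a sum of two separate memory pools rather than bandwidth any one GPU can actually use.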
Furthermore, the doubling-up process has the usual effect of taking numbers through the roof. ROP count rises to 56 and the 160 total texture filtering units help push the card's massive bilinear-filtering capacity to 92.2Gtexels/s.
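Those fillrate and texture numbers follow directly from unit counts multiplied by the 576MHz core clock. The sketch below assumes one pixel per ROP per clock and one bilinear INT8 texel per filtering unit per clock, with FP16 filtering running at half rate as the table's ppc figures suggest; the names are illustrative only.

```python
CORE_CLOCK_MHZ = 576  # GTX 295 core clock, from the table


def peak_rate_g_per_s(units: int, clock_mhz: float, per_unit_per_clock: float = 1.0) -> float:
    """Peak throughput in billions per second: units x clock x work per unit per clock."""
    return units * clock_mhz * per_unit_per_clock / 1000.0


# 56 ROPs across both GPUs (28 per GPU).
pixel_fill = peak_rate_g_per_s(56, CORE_CLOCK_MHZ)

# 160 texture filtering units across both GPUs (80 per GPU).
bilinear_texels = peak_rate_g_per_s(160, CORE_CLOCK_MHZ)

# Assumption: FP16 bilinear filtering runs at half the INT8 rate.
fp16_texels = peak_rate_g_per_s(160, CORE_CLOCK_MHZ, per_unit_per_clock=0.5)

print(f"Pixel fillrate: {pixel_fill:.3f} Gpixels/s")
print(f"Bilinear: {bilinear_texels:.1f} Gtexels/s, FP16: {fp16_texels:.1f} Gtexels/s")
```

Halve the unit counts and you get a single GPU's share, which is why the ROP-bound fillrate of one GTX 295 GPU is identical to that of a GeForce GTX 260.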
As is the case with most dual-GPU single-card solutions, the GeForce GTX 295 fits into a single PCIe 2.0 interface, bringing SLI support to boards with just the one PCIe 2.0 slot. Should an extra PCIe 2.0 x16 slot be available, the hardened enthusiast will be happy to hear that a second GeForce GTX 295 can be partnered up to provide a return to Quad SLI performance - a feature last seen during the GeForce 9800 GX2's days.
With an estimated MSRP of £400, however, it's asking a lot to consider even a single card, let alone two.
Despite the 55nm shrink, not a whole lot else has changed. The card's compute power has been beefed up - and we'd expect nothing less given NVIDIA's desire for GPGPU domination - but its memory subsystem sticks to GDDR3. It's seemingly subdued, but the die shrink could leave headroom for NVIDIA's partners to introduce pre-overclocked models further down the road.
Of course, one can't ignore the obvious possibility of 55nm versions of the GeForce GTX 280 and GeForce GTX 260, but we'll wait for official confirmation from NVIDIA before we begin to conjecture.