NVIDIA G71
G71 is NVIDIA's new high-end graphics product. While your author was convinced the new high-end GPU would take at least the pixel shading hardware wider, compared to G70, the reality is different (and somewhat humbling!). G71 has the same VP/FP/ROP configuration as G70, placing 8 vertex processors (VP), 24 fragment processors (FP) and 16 raster output units (ROP) on a 196mm² 90nm die. It's significantly smaller than ATI's R520 and, as we've mentioned, it isn't simply a shrunken G70. It weighs in at 278M transistors against 302M for its parent ASIC; NVIDIA lose the transistors via the shrink to 90nm (new libraries) and a slight repipelining of the vertex and fragment hardware, trimming the fat from those.
Roughly 25 million transistors might seem like a fair chunk to give up (it's a full GeForce 2 for crying out loud!), especially while improving performance further in some areas, but given up they are, the rapidly switching silicon devices trotting off to Render Heaven, leaving NVIDIA with a smaller, cheaper chip to make. And that's before the shrink to 90nm is even counted!
A good friend of mine pointed out recently that 196mm² means G71 is smaller than ATI's groundbreaking R300. Silicon trivia at best, but that's not bad for over 10 times the fragment processing power of Radeon 9700 Pro from a G71 at GeForce 7900 GTX clocks.
So given an 8/24/16 setup, let's sum up a formal spec before looking at the theoretical rates of the two new G71-based SKUs launching along with the chip itself.
NVIDIA G71 GPU Properties | |
---|---|
GPU | NVIDIA G71 |
Process and Fabricator | 90nm @ TSMC, 90GT w/ low-k |
Die Size | 13.5 x 14.5mm (196mm²) |
Transistor Count | 278 million |
DirectX Shader Model | Shader Model 3.0 |
Basic Configuration (VP/FP/ROP) | 8/24/16 |
Vertex Shader Info | VS3.0, 5D FP32, co-issue MADD, branch, tex |
Fragment Processor Info | PS3.0, 4D FP32, dual and co-issue MADD/MADD, branch, tex |
ROP Info | 4x FX8/FX16 MSAA (2 subsamples/cycle); 2x Z-only rate, 2x colour-only rate; FP/FX blender (inc. FP16) |
Texture processing | 24 FP32 address units; 24 samplers; Bi/Tri/Aniso (128-tap); FP16 filtering |
Memory Interface | 256-bit, 4 memory channels, GDDR->GDDR3 |
Display output | 2x dual-link DVI TMDS transmitters, NVIDIA PureVideo |
Versus G70, the functional changes are subtle. Better FP16 blending performance, double colour rate, SLI AA resolve via the inter-GPU link, two dual-link TMDS transmitters and the move to 90nm are the main differences.
Your author was expecting a wider chip in terms of fragment processing, and indeed NVIDIA did work on an 8-quad design that never taped out. Despite that, the clock rates afforded to G71 by TSMC's 90GT process mean that even with only 24 fragment units, their dual MADD design is just as much a fragment processing beast as ATI's R580 in X1900 XTX configuration, at least in terms of that MADD instruction.
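To put numbers on that MADD comparison, and on the earlier "over 10 times" Radeon 9700 Pro remark, here's a minimal sketch of the arithmetic. The G71 figures come from this article; the ATI unit counts and clocks (48 ALUs at 650MHz for X1900 XTX, 8 pipes at 325MHz for 9700 Pro, one MADD per unit per clock) are assumptions drawn from public specifications, and the `FragmentConfig` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentConfig:
    name: str
    units: int            # fragment processors / pixel shader ALUs
    madds_per_clock: int  # MADD instructions issued per unit per clock
    core_mhz: float

    def madd_rate_ginstr(self) -> float:
        """Peak MADD issue rate in G instr/sec (units x MADDs x clock)."""
        return self.units * self.madds_per_clock * self.core_mhz / 1000.0


# G71 figures are from the article: 24 FPs, dual MADD, 650MHz core.
g71_gtx = FragmentConfig("GeForce 7900 GTX (G71)", 24, 2, 650.0)
# ATI figures are assumptions, not taken from the article.
r580_xtx = FragmentConfig("Radeon X1900 XTX (R580)", 48, 1, 650.0)
r300_pro = FragmentConfig("Radeon 9700 Pro (R300)", 8, 1, 325.0)

for gpu in (g71_gtx, r580_xtx, r300_pro):
    print(f"{gpu.name}: {gpu.madd_rate_ginstr():.1f} G MADD instr/sec")

ratio = g71_gtx.madd_rate_ginstr() / r300_pro.madd_rate_ginstr()
print(f"G71 vs R300 peak MADD ratio: {ratio:.1f}x")
```

This only counts peak MADD issue, so it says nothing about the other ALU operations, texturing or scheduling on either architecture.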
Theoretical Performance
Theoretical Rates for GeForce 7900 GTX and GT | ||
---|---|---|
Specification | NVIDIA GeForce 7900 GTX | NVIDIA GeForce 7900 GT |
Core Clock | 650MHz (700MHz VS) | 450MHz (470MHz VS) |
Memory Clock | 800MHz | 660MHz |
Pixel fillrate | 10.4G pixels/sec | 7.20G pixels/sec |
Texture sampling rate | 62.4G samples/sec | 43.2G samples/sec |
Z-only fillrate | 20.8G samples/sec | 14.4G samples/sec |
Vertex transform rate | 1.40G tris/sec | 0.94G tris/sec |
VP MADD issue rate | 5.60G instr/sec | 3.76G instr/sec |
FP MADD issue rate | 31.2G instr/sec | 21.6G instr/sec |
Memory bandwidth | 51.20 GB/sec | 42.24 GB/sec |
NVIDIA don't disable any part of the chip to create the GT. In terms of configuration it's simply a clock-reduced GTX with half the memory. That indicates yields of the G71 GPU, shipping in A02 silicon form, are excellent.
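Because both SKUs share the same 8/24/16 configuration, most of the theoretical rates fall straight out of unit counts multiplied by clocks. The sketch below reproduces that arithmetic for a subset of the rows, using the configuration and clocks stated above. The `G71Sku` class and `theoretical_rates` function are hypothetical names, and two modelling choices are assumptions rather than anything NVIDIA state here: a vertex transform is costed at four VP clocks, and the memory is treated as double data rate on the 256-bit bus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class G71Sku:
    name: str
    core_mhz: float   # core clock (fragment and ROP hardware)
    vs_mhz: float     # vertex shader clock
    mem_mhz: float    # physical memory clock
    vps: int = 8
    fps: int = 24
    rops: int = 16
    bus_bits: int = 256


def theoretical_rates(sku: G71Sku) -> dict:
    """Peak rates in G units/sec (GB/sec for bandwidth)."""
    g = 1000.0  # MHz -> G per second
    return {
        "pixel_fill_gpix": sku.rops * sku.core_mhz / g,
        "z_only_gsamples": 2 * sku.rops * sku.core_mhz / g,  # 2x Z-only rate
        # Assumption: one transformed vertex costs 4 VP clocks.
        "vertex_gtris": sku.vps * sku.vs_mhz / 4 / g,
        "vp_madd_ginstr": sku.vps * sku.vs_mhz / g,
        "fp_madd_ginstr": sku.fps * 2 * sku.core_mhz / g,    # dual MADD
        # Assumption: double data rate memory, bus width in bytes.
        "mem_bw_gbs": sku.bus_bits / 8 * 2 * sku.mem_mhz / g,
    }


gtx = G71Sku("GeForce 7900 GTX", core_mhz=650.0, vs_mhz=700.0, mem_mhz=800.0)
gt = G71Sku("GeForce 7900 GT", core_mhz=450.0, vs_mhz=470.0, mem_mhz=660.0)

for sku in (gtx, gt):
    print(sku.name)
    for key, value in theoretical_rates(sku).items():
        print(f"  {key}: {value:.2f}")
```

Swapping in different clocks is then a one-line change, which is handy for sanity-checking partner boards that ship above reference speeds.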
Let's have a look at G73 before we look at the reference boards for each chip supplied by NVIDIA for evaluation recently.