NVIDIA GeForce 7900 GTX, 7900 GT and 7600 GT Preview

NVIDIA G71

G71 is NVIDIA's new high-end graphics product. While your author was convinced the new high-end GPU would take at least the pixel shading hardware wider, compared to G70, the reality is different (and somewhat humbling!). G71 has the same VP/FP/ROP configuration as G70, placing 8 vertex processors (VP), 24 fragment processors (FP) and 16 raster output units (ROP) on a 196mm² 90nm die.

G71 is significantly smaller than ATI's R520 and, as we've mentioned, it isn't simply a shrunken G70. It has 278M transistors compared to the 302M of its parent ASIC. NVIDIA lose the transistors via the shrink to 90nm (new libraries) and a slight repipelining of the vertex and fragment hardware, trimming the fat from those.

Some 24 million transistors might seem like a fair chunk to give up (that's nearly a full GeForce 2, for crying out loud!), especially while improving performance further in some areas, but given up they are, the rapidly switching silicon devices trotting off to Render Heaven, leaving NVIDIA with a smaller, cheaper chip to make. And that's without a shrink to 90!

A good friend of mine pointed out recently that 196mm² means G71 is smaller than ATI's groundbreaking R300. Silicon trivia at best, but that's not bad for over 10 times the fragment processing power of Radeon 9700 Pro from a G71 at GeForce 7900 GTX clocks.

So given an 8/24/16 setup, let's sum up a formal spec before looking at the theoretical rates of the two new G71-based SKUs launching along with the chip itself.

NVIDIA G71 GPU Properties
GPU	NVIDIA G71
Process and Fabricator	90nm @ TSMC, 90GT w/ low-k
Die Size	13.5 x 14.5mm (196mm²)
Transistor Count	278 million
DirectX Shader Model	Shader Model 3.0
Basic Configuration (VP/FP/ROP)	8/24/16
Vertex Shader Info	VS3.0 5D FP32, co-issue MADD, branch, tex
Fragment Processor Info	PS3.0 4D FP32, dual and co-issue MADD/MADD, branch, tex
ROP Info	4x FX8/FX16 MSAA (2 subsamples/cycle), 2x Z-only rate, 2x colour-only rate, FP/FX blender (inc. FP16)
Texture processing	24 FP32 address units, 24 samplers Bi/Tri/Aniso (128-tap), FP16 filtering
Memory Interface	256-bit, 4 memory channels GDDR->GDDR3
Display output	2x dual-link DVI TMDS transmitters, NVIDIA PureVideo

Versus G70, the functional changes are subtle. Better FP16 blending performance, double colour rate, SLI AA resolve via the inter-GPU link, two dual-link TMDS transmitters and the move to 90nm are the main differences.

Your author was expecting a wider chip in terms of fragment processing, and indeed NVIDIA did work on an 8-quad design that never taped out. Despite that, the clock rates afforded to G71 by TSMC's 90GT process mean that even with only 24 fragment units, their dual MADD design is just as much a fragment processing beast as ATI's R580 in X1900 XTX configuration, at least in terms of that MADD instruction.
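To put a rough number on that MADD claim, here is a minimal sketch of the arithmetic. The G71 side uses the 24 dual-MADD fragment units from the spec table at the GeForce 7900 GTX's 650MHz core clock. The R580 side is an assumption not stated in this article: 48 fragment ALUs issuing one MADD each per clock at a 650MHz X1900 XTX core clock. The helper name `madd_rate_g` is hypothetical.

```python
def madd_rate_g(units, madds_per_unit_per_clock, clock_mhz):
    """Peak MADD issue rate in G instr/sec."""
    return units * madds_per_unit_per_clock * clock_mhz / 1000


# G71 as GeForce 7900 GTX: 24 fragment units, dual MADD, 650MHz core.
g71_gtx = madd_rate_g(24, 2, 650)

# R580 as X1900 XTX (assumed figures): 48 ALUs, one MADD each, 650MHz.
r580_xtx = madd_rate_g(48, 1, 650)

print(f"G71  (7900 GTX):  {g71_gtx:.1f}G MADD/sec")
print(f"R580 (X1900 XTX): {r580_xtx:.1f}G MADD/sec")
```

Under those assumptions the two chips issue the same number of MADDs per second, which is all this comparison covers. It says nothing about other instruction types or real shader throughput.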

Theoretical Performance

Theoretical Rates for GeForce 7900 GTX and GT
	NVIDIA GeForce 7900 GTX	NVIDIA GeForce 7900 GT
Core Clock	650MHz (700MHz VS)	450MHz (470MHz VS)
Memory Clock	800MHz	660MHz
Pixel fillrate	10.4G pixels/sec	7.20G pixels/sec
Texture sampling rate	62.4G samples/sec	43.2G samples/sec
Z-only fillrate	20.8G samples/sec	14.4G samples/sec
Vertex transform rate	1.40G tris/sec	0.94G tris/sec
VP MADD issue rate	5.60G instr/sec	3.76G instr/sec
FP MADD issue rate	31.2G instr/sec	21.6G instr/sec
Memory bandwidth	51.2 GB/sec	42.24 GB/sec
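Peak rates like these are simple products of unit counts and clocks, so they can be sketched in a few lines. The unit counts, bus width and clocks below come from the tables in this article. The per-clock multipliers are assumptions made for this sketch: four texel taps per sampler per clock, double rate for Z-only, one triangle per vertex processor every four vertex-shader clocks, one MADD per vertex processor per clock, two MADDs per fragment processor per clock, and two memory transfers per memory clock. The names `Sku` and `theoretical_rates` are hypothetical.

```python
from dataclasses import dataclass

# Unit counts and bus width from the G71 spec table.
VP, FP, ROP = 8, 24, 16
BUS_BITS = 256


@dataclass(frozen=True)
class Sku:
    name: str
    core_mhz: int  # fragment/ROP clock
    vs_mhz: int    # vertex shader clock
    mem_mhz: int   # memory base clock; DDR moves data twice per clock


def theoretical_rates(sku):
    """Peak rates in G units/sec (GB/sec for bandwidth)."""
    return {
        "pixel_fill": ROP * sku.core_mhz / 1000,
        "texture_sampling": FP * 4 * sku.core_mhz / 1000,  # assumed 4 taps/clock
        "z_only_fill": ROP * 2 * sku.core_mhz / 1000,      # 2x Z-only rate
        "vertex_transform": VP * sku.vs_mhz / 4 / 1000,    # assumed 4 clocks/tri
        "vp_madd": VP * sku.vs_mhz / 1000,                 # 1 MADD/VP/clock
        "fp_madd": FP * 2 * sku.core_mhz / 1000,           # dual MADD per FP
        "memory_bandwidth": BUS_BITS / 8 * sku.mem_mhz * 2 / 1000,
    }


GTX = Sku("GeForce 7900 GTX", core_mhz=650, vs_mhz=700, mem_mhz=800)
GT = Sku("GeForce 7900 GT", core_mhz=450, vs_mhz=470, mem_mhz=660)

for sku in (GTX, GT):
    print(sku.name)
    for rate, value in theoretical_rates(sku).items():
        print(f"  {rate}: {value:.2f}")
```

Because the GT keeps the full 8/24/16 configuration, every difference between the two columns comes from the three clocks alone.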

NVIDIA don't disable any part of the chip to create the GT. It's simply a clock-reduced GTX in terms of configuration, with half the memory. That indicates yields of the G71 GPU, shipping in A02 silicon form, are excellent.

Let's have a look at G73 before we look at the reference boards for each chip supplied by NVIDIA for evaluation recently.
