Review: NVIDIA's GeForce 6800 Ultra GPU

by Ryszard Sommefeldt on 14 April 2004, 00:00

NV40

Let's do things the easy way here, with a table to peek at so we can discuss the details.

The GPUs to talk about are as follows: NVIDIA's outgoing NV38, which powers its current high-end product, the GeForce FX 5950 Ultra; ATI's R360, which powers its Radeon 9800XT; and of course NVIDIA's NV40, the focus of this article and the GPU that powers the GeForce 6800 Ultra review sample.

| Specification | NVIDIA NV40 | ATI R360 | NVIDIA NV38 |
|---|---|---|---|
| Process | 130nm @ IBM | 150nm @ TSMC | 130nm @ TSMC |
| Transistor Count | 222M | 107M | 130M |
| Geometry Pipeline | VS3.0 | VS2.0 | VS2.0+ |
| Fragment Processor | PS3.0 | PS2.0 | PS2.0+ |
| Fragment Processor Setup | 2 full ALU (not equal) each with 1 mini ALU, Fog ALU, per pipe | 1 full ALU, 1 mini ALU, per pipe | 1 full ALU, 2 small ALU, per pipe |
| Fragment Processor Precision | FP32, FP16 | FP24 | FP16, FP32 |
| Traditional Render Setup | 16 x 1 | 8 x 1 | 4 x 2 |
| Vertex Shaders | 6 | 2 | Adaptive array |
| Basic Texture Filtering | Trilinear | Bilinear | Bilinear |
| Texture Filtering | Bilinear, Trilinear, 16X Anisotropic | Bilinear, Trilinear, 16X Anisotropic | Bilinear, Trilinear, 8X Anisotropic |
| Antialiasing | Multi-sampling | Multi-sampling | Multi-sampling and super-sampling |
| AA Sample Type | Rotated grid, up to 8X with supersampling combined at 8X | Scattered/sparse grid, up to 6X | Ordered grid, up to 4X, up to 8X with super-sampling |
| Bus Support | AGP8X | AGP8X | AGP8X |
| Memory Support | GDDR3 | DDR, DDR2 | DDR, DDR2 |
| Basic Core Frequency | 400MHz | 412MHz | 475MHz |
| Basic Memory Frequency | 1100MHz | 730MHz | 950MHz |
| Memory Bus Width | 256-bit, memory crossbar | 256-bit, memory crossbar | 256-bit, memory crossbar |
| Basic Pixel Fillrate | 6400Mpixel/sec | 3296Mpixel/sec | 1900Mpixel/sec |
| Basic Multitexture Fillrate | 6400Mtexel/sec | 3296Mtexel/sec | 3800Mtexel/sec |
| Basic Memory Bandwidth | ~35.2GB/sec | ~23.4GB/sec | ~30.4GB/sec |
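The fillrate and bandwidth rows follow from the clock, pipeline and bus figures listed with them. As a rough sketch of that arithmetic (the function names are mine, and it assumes the listed memory frequencies are effective DDR data rates and that the "Traditional Render Setup" row reads as pipelines x texture units per pipeline):

```python
# Theoretical peak throughput from the spec table figures.
# Assumptions (not stated in the article): memory clocks are effective
# data rates, and "16 x 1" means 16 pipelines with 1 texture unit each.

def pixel_fillrate_mpix(core_mhz, pipes):
    """Peak Mpixel/sec: one pixel per pipeline per core clock."""
    return core_mhz * pipes

def texel_fillrate_mtex(core_mhz, pipes, tmus_per_pipe):
    """Peak Mtexel/sec: one texel per texture unit per core clock."""
    return core_mhz * pipes * tmus_per_pipe

def memory_bandwidth_gbs(effective_mem_mhz, bus_bits):
    """Peak GB/sec (decimal): transfers per second times bus width in bytes."""
    return effective_mem_mhz * 1e6 * (bus_bits // 8) / 1e9

GPUS = {
    "NV40": dict(core=400, pipes=16, tmus=1, mem=1100, bus=256),
    "R360": dict(core=412, pipes=8, tmus=1, mem=730, bus=256),
    "NV38": dict(core=475, pipes=4, tmus=2, mem=950, bus=256),
}

for name, g in GPUS.items():
    pix = pixel_fillrate_mpix(g["core"], g["pipes"])
    tex = texel_fillrate_mtex(g["core"], g["pipes"], g["tmus"])
    bw = memory_bandwidth_gbs(g["mem"], g["bus"])
    print(f"{name}: {pix} Mpixel/sec, {tex} Mtexel/sec, {bw:.1f} GB/sec")
```

These are paper numbers only; they say nothing about how often a real workload keeps every pipeline and the whole memory bus busy.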

The basic specs give you an initial theoretical performance picture, with NV40's improvements over NV3x quite clear. 16 basic pixel pipelines, 6 'fixed' vertex shader units, twice the shader horsepower, trilinear texture filtering as the default filtering method (more on that later), rotated grid multisampling for (hopefully) better AA quality, up to 16-sample angle-adaptive anisotropic filtering (available with trilinear throughout) and 32-bit precision throughout the entire gamut of processing functions are the easy ones to spot.

Shader Model 3.0 support in DirectX 9.0c is the other big feature addition, but more on that later.

NV40's basic performance figures and features look like a decent jump over the previous high-end NVIDIA GPU. NVIDIA appear to have agreed with everyone else about NV3x's biggest weaknesses and have chopped out the bad bits wholesale, replacing them completely.

According to the documentation, the shader units are completely new. Notice the ordering in the data table above where I list each GPU's fragment processor precision options. For NV38 I listed FP16 first, since that's the mode the GPU performs best in. For NV40, FP32 comes first.

With NV40, full 32-bit floating point precision everywhere is what's stressed. On top of that emphasis, the GPU gains features that NV3x doesn't have, like full 32-bit floating point filtering, blending and texture/surface support throughout. NV40 can finally render to multiple 32-bit render targets (with some caveats, at least initially), something NVIDIA never properly implemented in NV3x. The improvements, especially NV40's 32-bit floating point performance and its support for 32-bit render targets, are most welcome.
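To give a feel for what the FP16, FP24 and FP32 labels mean in practice, here's a minimal sketch that rounds a value to each format's mantissa width and prints the error. The `quantize` helper is hypothetical and ignores exponent range, and the 16-bit mantissa for FP24 is my assumption about the layout rather than something taken from the documentation discussed here.

```python
import math

# Stored mantissa bits per format. FP16 and FP32 follow the usual
# half/single layouts; the FP24 figure is an assumption.
FORMATS = {"FP16": 10, "FP24": 16, "FP32": 23}

def quantize(value, mantissa_bits):
    """Round value to the nearest number with the given stored mantissa bits.

    Exponent range (overflow, denormals) is deliberately ignored.
    """
    if value == 0.0:
        return 0.0
    m, e = math.frexp(value)            # value == m * 2**e, 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)    # +1 for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

# Example: a texture coordinate well away from the origin.
coord = 1000.3
for name, bits in FORMATS.items():
    q = quantize(coord, bits)
    print(f"{name}: {q!r} (error {abs(q - coord):.3g})")
```

The error shrinks by a factor of two for every extra mantissa bit, which is why lower precision tends to show up first in things like large texture coordinates and long shader calculations.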

It looks like NVIDIA have done 32-bit floating point right with NV40, at least on the surface. Implementation details have spoiled previous parties, something we won't forget in a hurry.

Finally, given previous criticism of NVIDIA's texture filtering performance and quality, making trilinear filtering the default texture filtering method, along with new angle-adaptive anisotropic filtering, should raise NV40's basic image quality to a new level, something sorely needed.
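For anyone unsure what separates trilinear from bilinear filtering, the idea can be sketched in a few lines: take a bilinear sample from each of the two nearest mip levels and blend between them. This is a simplified illustration with hypothetical helpers and a naive texel addressing scheme, not a description of how NV40's texture units do it.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t

def bilinear(tex, u, v):
    """Sample a greyscale texture (list of rows) at u, v in [0, 1].

    Blends the four surrounding texels; edges are clamped.
    """
    h, w = len(tex), len(tex[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = lerp(tex[y0][x0], tex[y0][x1], fx)
    bottom = lerp(tex[y1][x0], tex[y1][x1], fx)
    return lerp(top, bottom, fy)

def trilinear(mips, u, v, lod):
    """Blend bilinear samples from the two mip levels either side of lod."""
    lod = max(0.0, min(lod, len(mips) - 1))
    lo = int(lod)
    hi = min(lo + 1, len(mips) - 1)
    return lerp(bilinear(mips[lo], u, v), bilinear(mips[hi], u, v), lod - lo)

# A tiny mip chain: 4x4 black, 2x2 white, 1x1 mid grey.
mips = [
    [[0.0] * 4 for _ in range(4)],
    [[1.0] * 2 for _ in range(2)],
    [[0.5]],
]
print(trilinear(mips, 0.5, 0.5, 0.5))
```

With bilinear filtering alone the sample would snap from one mip level to the next, which is what produces visible bands on receding surfaces. The extra blend removes that transition at the cost of a second set of texel fetches.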

All of the above will be examined in forthcoming pages. First, however, a look at the reference board sent to reviewers.