
Review: NVIDIA GeForce FX5900 Ultra

by Tarinder Sandhu on 12 May 2003, 00:00

Tags: NVIDIA (NASDAQ:NVDA)

Quick Link: HEXUS.net/qark


Technology discussion

The NV30 was trumpeted as NVIDIA's performance saviour. Our brief discussion on the previous page highlighted some of the reasons why it failed to live up to general expectation. So what does the NV35 do right that the NV30 didn't, and what's still the same? We'll be looking at the differences on a hardware level.

8x1 or 4x2, does it matter?

One of the most basic determinants of just how fast a card is (without taking into account image enhancement features such as antialiasing and anisotropic filtering) is its fillrate. Fillrate, itself, is the product of the core speed and the number of pixel/rendering pipelines. You can further boost fillrate in multi-texturing scenarios by specifying multiple texture units per rendering pipeline. Put simply, higher numbers equate to higher theoretical performance.
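The relationship above reduces to simple multiplication. A minimal sketch (the helper name is ours, the clock and pipeline figures are the stock specifications discussed in this review):

```python
# Theoretical fillrate = core clock x pipelines (x texture units for multi-texturing).
def fillrate(core_mhz, pipelines, texture_units_per_pipe=1):
    """Return (pixel fillrate in MPixels/s, texel fillrate in MTexels/s)."""
    pixel = core_mhz * pipelines
    texel = pixel * texture_units_per_pipe
    return pixel, texel

# NV35 (5900 Ultra): 450MHz core, 4 pipelines, 2 texture units per pipe.
print(fillrate(450, 4, 2))   # (1800, 3600)

# R350 (Radeon 9800 Pro): 380MHz core, 8 pipelines, 1 texture unit per pipe.
print(fillrate(380, 8, 1))   # (3040, 3040)
```

Note how the 8x1 layout produces identical figures in both scenarios, while the 4x2 layout only catches up when multi-texturing.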

The FX5800 Ultra's arrival coincided with NVIDIA changing the way they described their high-end cards. NVIDIA claimed, whether accurately or not, that one shouldn't approach 3D performance in pure fillrate terms. Rather, they say, performance is also a function of immense pixel and vertex shading ability, ably supported by the DX9-busting FX5800U architecture. All that did, you see, was cloud the issue of what exactly the 5800U (NV30) was. ATi, quite openly, state that their 9700/9800 cards employ an 8-pipeline rendering approach with a single texture unit per pipe. That means they produce the same kind of output in both single- and multi-texturing scenarios.

The 5800 Ultra and 5900 Ultra (NV35), it transpires, are 4x2 designs, but with a twist. They emulate ATi's 8 PPCC (pixels per clock cycle) approach in the majority of cases (which aren't relevant for today's games), but drop down to 4 PPCC when running the standard colour + Z rendering approach (which is relevant today). NVIDIA alleviate some of this handicap by running comparatively high core speeds. The NV35 runs at a stock 450MHz core, whereas the Radeon 9800 Pro manages 380MHz. The 500MHz core of the NV30 is still unsurpassed, though. Bottom line: a relatively high single-texturing fillrate and a massive multi-texturing fillrate.

Bandwidth is your friend

All the pixel-pushing power in the world won't really benefit a GPU if it cannot be supplied with enough memory bandwidth. Now today's high-end cards employ both antialiasing and anisotropic filtering. The need for consistently high FPS with massive AA and AF is where it's at. The 5900 Ultra uses sophisticated bandwidth-saving techniques in the form of improved Z, colour, and texture compression. However, even with these in place, the bandwidth requirements of high FSAA and anisotropic filtering still exact a heavy toll on usable bandwidth.

You'll remember that although the FX5800 Ultra had mind-numbing 1GHz memory modules, its 128-bit memory bus (4 x 32-bit balanced controllers) only gave it 16GB/s of theoretical bandwidth. ATi, on the other hand, chose to go for a slower module speed with a 256-bit wide memory bus (4 x 64-bit). The Radeon 9800's theoretical bandwidth totalled around 21.7GB/s. The 5900 Ultra addresses the NV30's bus limitation by matching ATi's 256-bit interface. NVIDIA's reasoning for not implementing this on the NV30 lay with the complexities of marrying a 256-bit memory bus to a 0.13u manufacturing process. Therefore, even with 'only' 850MHz memory, the total bandwidth rises to an impressively high 27.2GB/s. The Achilles' heel of the NV30 was its lacklustre performance under heavy FSAA and AF load. The NV35's wider bus should rectify that.
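The bandwidth figures quoted above all fall out of one formula: effective memory clock multiplied by bus width in bytes. A quick sketch (the helper name is ours; the 9800 Pro's 680MHz effective clock is implied by its quoted 21.7GB/s on a 256-bit bus):

```python
# Theoretical bandwidth (GB/s) = effective memory clock (Hz) x bus width (bytes).
def bandwidth_gbs(effective_mhz, bus_bits):
    return effective_mhz * 1e6 * (bus_bits / 8) / 1e9

print(bandwidth_gbs(1000, 128))  # NV30 (FX5800 Ultra): 16.0 GB/s
print(bandwidth_gbs(680, 256))   # Radeon 9800 Pro: 21.76 GB/s
print(bandwidth_gbs(850, 256))   # NV35 (5900 Ultra): 27.2 GB/s
```

Doubling the bus width buys the NV35 far more than the 150MHz of module speed it gives up against the NV30.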

Anisotropic filtering and antialiasing

The NV35 also boasts what NVIDIA terms Intellisample HCT technology. NVIDIA claim that, when run in quality mode, anisotropic filtering is applied to all portions of the screen. This contrasts with ATi's implementation. Their anisotropic filtering algorithm not only decides which areas of the screen require filtering (improving texture quality on sloped surfaces), but also decides on the degree of filtering to be applied. The NV35 can also replicate ATi's selective filtering using balanced or performance modes. NVIDIA also claim to use a better sampling pattern than ATI's rectangular aniso' sampling. Both formats use trilinear filtering (8 texture samples) as standard. We don't really see the need to apply filtering to non-sloping surfaces. It's a little unclear as to whether the performance anisotropic setting drops down to bilinear filtering at any stage.

Antialiasing still uses the same multi-sampling method found on other cards. HCT technology adds to antialiasing performance by improving compression algorithms (colour, Z and texture) present in the NV35.

CineFX 2.0

NVIDIA were keen to exploit vertex and pixel shading as a determinant of overall performance. What they failed to inform the general public was just how powerful the vertex and pixel shaders were on the FX5800 Ultra. Now we hear that CineFX 2.0 harnesses twice the pixel shading power of its predecessor. So how much is this? Well, twice as much as a lot; we can't put an actual figure on 'a lot'.

Each company likes a flashy brand name for similar technology. ATI's 9800 Pro uses a stencil buffer to help minimise the workload incurred by calculating and drawing realistic shadows; a must for upcoming games. NVIDIA jump on the shadow bandwagon, under the name of CineFX 2.0, with their UltraShadow technology. This technology, they say, limits precious computational time to the areas of a scene that will be most affected by shadows; i.e. depth-bounds checking. In layman's terms, this simply means that the card's ability to draw realistic, lifelike shadows (both soft and hard) has been given a helping hardware hand. There's more to it than that, obviously, but think of it as speeding up the computationally-intensive task of drawing accurate shadows. Sounds rather similar to ATI's. The goal? To make games more realistic.
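The idea behind depth-bounds checking can be sketched loosely on the CPU: the hardware skips shadow-volume work for any fragment whose stored depth lies outside the depth range the shadow can possibly touch. A minimal illustration, with entirely hypothetical names and values:

```python
# Loose sketch of depth-bounds checking: shadow-volume stencil work is skipped
# for fragments whose depth-buffer value falls outside the shadow's depth range.
def needs_shadow_work(fragment_depth, z_min, z_max):
    """Only fragments inside [z_min, z_max] can be affected by this shadow volume."""
    return z_min <= fragment_depth <= z_max

# Hypothetical depth-buffer values; only the in-range ones get stencil updates.
depths = [0.10, 0.45, 0.62, 0.90]
touched = [d for d in depths if needs_shadow_work(d, 0.40, 0.70)]
print(touched)  # [0.45, 0.62]
```

The saving comes from rejecting out-of-range fragments before any expensive stencil or shading work is done on them.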

Digesting the above may take a while, so in summary, the NV35 is a 0.13u GPU with up to 256MB of super-fast on-board TinyBGA DDR-1 RAM. A 450MHz core, a 4x2 rendering formation, and a 256-bit memory interface feeding RAM running at 850MHz give us a) a single-texturing fillrate of 1800MPixels/s, b) a multi-texturing fillrate of 3600MTexels/s, and c) 27.2GB/s of bandwidth. Further, improved antialiasing and anisotropic filtering algorithms should provide excellent results at high resolutions and heavy loads. A doubling of vertex and pixel shading ability, which can have up to 128-bit precision (read lifelike), and hardware-assisted shadow creation should keep it at the technological forefront. Phew.

A table on the following page should highlight the basic specification and differences between the NV35, NV30, and R350 (Radeon 9800 Pro).