Technology improvements
The differences between the various cards, and between the two Radeons in particular, are best shown in a table. I'll discuss the most pertinent points below.
Specification | Radeon 9800 Pro | Radeon 9700 Pro | GeForce FX Ultra |
Manufacturing process | 0.15 micron | 0.15 micron | 0.13 micron |
VPU Speed | 380MHz | 325MHz | 500MHz |
Memory speed | 680MHz DDR-I | 620MHz DDR-I | 1000MHz DDR-II |
Memory interface | 256-bit | 256-bit | 128-bit |
Memory Bandwidth | 21.8GB/s | 19.8GB/s | 16GB/s |
Triangle Throughput | 380 MT/s | 325 MT/s | 350 MT/s |
Pixel Fillrate | 3.04 GP/s | 2.6 GP/s | 4 GP/s |
AA Fillrate | 18.24 Billion/s | 15.6 Billion/s | 16 Billion/s |
Rendering Pipelines | 8 | 8 | 4* |
Textures Per Pipe | 1 | 1 | 2 |
Vertex Shader | 2.0+ | 2.0 | 2.0+ |
Vertex Shaders | 4 | 4 | Floating Point Array |
Vertex instructions | 65,280 | 1,024 | 65,280 |
Pixel Shader | 2.0 (F-buffer) | 2.0 | 2.0+ |
Pixel instructions | Unlimited | 64 ? | 1,024 |
Pixel precision | 96-bit (4 x 24-bit)? | 96-bit (4 x 24-bit) | 128-bit (4 x 32-bit) |
FSAA | 6x | 6x | 8x |
FSAA Method | Multisampling | Multisampling | Multisampling |
Bandwidth saving | HyperZ III+ | HyperZ III | LMA III |
Image enhancement | SmoothVision 2.1 | SmoothVision 2.0 | Intellisample |
AGP rates | 1x/2x/4x/8x | 1x/2x/4x/8x | 1x/2x/4x/8x |
Connections | TV-Out, VGA, DVI | TV-Out, VGA, DVI | TV-Out, VGA, DVI |
Display | 2x 400MHz DACs | 2x 400MHz DACs | 2x 400MHz DACs |
* - It's recently come to light that the GeForce FX, in certain circumstances, may run in a 4 x 2 formation.
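Several of the Radeon rows in the table are simple products of the others, and it can help to see the arithmetic. The sketch below recomputes memory bandwidth, pixel fillrate and AA fillrate from the clock, bus width, pipeline count and maximum FSAA sample count listed above. The function names are my own, and the formulas are the usual back-of-envelope ones rather than anything published by ATi or NVIDIA. The GeForce FX fillrate figures depend on how its pipelines are counted (see the footnote), so only its bandwidth is reproduced here.

```python
# Illustrative arithmetic only; helper names are hypothetical.
# Input values are copied from the specification table above.

def memory_bandwidth_gbs(mem_mhz, bus_bits):
    """Effective memory clock (MHz) times bus width in bytes, in GB/s (10^9 bytes)."""
    return mem_mhz * (bus_bits / 8) / 1000


def pixel_fillrate_gps(core_mhz, pipelines):
    """Core clock (MHz) times pixels written per clock, in gigapixels per second."""
    return core_mhz * pipelines / 1000


def aa_fillrate_gps(core_mhz, pipelines, samples):
    """Pixel fillrate times the number of anti-aliasing samples per pixel."""
    return pixel_fillrate_gps(core_mhz, pipelines) * samples


radeons = {
    "Radeon 9800 Pro": {"core": 380, "mem": 680, "bus": 256, "pipes": 8, "aa": 6},
    "Radeon 9700 Pro": {"core": 325, "mem": 620, "bus": 256, "pipes": 8, "aa": 6},
}

for name, c in radeons.items():
    print(name,
          round(memory_bandwidth_gbs(c["mem"], c["bus"]), 2), "GB/s,",
          round(pixel_fillrate_gps(c["core"], c["pipes"]), 2), "GP/s,",
          round(aa_fillrate_gps(c["core"], c["pipes"], c["aa"]), 2), "billion AA samples/s")

# GeForce FX Ultra: 1000MHz effective DDR-II on a 128-bit bus.
print("GeForce FX Ultra", memory_bandwidth_gbs(1000, 128), "GB/s")
```

The narrower 128-bit bus is why the GeForce FX ends up with less bandwidth than either Radeon despite its much faster memory.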
A close look at the detailed specifications shows a number of new features in the Radeon 9800 Pro that aren't in the Radeon 9700 Pro. I'll go through them, keeping the explanations as layman-friendly as I can.
F-Buffer
SMARTSHADER, ATi's in-house name for its pixel and vertex processing, sees an upgrade from 2.0 to 2.1. NVIDIA stole a march on ATi with the complex pixel shader support on the NV30. Able to execute a pixel shader with far more operations (more complex and therefore more life-like) in a single pass than the Radeon 9700 Pro, it could render shaders hundreds of instructions long without having to go back to the frame buffer, as the R300 would once the shader became long enough. This, though, would only be beneficial in situations where developers actually write such complex shaders. ATi, not to be outdone on this front, have added what they term an F-buffer. Put simply, it's a hardware addition that gives the R350 the ability to run a pixel shader with an unlimited number of instructions without having to go back to the frame buffer and multi-pass. I'm sure that ATi will heavily tout this technology with a DirectX9++ tag.
Pixel shaders complicated and long enough to make the present Radeon 9700 Pro re-read and multi-pass, probably compiled through OpenGL or a higher-level language, are still beyond the DirectX9 specification. We're looking at OpenGL 2.0 to support such instructions. Still, it's nice to know that the Radeon 9800 Pro has this angle covered, whenever it may arise. We're working our way, on the hardware level, towards the complexity and quality of Pixar's RenderMan shaders, the ones used in their animated feature films. Vertex shading abilities have also taken a turn for the better, with up to 65,280 instructions, up from 1,024 on the Radeon 9700 Pro. I think ATi are simply trying to ensure that their very latest hardware gives games developers the best possible chance of creating far more realistic environments, as both advanced pixel and vertex shading allow the developer to write code that far more closely represents what our own eyes see in the real world. If you've seen the lighting, reflections and textures in some of the demos, you'll know that cinematic-quality rendering in real time is the ultimate goal.
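To picture why a longer single-pass limit matters, the sketch below counts how many passes a shader of a given length would need if the hardware can only run a fixed number of instructions before writing back to the frame buffer. The function name and the 200-instruction shader are illustrative assumptions; real drivers split shaders along dependency boundaries rather than by simple division, and the table's 64-instruction figure for the Radeon 9700 Pro is itself marked as uncertain.

```python
import math

# Hypothetical helper: a rough pass count, not how any real driver splits shaders.
def passes_needed(shader_instructions, per_pass_limit=None):
    """Return the number of passes for a shader of the given length.

    per_pass_limit=None models hardware with no practical limit,
    which is how the F-buffer is described above.
    """
    if shader_instructions < 1:
        raise ValueError("a shader needs at least one instruction")
    if per_pass_limit is None:
        return 1
    return math.ceil(shader_instructions / per_pass_limit)


# An assumed 200-instruction shader on the three limits listed in the table.
print("Radeon 9700 Pro (64?):", passes_needed(200, 64))
print("GeForce FX (1,024):", passes_needed(200, 1024))
print("Radeon 9800 Pro (unlimited):", passes_needed(200, None))
```

Every extra pass means another trip to the frame buffer, which is exactly the bandwidth cost the longer limits are meant to avoid.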
SMOOTHVISION 2.1
Perhaps more important for us right here, right now is the refinement of ATi's image enhancement features, which come under the collective name of SMOOTHVISION. We already know that the Radeon 9700 Pro is an efficient performer once anti-aliasing (almost a random multi-sampling method) and adaptive anisotropic filtering are applied to games. The new SMOOTHVISION 2.1 purportedly improves on the present SMOOTHVISION 2.0 by making the process that much more efficient, resulting in higher framerates. ATi claim that SMOOTHVISION 2.1 uses a heavily tuned memory controller that's more efficient once higher degrees of anti-aliasing and anisotropic filtering are applied. I'll put this assertion to the test later.
HYPERZ III+
One of the largest bandwidth eaters is rendering what we don't see in any given image. The cost of un-eliminated overdraw can make even these 20GB/s+ bandwidth monsters crawl. ATi's present collective term for saving memory bandwidth, HyperZ III, sees an upgrade too. With games becoming more complex every year, the use of shadows to mimic real-life behaviour is slowly becoming more prevalent. The improved HyperZ III+ more effectively uses a stencil buffer (used for shadows) to compute whether an object will fall in the shadow of another object. Computing the movement of shadows and their effect on surrounding objects, and then discarding what won't be seen at any given time, is a taxing business. The new HyperZ III+ should be better equipped to deal with it in forthcoming games. We'll probably have to wait until Doom3, with its reliance on precision lighting and realistic shadows, to verify just how well it works.
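The cost of overdraw is easy to show with a toy depth test. The sketch below is not a model of HyperZ, whose internals ATi haven't detailed here. It simply counts how many fragments get shaded when hidden ones are rejected before shading, against when every submitted fragment is shaded. The function name, the fragment list and the "smaller depth is nearer" convention are all assumptions for illustration.

```python
# Toy overdraw counter; names and data are hypothetical.
def shaded_fragments(fragments, early_reject=True):
    """Count the fragments that would be shaded.

    fragments: iterable of (pixel_id, depth) in submission order,
    where a smaller depth means nearer to the viewer.
    """
    nearest = {}  # nearest depth seen so far for each pixel
    shaded = 0
    for pixel, depth in fragments:
        visible = depth < nearest.get(pixel, float("inf"))
        if visible:
            nearest[pixel] = depth
        # Without early rejection, hidden fragments still cost shading work.
        if visible or not early_reject:
            shaded += 1
    return shaded


# Three surfaces stacked over pixel 0 (nearest drawn first), one over pixel 1.
scene = [(0, 0.2), (0, 0.5), (0, 0.9), (1, 0.4)]
print("with rejection:", shaded_fragments(scene, early_reject=True))
print("without rejection:", shaded_fragments(scene, early_reject=False))
```

Note that the saving depends on draw order: if the same surfaces arrive back to front, each one passes the test when it is drawn and nothing is rejected.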
To quickly summarise, if all of that is a little too much: the Radeon 9800 Pro relieves the burden placed on the present Radeon 9700 Pro by complicated and lengthy pixel shaders through the use of a proprietary F-buffer, accommodating shaders of unlimited length. Anti-aliasing and anisotropic filtering performance has been further improved via an enhanced memory controller, and the upgraded bandwidth-saving HyperZ function now copes better with shadows and the overdraw problems they can cause. ATi also claim that the 9800 Pro can now do displacement mapping, floating point cube maps, and floating point 3D textures in hardware.
More image enhancement discussion
Anti-aliasing still uses the same edge-only approach as found on the Radeon 9700 Pro. The card samples pixels to see if they contain more than one polygon (an edge). If they do, the card tries to blend in the colours surrounding the edge, thereby blurring the jagged-edge stair-like effect. The number of samples that the Radeon 9800 Pro can take at the sub-pixel level is still a maximum of 6 on an enhanced rotated grid approach. The reason why multisampling is so comparatively efficient is that it only samples colours on the edge. The colours inside aren't sampled, and are given the same colour as the original pixel. The 9800 Pro, much like the 9700 series, can also use its bandwidth-saving Z-buffer (compression) for the internal pixels. That's probably how ATi arrive at their colour compression ratio of 6:1.
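The edge-only behaviour described above can be sketched in a few lines. Each pixel carries a set of sub-pixel samples; if they all hold the same colour the pixel is interior and is left alone, otherwise the samples are averaged. The second helper illustrates my guess at the 6:1 figure, namely that an interior pixel only needs one colour stored for its six samples. Both function names are hypothetical, and the sketch ignores the sample positions and gamma correction that the real hardware applies.

```python
# Simplified multisample resolve; helper names are hypothetical.
def resolve_pixel(sample_colours):
    """Blend a pixel's sub-pixel samples (RGB tuples) into one colour."""
    first = sample_colours[0]
    if all(colour == first for colour in sample_colours):
        # Interior pixel: one polygon covers every sample, so nothing to blend.
        return first
    count = len(sample_colours)
    return tuple(sum(colour[i] for colour in sample_colours) / count
                 for i in range(3))


def compression_ratio(sample_colours):
    """Samples per pixel divided by the distinct colours that need storing."""
    return len(sample_colours) / len(set(sample_colours))


white, black = (255, 255, 255), (0, 0, 0)
interior = [white] * 6              # fully inside one polygon
edge = [white] * 4 + [black] * 2    # an edge crosses the pixel

print(resolve_pixel(interior), compression_ratio(interior))
print(resolve_pixel(edge), compression_ratio(edge))
```

With six samples, an interior pixel stores one colour for six samples, while the edge pixel in this example needs two and blends to a mid-grey.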
Anisotropic filtering retains the adaptive approach. In a nutshell, this means that the card intelligently guesses at which level anisotropic filtering should be applied to certain surfaces. If a particular surface, due to the way it would be viewed, cannot take advantage of, say, 16-tap anisotropic filtering, the card won't apply it. Standard trilinear filtering and anisotropic filtering can be applied simultaneously.
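ATi don't publish the heuristic their adaptive filtering uses, so the sketch below is only one plausible way of choosing a level: take the ratio of the long side to the short side of a pixel's footprint on the texture, and apply no more filtering than that ratio calls for, capped at the card's maximum. The function name, its parameters and the sample footprints are assumptions for illustration.

```python
import math

# Hypothetical selection rule; not ATi's actual algorithm.
def anisotropy_level(footprint_major, footprint_minor, max_level=16):
    """Pick a filtering level from the stretch of a pixel's texture footprint.

    footprint_major / footprint_minor: lengths of the long and short
    sides of the area the pixel covers on the texture, in texels.
    """
    if footprint_minor <= 0 or footprint_major < footprint_minor:
        raise ValueError("expected footprint_major >= footprint_minor > 0")
    ratio = footprint_major / footprint_minor
    return min(max_level, math.ceil(ratio))


# A wall facing the camera, a floor at a moderate angle, a floor near the horizon.
print(anisotropy_level(1.0, 1.0))
print(anisotropy_level(3.5, 1.0))
print(anisotropy_level(40.0, 1.0))
```

A surface viewed head-on gets a level of 1, meaning no anisotropic filtering is spent on it, which is the saving the adaptive approach is after.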
Almost 22GB/s of bandwidth, AGP 8x support, SMOOTHVISION gamma correction, Truform 2.0 support, and a whole host of other features should make it something of a stellar performer. Add to that the resident VIDEOSHADER and FULLSTREAM technologies, which ensure it remains a multimedia solution. We'll look at performance soon enough.