
NVIDIA GeForce 7900 GTX, 7900 GT and 7600 GT Preview

by Ryszard Sommefeldt on 9 March 2006, 14:05



Fragment Shading Performance

A modern fragment shader unit in a Shader Model 3.0 graphics processor is geared around ADD and MADD issue.

vec3 ADD rate

With our own in-house instruction rate test, we explore the limits of the fragment hardware on each GPU. Each chip is able to reach its theoretical peak, which for every GPU on test is the product of GPU clock, fragment unit count and two (one ADD instruction from each of the two sub-ALUs in every FP unit).
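As a rough illustration of that arithmetic, the sketch below multiplies clock, FP unit count and ADD issues per unit. The function name is hypothetical, and the clock speeds passed in (560MHz for 7600 GT, 650MHz for an R580 board) are assumed figures rather than values taken from this page; the unit counts follow from the text (12 FP units for 7600 GT, 96 sub-units and therefore 48 FP units for R580).

```python
def peak_vec3_add_rate(clock_mhz: int, fp_units: int, adds_per_unit: int = 2) -> int:
    """Theoretical peak, in millions of vec3 ADD instructions per second.

    adds_per_unit defaults to 2: one ADD from each of the two sub-ALUs
    in an FP unit, as described in the text.
    """
    return clock_mhz * fp_units * adds_per_unit


# Clock speeds below are assumptions for illustration only.
rate_7600_gt = peak_vec3_add_rate(clock_mhz=560, fp_units=12)
rate_r580 = peak_vec3_add_rate(clock_mhz=650, fp_units=48)

print(f"7600 GT peak: {rate_7600_gt} M ADD/s")
print(f"R580 peak:    {rate_r580} M ADD/s")
```

With those assumed clocks, the R580 figure comes out at several times the 7600 GT figure, which is the kind of gap the graph shows.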

With a count of 96 sub-units that are able to issue ADD instructions when not dependent on each other, the R580 SKUs are so far ahead of the others as to make it embarrassing for NVIDIA's products, and for ATI's own R520.

vec3 MADD rate

One of NVIDIA's architectural changes when creating G70 from their NV40 base was to make sub-unit 2 in the FP units able to issue a MADD instruction. NVIDIA argue that the MADD instruction is prevalent in shader code, and it also affords them revectorisation opportunities in their shader assembler (as it does ATI). Investigating MADD rate is therefore prudent.

None of the drivers tested predicate the shader (as they did in a previous version of our test!), but it's slightly disappointing not to be able to show any demonstrable repacking. All the hardware hits near its theoretical peak, and you can see the benefits of NVIDIA's FP ALU setup here. ATI can't issue a second MADD per cycle, which helps 7600 GT show a peak rate higher than the 650MHz ATI Radeon X1800 XT.
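The 7600 GT versus X1800 XT result can be checked on paper with a minimal sketch like the one below. The `Gpu` class and its field names are hypothetical. The 12 FP units, the 650MHz X1800 XT clock and the one-versus-two MADD issue rates come from the text; the 560MHz 7600 GT clock and the 16 fragment units for X1800 XT are assumed figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    clock_mhz: int
    fp_units: int
    madds_per_unit: int  # MADD instructions each FP unit can issue per clock

    def peak_madd_rate(self) -> int:
        """Theoretical peak, in millions of vec3 MADD instructions per second."""
        return self.clock_mhz * self.fp_units * self.madds_per_unit


# 560 MHz and 16 units are assumptions; the other figures are from the article.
geforce_7600_gt = Gpu("GeForce 7600 GT", clock_mhz=560, fp_units=12, madds_per_unit=2)
radeon_x1800_xt = Gpu("Radeon X1800 XT", clock_mhz=650, fp_units=16, madds_per_unit=1)

for gpu in (geforce_7600_gt, radeon_x1800_xt):
    print(f"{gpu.name}: {gpu.peak_madd_rate()} M MADD/s")
```

Under those assumptions the second MADD per FP unit more than makes up for the 7600 GT's lower clock and unit count in this one test.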

While the X1800 XT would outrun 7600 GT without any real problem, as we'll show you and as should be apparent if you've been following the theoretical analysis up to now, the 7600 GT does show off what a well-clocked 12 FP unit GPU is capable of with its particular setup.