Texture Performance
One of the major facets of a modern graphics accelerator that defines 3D performance is how well the chip can texture. Mainly a function of clock speed and the address and sampler units (along with the caches, the driver and more), the ability to read from and write to a texture (in a range of formats, compressed in a range of ways) is crucial to almost all rendering techniques working, and working well.
Multitexture Fillrate
We first started measuring multitexturing rates in the ATI Radeon X1900 XT and XTX review, and then we saw the NVIDIA GeForce 7800 GTX 512 maintaining its fillrate as texture layers increase, and we see the new chips doing the same. The 7600 GT falls off a bit harder with 4 layers, but in general all the NVIDIA SKUs hold on to their texturing performance better.
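For readers who want to see how a multitexture fillrate figure can be derived from a timed run, here is a minimal sketch. It is not the test used for the results above. It assumes a full-screen pass with no overdraw where every pixel samples each texture layer exactly once, and the function names and example figures are hypothetical.

```python
def texel_rate(width, height, layers, frames, seconds):
    """Texels sampled per second for a full-screen multitexture pass.

    Assumes one sample per layer per pixel and no overdraw.
    """
    pixels_per_frame = width * height
    texels_per_frame = pixels_per_frame * layers
    return texels_per_frame * frames / seconds


def retention(rate_at_layers, rate_at_one_layer):
    """Fraction of the single-layer texel rate still achieved with more layers."""
    return rate_at_layers / rate_at_one_layer


# Hypothetical run: 1024x768, 4 layers, 1000 frames rendered in 2 seconds.
example_rate = texel_rate(1024, 768, 4, 1000, 2.0)
```

A chip that "maintains its fillrate" in this sense would show a retention value close to 1.0 as the layer count goes up.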
Floating Point Texture Bandwidth
We use the same texture read test as with the R580 analysis, asking the hardware to read sequentially from an FP16 surface. As HDR rendering and general purpose programming become more usable on the GPU, FP texture read (and write) performance becomes more important. The test samples from a texture that doesn't fit into the texture cache.
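To put FP16 read rates into bandwidth terms, the sketch below converts a texel rate into bytes per second. An FP16 channel is 2 bytes; the four-channel surface is an assumption, since the channel count of the test surface isn't stated here, and the function names are hypothetical.

```python
FP16_BYTES_PER_CHANNEL = 2  # a half-precision float is 16 bits


def fp16_texel_bytes(channels=4):
    """Size in bytes of one FP16 texel (4 channels assumed by default)."""
    return channels * FP16_BYTES_PER_CHANNEL


def read_bandwidth_gb_s(texels_per_second, channels=4):
    """Texture read bandwidth in GB/s (1 GB = 1e9 bytes) for a given texel rate."""
    return texels_per_second * fp16_texel_bytes(channels) / 1e9
```

Under that assumption each texel is 8 bytes, twice the size of a 32-bit integer texel, which is why a surface that spills out of the texture cache leans so heavily on the path to memory.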
G71 is better than G70 by some distance, despite there apparently being no cache- or sampler-specific work done for the chip's technology refresh. Notice the difference between the two G71 SKUs, indicating performance is scaling with GPU core clock after a certain point.
Texture Fetch Latency
With small textures (the entire texture fits in the local texture cache of all the chips on test), the NVIDIA hardware is ahead.
As texture size increases, R520 and R580 hang on to their cached performance better, while NVIDIA's chips fall back. G71 is faster than G70 at the same clocks, regardless of memory bandwidth available.