3DMark and being a Gamer's Benchmark
Futuremark have given 3DMark the tag of Gamer's Benchmark since 3DMark03 launched. While the tag is an admirable one, given that 3DMark seeks to rank graphics hardware and predict or report its overall gaming performance, it's an increasingly difficult thing for them to live up to given the problems outlined on the previous page.
Hardware diversification in the DirectX 9.0 era has hurt that goal, especially with Shader Model 3.0. 3DMark06 was developed while only one vendor's Shader Model 3.0 parts were around. The second vendor has since caught up, but those parts don't comply with the Shader Model 3.0 specification in subtle ways. The result is that both vendors require vendor-specific workarounds, implemented outside of standard D3D, to cover what their hardware doesn't comply with or expose inside the D3D9 API.
Specifically, I'm talking about PCF (percentage-closer filtering) on NVIDIA parts, and vertex texturing via R2VB (render to vertex buffer) on ATI hardware.
Further, those parts offer substantially different performance profiles to those available from the other vendor, running the same shader code. While that's exactly the kind of thing that 3DMark06 should seek to measure, it doesn't do that inside the main benchmark and its graphics tests. Indeed, it doesn't really do that inside its feature tests either, which currently stand as the most attractive part of the benchmark.
When the 'Shader Model 3.0' parts available differ in the features they offer (per chip family too!), and in how those features have to be implemented inside and outside of D3D, engineering a 3D benchmark that purports to be unbiased is incredibly difficult.
When all of that is compounded by hardware that isn't even available while you develop, I wouldn't want Futuremark's problems.
Can you reasonably engineer a faithful and unbiased "Gamer's Benchmark" under those conditions? I'd argue that you reasonably can't. The compromises you have to make would be too polarised, and I think that's entirely visible with 3DMark06.
Let's analyse the specifics of how the conditions for 3DMark06's development, self-imposed or otherwise, make it a high-profile benchmark that ultimately misleads.