Review: Futuremark 3DMark06 Analysis

by Ryszard Sommefeldt on 4 March 2006, 10:14

Thoughts

It's quite clear that 3DMark06 in its current state is far from perfect, and I think the blame for that lies largely with Futuremark themselves. They had no choice but to develop on NVIDIA's hardware, though, given the heavy SM3.0 focus of the work they put into it. ATI's hardware wasn't around, making Futuremark's choice to forge ahead on one vendor's hardware unavoidable.

That's entirely the fault of ATI having to ship late. However, the position Futuremark put themselves in by rigidly freezing how the benchmark works after shipping isn't that of the game developers they seek to emulate. Game developers will be working with that hardware now to incorporate its specific features and performance profiles into their games, where needed. Futuremark did a modicum of work days before launch to get Fetch4 in, and only after ATI put pressure on them to do so, and that was that.

A games developer can patch at the request of an IHV, all going well with the publisher and their QA team. However, an IHV won't get anywhere asking Futuremark to add vendor-specific enhancements to 3DMark06 after it ships.

Futuremark, by locking 3DMark down to keep the ORB concept working, don't get the luxury of adapting to a changing landscape of hardware, developer awareness and new rendering techniques. It's not an organic development process after it ships to the consumer.

Indeed, it doesn't even seem like an organic process during development! Speaking to both NVIDIA and ATI with regards to their input to 3DMark06's development, it seems like both had comparatively little influence over how it turned out. Both paid their BDP fees, supplied hardware when able and offered developer relations support. But how much of that devrel support Futuremark actually took up seems questionable and, on the surface of it at least, minimal.

BDP membership was opened at the beginning of the development cycle, IHVs joined, and future directions for hardware and rendering techniques were discussed. Then, it seems, Futuremark closed up shop in order to get the 05 evolution out of the door in time to meet a ship date.

Beta 2 was released very close to the gold release, giving none of the IHVs any real time to get in any specific fixes, adjustments or large changes before ship day. And since release it seems like what didn't make its way in during that short window will remain out of the benchmark completely.

In short, it seems that Futuremark had to engineer 06 quickly and meet a specific ship date, compromising not only the benchmark, but also Futuremark's prior stated goals and aims for their benchmark software. Locking down development of a major public 3D graphics project of the scope and influence a new 3DMark has, both before and after shipping, is precisely what prompted this article's creation.

It hasn't resulted in a benchmark that remains fair to their whitepaper statements of no bias and the highest quality they're able to produce. BDP interaction was at a minimum, clearly, and the technical and philosophical ramifications that's had on 3DMark06 are obvious.

That brings me back full circle to the influence a 3DMark product has. That lofty position of influence only works if the benchmark does, I wrote in the introduction to this piece. The benchmark doesn't in its current state, yet it seems likely the influence will continue.

That millions of dollars will be spent based on something as flawed as an 06 score seems a shame, and most of all for the consumer. The reliance on one number to barter for product prices, for boards to be placed into SKUs that cut corners enough as it is, is sad.

What can they do, going forward?

With Futuremark pushing the Gamer's Benchmark mantra more than they ever have, engineering 3DMark more in line with how a game developer engineers a game leaves Futuremark in a curious position.

It's my opinion that Futuremark can't reasonably expect to engineer a benchmark that acts just as a game does, without actually engineering a game itself. Their snapshot attempts at guessing what's coming can only work in a game scenario if they include far more tests and rendering techniques, and far more content than they ever have. To cover all those bases would require massive amounts of incredibly expensive work.

It would also require the BDP and IHV work during development they apparently didn't do this time around, and an organic way of working with post-shipping patch development and the unshiftable ORB. Closer and more flexible work with the IHVs and participating Media BDP members would have seen 06 come out much better off.

The reason websites and print publications get hardware evaluation more right than any single benchmark does is that they use as many real-world shipping games as they reasonably can, and benchmark in as many ways as they can in the scenarios they think work best, pairing all of that with theoretical testing work in order to get the final picture. That procedure tests a multitude of games and game content, in multiple scenarios on lots of hardware, using multiple rendering techniques with differing CPU interaction. It's all about the size of the data set used to make the judgement.

Reliance on one single test that's so fixed after shipping never works when trying to emulate that situation, but 3DMark is positioned as being that One Test To Rule Them All. It deviates too far from being a tight, focussed, impressive-looking and taxing theoretical with minimal code path management, stuck to one API. It worked better, for me personally, when they engineered it like that.

It's a pain in my perky manbreasts that I undertake writing about something as touchy as a new 3DMark in this fashion, but it works if it makes people think about how 3DMark sustains Futuremark (perception wise and financially), and makes Futuremark think about how they use that kudos and money to sustain 3DMark in turn.

Fingers crossed they're entirely back on track with the D3D10 version, and fingers crossed the people who take the score as some shining gospel influence back off a touch to let them do so.


HEXUS Forums :: 3 Comments

Was FM consulted on the technical critique, particularly their curious decisions as to where it was appropriate to spend dev time on vendor-specific optimizations and where not? If so, did they not care to comment?

during the course of creating this HEXUS article - ATi Technologies, Futuremark and NVIDIA were all offered the opportunity to contribute their thoughts and, as is the HEXUS policy, to also submit a formal response under the HEXUS.right2reply initiative.

my understanding is that all of these companies did helpfully engage with HEXUS during our preparation – and for this we’re very grateful - but, to date, only ATi Technologies has submitted formal written responses.

at the time of writing NVIDIA seems to have chosen not to respond to our invitation to submit its HEXUS.right2reply or comment openly on the published article, and the same may be true of Futuremark, but i’ve not personally been involved with correspondence with Futuremark so far.

we will of course update the article with any appropriate formal responses that are received.

cheers,

PD

Will the ATI response be up soon?

Regards

Andy