NVIDIA's PhysX being hobbled on CPUs?
by Pete Mason
on 9 July 2010, 08:51
Living in the past
While PhysX can be run on CPUs, NVIDIA claims a two- to fourfold performance increase when it runs on one of its GPUs. This is why hardware physics acceleration on a graphics card allows a greater number of more complex objects to appear in games. However, a detailed report from Real World Technologies has shown that when the physics code is run on a CPU, it relies on the x87 instruction set. While these instructions are still supported by modern processors, chip manufacturers have been discouraging developers from using them for almost a decade in favour of SSE2, which is much more efficient.
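To give a rough idea of what the difference looks like in code, here is a minimal sketch in C. It is not taken from PhysX or from the report: the function names, the simple position update and the `HAVE_SSE` macro are all made up for illustration. The first routine handles one float per loop iteration, which is the shape of code a compiler targeting x87 would turn into one operation at a time. The second uses the packed single-precision SSE intrinsics from `<xmmintrin.h>` (available on any SSE2-capable processor) to process four floats per instruction, and falls back to the plain loop where SSE isn't available.

```c
#include <stddef.h>

/* Hypothetical feature check: only pull in the intrinsics on SSE-capable targets. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Scalar update: pos[i] += vel[i] * dt, one float per iteration. */
void integrate_scalar(float *pos, const float *vel, float dt, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        pos[i] += vel[i] * dt;
}

/* Same update using packed SSE: four floats per iteration where possible. */
void integrate_sse(float *pos, const float *vel, float dt, size_t n)
{
    size_t i = 0;
#ifdef HAVE_SSE
    const __m128 vdt = _mm_set1_ps(dt);          /* dt copied into all four lanes */
    for (; i + 4 <= n; i += 4) {
        __m128 p = _mm_loadu_ps(pos + i);        /* unaligned load of 4 positions */
        __m128 v = _mm_loadu_ps(vel + i);        /* unaligned load of 4 velocities */
        p = _mm_add_ps(p, _mm_mul_ps(v, vdt));   /* p += v * dt in all lanes */
        _mm_storeu_ps(pos + i, p);
    }
#endif
    /* Leftover elements, or the whole array when SSE is unavailable. */
    for (; i < n; ++i)
        pos[i] += vel[i] * dt;
}
```

Whether the scalar loop actually ends up as x87 code depends on the compiler and its settings. 32-bit x86 builds have traditionally defaulted to x87, while 64-bit builds use SSE for floating point as a matter of course.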
According to the author, the newer CPU-based instructions could perform the same physics calculations at least twice as fast. This, combined with the fact that PhysX middleware tends to be single-threaded on CPUs – despite being massively multi-threaded when run on graphics cards – means that the performance delta wouldn’t be nearly as wide if the code were to be recompiled. While it may seem like a colossal task to rewrite the code to use SSE2, the report suggests that, with a bit of diligence, it could be updated in “about a day or two”. This means that updated, multithreaded code running on a modern quad- or hexa-core processor could, in theory, come close to matching physics acceleration by a GPU.
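The arithmetic behind that last claim can be sketched in a few lines of C. This is a back-of-the-envelope model, not a benchmark: the function name and the `parallel_fraction` parameter are assumptions introduced here, and the only inputs drawn from the article are the "at least twice as fast" figure for SSE and the quad- or hexa-core counts. Threading gains are modelled with Amdahl's law, so any part of the work that can't be spread across cores limits the result.

```c
/* Estimated speedup over a single-threaded x87 baseline (illustrative model only).
 *   simd_gain         - assumed gain from moving x87 code to SSE (the report says at least 2)
 *   cores             - number of CPU cores the work is spread across
 *   parallel_fraction - hypothetical share of the work that threads well (0.0 to 1.0)
 */
double estimate_cpu_speedup(double simd_gain, int cores, double parallel_fraction)
{
    double serial = 1.0 - parallel_fraction;
    /* Amdahl's law: the serial part does not shrink as cores are added. */
    double threaded = 1.0 / (serial + parallel_fraction / (double)cores);
    return simd_gain * threaded;
}
```

With perfect scaling, `estimate_cpu_speedup(2.0, 4, 1.0)` works out to 8, comfortably above the two-to-four times advantage NVIDIA claims for its GPUs. If only 80 per cent of the work threads cleanly, the same quad-core estimate drops to 5. Real games would land wherever their actual workload puts them, so these figures are only a guide to why the gap could narrow.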
If it ain’t broke...
So why would NVIDIA continue to use outdated instructions? There don’t seem to be any valid practical reasons – such as maintaining backward compatibility or a need for increased precision – so the report draws the only conclusion left: “PhysX uses x87 because Ageia and now Nvidia want it that way”.
“In the case of PhysX on the CPU, there are no significant extra costs (and frankly supporting SSE is easier than x87 anyway). For Nvidia, decreasing the baseline CPU performance by using x87 instructions and a single thread makes GPUs look better.”
Unfortunately, this just isn’t surprising. PhysX is a major value-add for the GeForce range of cards and a lot of that value would be removed if a powerful processor could do the job just as well. Given the choice of maintaining the performance gap by leaving the code as it is, or recompiling the code and losing the advantage, it becomes clear which the graphics card manufacturer will favour. While this is a real shame for consumers, it makes perfect commercial sense.
For those who like their numbers or who want to get into the detailed analysis, the report is definitely worth a read. Just try not to get frustrated when you realise that your multi-core processor could probably rip through those physics calculations without breaking a sweat.