3D rendering performance
CINEBENCH R9.5, despite being able to use a full 16 threads, scales poorly with additional cores.
Even the Kentsfield - with its four cores, relatively low memory latency and higher theoretical memory bandwidth per core - sees only a 3.12x performance increase when going from one to four threads, or 0.78x per core.
The 16-core Tigerton systems struggle to reach even a 0.4x improvement per core: the slower L7345 only just achieves this, while the top-speed X7350 comes in at 0.38x per core.
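The per-core figures quoted here are simply the total speedup divided by the number of cores. As a minimal sketch of that arithmetic (the function name `per_core_scaling` is ours, chosen for illustration, and only figures stated in the text are used):

```python
def per_core_scaling(speedup: float, cores: int) -> float:
    """Return the speedup contributed per core (1.0 would be perfect linear scaling)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Kentsfield figure quoted in the text: 3.12x from four threads.
kentsfield = per_core_scaling(3.12, 4)
print(f"Kentsfield: {kentsfield:.2f}x per core")

# Working backwards: 0.38x per core across 16 cores implies the total speedup.
x7350_total = 0.38 * 16
print(f"X7350: about {x7350_total:.2f}x total")
```

Read this way, the X7350's 0.38x per core corresponds to a total speedup of roughly six times from sixteen cores.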
While this may look like a bottleneck somewhere in the system, the real cause may lie in the software.
In order to break the workload down into separate threads, CINEBENCH divides the scene vertically into sections, each one assigned to an individual CPU core. When a relatively fast section is completed, another incomplete section has its remaining work split in half, with a new thread spawned to work on one half. This continues until the image is completed.
However, these threads appear to have some setup time penalty, not starting instantly when a previous section has been completed. Additionally, if a section only has a few lines to complete then a new thread will not be started.
Together, these two factors mean the systems' CPUs are often not at 100% load throughout the test - even on the four-core Kentsfield system.
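To see how a setup penalty and a minimum split size can leave cores idle, here is a toy tick-based model of the behaviour described above. It is a sketch under our own assumptions, not CINEBENCH's actual scheduler: the function `simulate`, the parameters `setup_delay` and `min_split`, and the work figures are all hypothetical.

```python
def simulate(section_work, setup_delay=2, min_split=4):
    """Toy model of split-the-remainder scheduling; returns (ticks, utilisation).

    section_work: initial work units per section, one section per core.
    setup_delay:  assumed ticks a core waits before starting newly split work.
    min_split:    assumed smallest remainder that is still worth splitting.
    """
    cores = len(section_work)
    remaining = list(section_work)   # work units left on each core
    wait = [0] * cores               # setup ticks still owed by each core
    ticks = busy = 0
    while any(r > 0 for r in remaining):
        ticks += 1
        for i in range(cores):
            if wait[i] > 0:
                wait[i] -= 1         # paying the setup penalty, no work done
            elif remaining[i] > 0:
                remaining[i] -= 1    # one unit of useful work this tick
                busy += 1
        # Each idle core tries to take half of the largest unfinished section.
        for i in range(cores):
            if remaining[i] == 0 and wait[i] == 0:
                donor = max(range(cores), key=lambda j: remaining[j])
                if remaining[donor] >= min_split:
                    half = remaining[donor] // 2
                    remaining[donor] -= half
                    remaining[i] = half
                    wait[i] = setup_delay
                # Otherwise the remainder is too small and the core stays idle.
    return ticks, busy / (ticks * cores)


ticks, load = simulate([2, 20])      # one quick section, one slow section
print(f"{ticks} ticks at {load:.0%} average load")
```

In this model, two cores sharing 22 units of work need 13 ticks rather than the ideal 11. One core loses two ticks to the setup delay, and the other sits idle at the end because the last remainder is too small to split.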
Because of this, the eight-core Clovertown system is able to outpace the sixteen-core Tigerton L7345 processors, despite the L7345s boasting more raw grunt according to Sandra. CINEBENCH R9.5 simply cannot use the sixteen cores as efficiently as it can the Clovertown's eight.
As with its predecessor, CINEBENCH R10 does not scale linearly: moving from eight to 16 cores yields a 50% increase in performance from a near 100% increase in processing power.
Division of the workload is performed in the same way as in CINEBENCH R9.5 but the additional complexity in R10's scene ensures that the various sections all take longer to complete and, therefore, the CPUs spend more time at full load.
Even so, as with R9.5, the 3GHz Clovertown system is again able to outpace the 1.86GHz Tigertons.
POV-Ray represents a multi-threaded best-case scenario. The scene is split up into small tiles, allowing for maximum CPU utilisation throughout the test, while minimal interaction between the threads means that the Kentsfield is able to produce a 3.95x improvement - a near perfect 0.99x increase in performance from each core.
The Clovertown isn't far behind, managing 0.91x per core, for a total speed increase of 7.31x from eight cores.
The Tigertons produce their best results as well. The L7345 achieves a 13.38x performance increase - allowing it to beat the Clovertown by a reasonable margin. The X7350 sees this drop to 12.04x - 0.75x per core - hinting that, maybe, performance is being limited by memory or another sub-system on this high-speed part.
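The benefit of small tiles can be illustrated with a simple greedy model in which each tile goes to whichever core becomes free first. This is a sketch of the general idea only, not POV-Ray's actual scheduler; the function `utilisation` and the tile costs are made up for illustration.

```python
import heapq


def utilisation(tile_costs, cores):
    """Greedy scheduling: hand each tile to whichever core frees up first.

    Returns the fraction of total core time spent doing useful work.
    """
    free_at = [0.0] * cores          # time at which each core becomes idle
    heapq.heapify(free_at)
    for cost in tile_costs:
        start = heapq.heappop(free_at)            # earliest available core
        heapq.heappush(free_at, start + cost)
    makespan = max(free_at)          # the render ends when the last core finishes
    return sum(tile_costs) / (makespan * cores)


# The same 64 units of work, cut coarsely and finely (hypothetical costs).
coarse = [28, 4, 4, 28]              # four uneven sections, one per core
fine = [7, 1, 1, 7] * 4              # sixteen smaller tiles

print(f"coarse sections: {utilisation(coarse, 4):.0%} load")
print(f"small tiles:     {utilisation(fine, 4):.0%} load")
```

With the same total work, the finer split keeps the cores noticeably busier in this model, because a core that finishes a cheap tile can immediately pick up another rather than waiting on the slowest section.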