GCN to the APU
The larger update for Kaveri comes by way of improved graphics that use the GCN architecture. The move brings consistency across AMD's GPU product stack: Kaveri harnesses the same GCN architecture as the Hawaii-based Radeon R9 290(X) GPUs.
The top-line APU's graphics can legitimately be thought of as one-quarter of those found in the muscular 290X part - Kaveri uses eight Compute Units (CUs), each home to 64 shaders. Examining the composition in detail shows that each CU carries four 16-wide SIMD vector blocks, plus a scalar unit and registers in the middle, and a general, per-core scheduler at the top.
A total of 512 shaders and the use of the GCN architecture offer peak graphics performance that is up to 25 per cent faster than on the Richland APU, according to AMD, though the average frame-rate benefit is likely to be closer to 15 per cent once the slower GPU clock of 720MHz is taken into account. Looking outside the GPU block, Kaveri also bakes in AMD's TrueAudio technology and the latest UVD block.
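The shader arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the unit counts and the 720MHz clock come from the text, while the figure of two floating-point operations per shader per clock (one fused multiply-add) is our own assumption, so the resulting peak number should be read as a theoretical ceiling rather than a measured result.

```python
# Back-of-the-envelope GPU figures for the top Kaveri part.
# Unit counts and clock come from the article text.
# FLOPS_PER_SHADER_PER_CLOCK is an assumption, not an AMD-quoted figure here.

SIMD_UNITS_PER_CU = 4           # four SIMD vector blocks per Compute Unit
LANES_PER_SIMD = 16             # each SIMD block is 16 wide
COMPUTE_UNITS = 8               # top-line Kaveri APU
GPU_CLOCK_MHZ = 720             # GPU clock quoted in the article
FLOPS_PER_SHADER_PER_CLOCK = 2  # assumption: one fused multiply-add per cycle


def shaders_per_cu() -> int:
    """Shaders in one Compute Unit: SIMD blocks times lanes per block."""
    return SIMD_UNITS_PER_CU * LANES_PER_SIMD


def total_shaders(compute_units: int = COMPUTE_UNITS) -> int:
    """Shaders across the whole GPU block."""
    return compute_units * shaders_per_cu()


def peak_gflops(compute_units: int = COMPUTE_UNITS,
                clock_mhz: float = GPU_CLOCK_MHZ) -> float:
    """Theoretical single-precision peak in GFLOPS under the FMA assumption."""
    ops_per_clock = total_shaders(compute_units) * FLOPS_PER_SHADER_PER_CLOCK
    return ops_per_clock * clock_mhz / 1000.0


if __name__ == "__main__":
    print("Shaders per CU:", shaders_per_cu())
    print("Total shaders:", total_shaders())
    print("Theoretical peak GFLOPS:", round(peak_gflops(), 2))
```

Passing a different CU count or clock to `peak_gflops` gives the equivalent ceiling for other configurations, under the same assumption.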
But there is a key addition for Kaveri over discrete GPUs using the Hawaii architecture: shared coherent unified memory. Memory coherency is a means of keeping a shared memory pool up to date when more than one processor is working from it, ensuring that data remains current when one core makes changes that would otherwise be invisible to the others. Kaveri's trick is in having full memory coherency between the GPU and CPU cores for the first time on an APU. This leads us nicely on to Heterogeneous System Architecture (HSA).
HSA - tying it all together
Changes made to the last-generation Trinity architecture provided a means by which the CPU and GPU cores could work far more closely than with Llano. AMD's engineers added an input/output memory management unit (IOMMU) that enabled the GPU portion of the chip to access the virtual address space, laying the foundation for sharing virtual addresses with the CPU.
Kaveri takes this sharing a step or two further. First, it adds a second bus between the IOMMU and GPU, thereby offering the above-mentioned coherency between the CPU and GPU. Second, it adds a feature called system-level atomics, whose job is to synchronise workloads between all the cores, be they CPU or GPU.
Memory coherency and synchronisation - the two buzzwords for HSA compliance - are needed to make the GPU portion of the APU an equal partner to the CPU. For example, under HSA, the GPU doesn't need to have data copied before accessing it, can now access the same address space, and can take a peek at what the CPU is working on.
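The difference between copying data and sharing an address space can be illustrated with a loose analogy in plain Python. Nothing here touches real HSA hardware or any AMD API: the function names are hypothetical, and a `bytearray` with a `memoryview` merely stands in for memory that both processors can see.

```python
# Analogy only: ordinary Python buffers standing in for CPU and GPU memory.
# The function names are hypothetical and no HSA or AMD API is involved.


def copy_model(cpu_data: bytearray) -> bytes:
    """Copy-based style: the 'GPU' works on a private copy of the data."""
    gpu_copy = bytes(cpu_data)  # explicit copy into a separate buffer
    cpu_data[0] = 99            # a later write by the 'CPU'...
    return gpu_copy             # ...does not show up in the GPU's copy


def shared_model(cpu_data: bytearray) -> memoryview:
    """Shared style: the 'GPU' holds a view onto the same bytes, no copy."""
    gpu_view = memoryview(cpu_data)  # same underlying memory
    cpu_data[0] = 99                 # a later write by the 'CPU'...
    return gpu_view                  # ...is visible through the view


if __name__ == "__main__":
    print("Copy model sees:", copy_model(bytearray([1, 2, 3]))[0])
    print("Shared model sees:", shared_model(bytearray([1, 2, 3]))[0])
```

In the copy model the second processor is left holding stale data unless the buffer is copied again; in the shared model both sides read the same bytes, which is the property that coherency is there to guarantee in hardware.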
The main point is to give the GPU the same level of overall system access previously enjoyed solely by the CPU - HSA rights an historical wrong that has previously relegated the GPU to an also-ran device in terms of programming. Really, the GPU is arguably more important than the CPU in a modern APU. In sum, HSA brings a cleaner way of apportioning workload to either the CPU or GPU through easier programmability, and this is perhaps why AMD refers to both as compute cores - Kaveri has up to four CPU cores and eight GPU CUs.
Exploiting the efficiencies of HSA requires that software be aware of the easier-to-access parallelism, meaning that existing code needs to be reworked; AMD says that the HSA Consortium is working with many industry-standard languages to enable HSA acceleration. Java support, for example, is due in 2015.
The pragmatism - models, socket, pricing, performance, etc.
Kaveri APUs use the FM2+ socket that is widely available on motherboards from all major manufacturers today. At launch, only two models will be present, the A10-7850K and A10-7700K, priced at $173 and $152, respectively. An A8-7600 will follow later.
Note that only the top-line part has the full complement of graphics CUs, as the A10-7700K drops the total number of shaders from 512 to 384. The lesser A10 also has a slower CPU clock but still shares the same 95W TDP. We'd recommend that most users pony up the extra $21 and go for the faster part. Purchase either A10 and AMD includes a copy of Battlefield 4, downloadable via a free coupon code.
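For a rough sense of what the extra $21 buys on the graphics side, the launch prices and shader counts quoted above can be run through a short Python sketch. It assumes the A10-7700K keeps 64 shaders per CU, as the top part does, and it deliberately ignores the CPU clock difference, so it is a partial comparison rather than a verdict.

```python
# Rough graphics-only comparison of the two launch A10 parts.
# Prices and shader counts are taken from the article text.
# Assumption: both parts have 64 shaders per CU; CPU clocks are ignored.

SHADERS_PER_CU = 64

PARTS = {
    "A10-7850K": {"price_usd": 173, "shaders": 512},
    "A10-7700K": {"price_usd": 152, "shaders": 384},
}


def compute_units(model: str) -> int:
    """Number of GPU Compute Units implied by the shader count."""
    return PARTS[model]["shaders"] // SHADERS_PER_CU


def percent_more(field: str, upper: str, lower: str) -> float:
    """How much larger `field` is on `upper` than on `lower`, in per cent."""
    high = PARTS[upper][field]
    low = PARTS[lower][field]
    return (high - low) / low * 100.0


if __name__ == "__main__":
    print("A10-7700K CUs:", compute_units("A10-7700K"))
    premium = percent_more("price_usd", "A10-7850K", "A10-7700K")
    gain = percent_more("shaders", "A10-7850K", "A10-7700K")
    print(f"Price premium: {premium:.1f} per cent")
    print(f"Extra shaders: {gain:.1f} per cent")
```

On these numbers alone, the top part costs roughly 14 per cent more for a third more shaders, which is consistent with the recommendation above.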
Those looking for smaller-form-factor systems would do well to wait for the A8-7600. Announced today and priced at $119, though shipping later, its combination of significantly lower TDP and decent specifications suggests it may be the best part to go for if building a Mini ITX system.
So how does the best Kaveri APU perform against its peers? We'd love to tell you right away with the usual barrage of benchmarks and commentary, but we only received our sample this morning and will be putting it through the usual paces in the coming days. Without any empirical data to back up such assertions, the expectation is that it will be faster than incumbent Richland on both the CPU and GPU, and good enough to beat price-comparable Intel Haswell chips on both fronts, too. Until then, AMD's own benchmarks suggest that, rather obviously, Kaveri does well with modern tasks, especially if they're tuned for HSA.
Compared to the previous generation of APUs, AMD Kaveri has revised CPU cores and leveraged the GPU technology found on the latest Radeons. Weaving these two processing powerhouses together with HSA and thus offering simpler, more efficient programming for future applications, Kaveri is more of a step forward than the first batch of benchmarks may suggest; it lays down the blueprint for future APU designs.