It was last year that we first reported details of ARM's upcoming Mali-T6xx architecture, which promised to support true GPGPU compute and offer significant performance increases over current offerings. Almost a year later, however, we've not heard a single peep from this impressive new architecture, with even Samsung's latest and greatest Exynos 4 quad-core SoC still running a member of the ageing ARM Mali-400 family.
Now, in a sign that the design and its drivers are maturing and nearing release, ARM has officially filed for OpenCL 1.1 Full Profile certification for its Mali-T604 GPU. This filing stands out from the crowd: mobile devices typically implement the reduced Embedded Profile of OpenCL, which requires a degree of code porting, whereas, in theory, OpenCL code written for a PC can be copied and pasted directly into a program for these new ARM GPUs. One of the primary enabling elements is support for IEEE-compliant double-precision floating-point processing.
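For the curious, the profile distinction is something a program can check at run time. The sketch below is only an illustration: it assumes the application has already asked the OpenCL runtime for the device's `CL_DEVICE_PROFILE` and `CL_DEVICE_EXTENSIONS` strings, and the helper name `describe_device` and the sample strings are ours, not anything published by ARM. The profile values and the `cl_khr_fp64` double-precision extension name come from the OpenCL specification.

```python
# Strings an OpenCL runtime reports for the CL_DEVICE_PROFILE query.
FULL_PROFILE = "FULL_PROFILE"
EMBEDDED_PROFILE = "EMBEDDED_PROFILE"

# Extension name that advertises double-precision support in OpenCL 1.x.
FP64_EXTENSION = "cl_khr_fp64"


def describe_device(profile, extensions):
    """Summarise how portable desktop OpenCL code is likely to be.

    Hypothetical helper: `profile` is the CL_DEVICE_PROFILE string and
    `extensions` is the space-separated CL_DEVICE_EXTENSIONS string.
    """
    available = set(extensions.split())
    return {
        "full_profile": profile == FULL_PROFILE,
        "fp64": FP64_EXTENSION in available,
    }


# Made-up example strings, not the output of any real device.
desktop_like = describe_device(FULL_PROFILE, "cl_khr_fp64 cl_khr_byte_addressable_store")
embedded_like = describe_device(EMBEDDED_PROFILE, "cl_khr_byte_addressable_store")
```

A kernel that declares `double` variables would be expected to build only on a device where the `fp64` flag comes back true, which is why the reduced profile usually means porting work.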
We already know that Samsung will be bundling the Mali-T604 with Cortex-A15 processor cores in its Exynos 5 SoC. However, we expected to see the Exynos 5 surface earlier this year. Perhaps today's report is an indication that, much in the same way that silicon vendors go through several chip revisions during development, the ARM IP for the T6xx and A15 still needed a few tweaks before becoming production-ready, and that this could be one of the reasons why we're yet to see these designs out in the wild.
For those wondering just why we might want GPGPU compute in our mobiles, it's worth realising that GPUs, whilst fairly stupid, can crunch through large data sets, performing numerical calculations with much greater efficiency and/or performance than a CPU. This means that in augmented reality, sound recording and image editing apps, developers can offer either increased battery life or improved performance compared with a CPU-bound application. Likewise, silicon vendors can pull off tricks such as integrating their image processors with the GPU to reduce cost and offer greater flexibility.