Earlier this week, we highlighted the performance gains possible on AMD's Radeon HD 7xxx series of graphics cards with the release of entirely GCN-focused drivers. It's perhaps only fair, then, that we take a quick look at NVIDIA and its Kepler-optimised CUDA 5 general-purpose compute toolkit.
Often, when we hear of CUDA and GPGPU compute, we think: this has nothing to do with games, so why should we care? The truth is that CUDA, and compute on the GPU in general, is a fast-growing market. There's clear potential for accelerating scientific simulations and media encode/decode, but also games, where a custom algorithm is sometimes needed to provide a new visual effect or a simulation of weather systems, both of which require massive parallelism. It's widely expected that all next-gen AAA game engines will utilise this form of acceleration in one way or another.
There are now over 375 million CUDA-capable GPUs in the world and, with each new architecture, NVIDIA introduces optimisations. CUDA 5 looks to bring out the best in Kepler, offering:
- Dynamic Parallelism
- GPU Object-linking
- All-in-one Eclipse Nsight plug-in to develop, debug and optimise
- GPUDirect
Most applicable, perhaps, to the gaming market are the top two new features. Dynamic Parallelism enables the GPU to generate more or fewer tasks in parallel as resources become available, and lets GPU code decide for itself the parameters of newly launched tasks. This saves the often lengthy process of returning results to the CPU and waiting for a response, and avoids simply launching a fixed task that, based on previous results, may not be suitable.
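To make Dynamic Parallelism a little more concrete, here is a minimal sketch of a parent kernel that sizes and launches child kernels from GPU code, with no round trip to the CPU. The kernel names (`parent`, `scaleSegment`), the segment layout and the scale factor are our own illustrative assumptions, not NVIDIA sample code. It needs a GPU of compute capability 3.5 or newer and relocatable device code, along the lines of `nvcc -arch=sm_35 -rdc=true dp.cu -lcudadevrt` (adjust `-arch` to your card).

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel (hypothetical): scales one segment of the array in place.
__global__ void scaleSegment(float *data, int count, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) data[i] *= factor;
}

// Parent kernel: one thread per segment. Each thread reads its segment size
// on the GPU and picks the child grid dimensions itself.
__global__ void parent(float *data, const int *offsets, int numSegments)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= numSegments) return;

    int begin = offsets[s];
    int count = offsets[s + 1] - begin;
    if (count == 0) return;                 // nothing to do: launch no child

    const int threads = 128;
    int blocks = (count + threads - 1) / threads;
    scaleSegment<<<blocks, threads>>>(data + begin, count, 2.0f);
}

int main()
{
    const int n = 1000;
    const int numSegments = 3;
    // Illustrative segments: 5 elements, 0 elements, 995 elements.
    const int hOffsets[numSegments + 1] = {0, 5, 5, n};
    std::vector<float> hData(n, 1.0f);

    float *dData = NULL;
    int *dOffsets = NULL;
    cudaMalloc(&dData, n * sizeof(float));
    cudaMalloc(&dOffsets, sizeof(hOffsets));
    cudaMemcpy(dData, hData.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dOffsets, hOffsets, sizeof(hOffsets), cudaMemcpyHostToDevice);

    parent<<<1, 32>>>(dData, dOffsets, numSegments);

    // The parent grid only completes once its child grids have completed.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hData.data(), dData, n * sizeof(float), cudaMemcpyDeviceToHost);
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hData[i] != 2.0f) ++mismatches;
    printf("elements not scaled: %d\n", mismatches);

    cudaFree(dData);
    cudaFree(dOffsets);
    return 0;
}
```

The point of the sketch is that the grid size of each child launch is computed from data that only exists on the GPU, and an empty segment launches nothing at all.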
From a gaming perspective, Dynamic Parallelism could help establish smoother frame-rates and maximise GPU usage. GPU Object-linking is a big one for development houses as well. Developers can now segment GPU programming tasks and have code developed in parallel by multiple coders, then stitched together simply by providing the finished object files.
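As a rough illustration of GPU Object-linking, the sketch below builds one source file into two separate device objects and links them. The `BUILD_PHYSICS` macro, the `integrate` function and the file name `demo.cu` are hypothetical; in practice the two halves would live in separate files owned by separate teams, and we fold them into one listing only for brevity.

```cuda
// Build steps (file and object names are illustrative):
//   nvcc -dc -DBUILD_PHYSICS demo.cu -o physics.o   // device code only
//   nvcc -dc demo.cu -o main.o                      // kernel + host code
//   nvcc physics.o main.o -o demo                   // device link + host link
// Older toolkits may also need an explicit -arch=sm_20 or newer.
#include <cstdio>
#include <cuda_runtime.h>

#ifdef BUILD_PHYSICS

// Half written by a hypothetical "physics" team and shipped as an object file.
__device__ float integrate(float position, float velocity, float dt)
{
    return position + velocity * dt;
}

#else

// Declaration only: the definition is resolved at device link time.
__device__ float integrate(float position, float velocity, float dt);

__global__ void step(float *pos, const float *vel, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) pos[i] = integrate(pos[i], vel[i], dt);
}

int main()
{
    const int n = 4;
    float hPos[n] = {0.0f, 1.0f, 2.0f, 3.0f};
    float hVel[n] = {1.0f, 1.0f, 1.0f, 1.0f};

    float *dPos = NULL;
    float *dVel = NULL;
    cudaMalloc(&dPos, sizeof(hPos));
    cudaMalloc(&dVel, sizeof(hVel));
    cudaMemcpy(dPos, hPos, sizeof(hPos), cudaMemcpyHostToDevice);
    cudaMemcpy(dVel, hVel, sizeof(hVel), cudaMemcpyHostToDevice);

    step<<<1, n>>>(dPos, dVel, n, 0.5f);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hPos, dPos, sizeof(hPos), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("pos[%d] = %f\n", i, hPos[i]);

    cudaFree(dPos);
    cudaFree(dVel);
    return 0;
}

#endif
```

The kernel in `main.o` calls a device function it has never seen the body of, which is what lets one coder hand over a finished object file rather than source.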
Though there's not currently huge potential for GPUDirect in gaming, this is by far CUDA 5's most impressive feature. GPUDirect enables GPUs to perform direct memory transfers, not only to other GPUs sharing the same PCI-E bus, but also to other devices. Place a DMA-capable network card on the bus and suddenly you have direct memory transfers from one GPU to any other PCI-E device on a network, without any CPU intervention.
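The GPU-to-GPU half of this can be sketched with the CUDA runtime's peer-access calls, shown below under the assumption of a machine with at least two peer-capable GPUs (devices 0 and 1). The buffer names and size are our own. The network-card path depends on third-party driver support and is not shown here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Ask the runtime whether each GPU can address the other's memory directly.
    int zeroToOne = 0, oneToZero = 0;
    cudaDeviceCanAccessPeer(&zeroToOne, 0, 1);
    cudaDeviceCanAccessPeer(&oneToZero, 1, 0);
    bool direct = zeroToOne && oneToZero;

    const size_t bytes = 1 << 20;   // 1 MiB, an arbitrary illustrative size
    float *buf0 = NULL;
    float *buf1 = NULL;

    cudaSetDevice(0);
    cudaMalloc(&buf0, bytes);
    cudaMemset(buf0, 0, bytes);
    if (direct) cudaDeviceEnablePeerAccess(1, 0);   // device 0 -> device 1

    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);
    if (direct) cudaDeviceEnablePeerAccess(0, 0);   // device 1 -> device 0

    // Copy from device 0 to device 1. With peer access enabled the data can
    // move across the bus directly; otherwise the runtime stages it via the host.
    cudaError_t err = cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    printf("peer copy %s (%s path)\n",
           err == cudaSuccess ? "succeeded" : cudaGetErrorString(err),
           direct ? "direct" : "staged");

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
```

Checking `cudaDeviceCanAccessPeer` first matters because not every pair of GPUs on a system can reach each other directly; the copy still works without it, just not without the host's involvement.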
These new features excite us, and we expect to see some serious practical usage of GPGPU compute in gaming next year with the release of new high-end consoles. By no means is the GPU market showing signs of slowing.