GeForce RTX: A Calculated Gamble
The Turing microarchitecture fully encapsulates Nvidia's thinking on how to design GPUs for the next two or three years, and it marks a significant shift in how to achieve the best image fidelity at the lowest computational cost.
Going by how Turing is constructed, achieving this aim requires more than a traditional enlargement of the shader core. New ways of bringing image-enhancement tools to the table are needed, and Turing answers with its Tensor and RT cores.
That said, Nvidia also knows that rasterisation is here to stay for a long while yet, so it improves upon last-gen Pascal's shaders by removing the integer-processing burden that can needlessly slow down the floating-point pipe. This is why Turing features a separate, dedicated integer datapath that lets the floating-point units run at maximum throughput.
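To illustrate the kind of workload this helps - purely as a sketch, with a hypothetical kernel rather than anything from Nvidia - consider CUDA code in which integer index arithmetic and floating-point shading maths form two instruction streams that Turing can issue concurrently instead of serialising through one pipe:

```cuda
// Minimal sketch (hypothetical kernel, not Nvidia code) of interleaved integer
// and floating-point work - the pattern Turing's separate INT datapath keeps
// flowing concurrently instead of sharing issue slots with the FP units.
#include <cuda_runtime.h>

__global__ void shade(const float* __restrict__ texels,
                      float* __restrict__ out,
                      int width, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // integer work: index and
    if (i >= n) return;                              // address arithmetic
    int x = i % width;
    int y = i / width;
    int idx = y * width + min(x + 1, width - 1);     // sample the neighbouring texel

    float c = texels[idx];                           // floating-point work:
    float lit = fmaf(c, 0.75f, 0.125f);              // shading-style FMA maths
    out[i] = lit * lit;                              // that no longer has to wait
}
```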
A move between generations is also an opportune moment to evaluate bottlenecks for the types of instructions likely to be used in the near future. Enlarging and speeding up the caches, whilst configuring them for easier programming, is the low-hanging fruit that Turing duly picks.
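As a flavour of what that easier programming looks like, Turing carries over Volta's unified L1/shared-memory design, and the standard CUDA runtime lets a kernel request a larger shared-memory carveout. The snippet below is a minimal sketch; myKernel is hypothetical, and the driver treats the request as a hint:

```cuda
// Sketch: asking the unified L1/shared-memory block to favour shared memory
// for one kernel. cudaFuncSetAttribute and the carveout attribute are standard
// CUDA runtime features; myKernel is a hypothetical kernel.
#include <cuda_runtime.h>

__global__ void myKernel(float* data)
{
    // Real work would go here, making heavy use of __shared__ storage.
}

void configureCarveout()
{
    // Hint that up to 100 per cent of the combined L1/shared capacity should
    // be made available as shared memory when myKernel runs.
    cudaFuncSetAttribute(myKernel,
                         cudaFuncAttributePreferredSharedMemoryCarveout,
                         100);
}
```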
Nvidia keeps the back end relatively untouched but speeds everything up by using 14Gbps-rated GDDR6 running over the same-width memory interfaces as Pascal. Coupled with general optimisations and the benefits of the aforementioned cache reorganisation, real-world bandwidth is up by 50 per cent.
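For a rough sense of where that figure comes from, here's the back-of-the-envelope arithmetic for a 352-bit board (GTX 1080 Ti class against its Turing successor - our assumed comparison point): the raw GDDR6 uplift is around 27 per cent, with the rest of the quoted effective gain coming from the cache and compression changes.

```cuda
// Back-of-the-envelope bandwidth arithmetic (host-side, illustrative numbers).
// The 352-bit bus matches GTX 1080 Ti-class boards; narrower Turing parts
// scale accordingly.
#include <cstdio>

static double peakBandwidthGBs(double gbitPerPin, int busWidthBits)
{
    return gbitPerPin * busWidthBits / 8.0;   // Gbit/s per pin x pins / 8 = GB/s
}

int main()
{
    double pascal = peakBandwidthGBs(11.0, 352);   // 11Gbps GDDR5X: ~484GB/s
    double turing = peakBandwidthGBs(14.0, 352);   // 14Gbps GDDR6:  ~616GB/s
    std::printf("raw uplift: %.0f per cent\n", (turing / pascal - 1.0) * 100.0);
    // Prints roughly 27; the remainder of the quoted 50 per cent effective
    // gain is down to cache and compression improvements, not raw speed.
    return 0;
}
```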
Yet this shader-core refinement is to be expected between architectures; without it, Turing would merely be a scaled-up Pascal. What's less expected is how Nvidia repurposes datacentre-specific Tensor cores for potential gaming benefits. These cores, a first for a consumer card, excel at the heavy-lifting maths required for the training and inference stages of deep learning. The inclusion makes sense once you consider that image processing is right in a Tensor core's wheelhouse, and it's going to be fascinating to see how they can be used to improve image quality - DLSS, for example, appears almost too good to be true.
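To give a flavour of the maths involved, CUDA exposes Tensor cores through its warp-level matrix-multiply-accumulate (WMMA) API. The fragment below is a minimal sketch of a single 16x16x16 half-precision tile operation - the building block of deep-learning inference - and not, to be clear, how DLSS itself is written:

```cuda
// Minimal sketch of the warp-level matrix-multiply-accumulate (WMMA) operation
// a Tensor core executes: acc = A*B + acc on a 16x16x16 half-precision tile.
// Launch with a single warp (32 threads); this shows the class of maths
// involved, not how DLSS is implemented.
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void tile_mma(const half* A, const half* B, float* C)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);      // start the accumulator at zero
    wmma::load_matrix_sync(a, A, 16);    // load one 16x16 tile of A (leading dim 16)
    wmma::load_matrix_sync(b, B, 16);    // load one 16x16 tile of B
    wmma::mma_sync(acc, a, b, acc);      // the Tensor-core multiply-accumulate
    wmma::store_matrix_sync(C, acc, 16, wmma::mem_row_major);
}
```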
Then there's ray tracing, long held as the holy grail for lifelike imagery. Nvidia wants to use it in a hybrid manner, so the Turing RT cores are tasked with helping determine the right colour for each pixel. Though hardware-accelerated for the first time, we'd imagine it will be used sparingly - for shadows and super-shiny surfaces - because even a cutting-edge GPU can only trace a handful of rays per pixel in real time.
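The hybrid approach can be sketched simply: rasterise the frame as normal, then spend a small per-pixel ray budget on the effects rasterisation struggles with, such as shadow tests. The code below is purely illustrative - traceShadowRay stands in for the BVH traversal that the RT cores accelerate and that games will actually reach through DXR or Vulkan:

```cuda
// Illustrative hybrid resolve pass: take the rasterised colour, then spend one
// ray per pixel on a shadow test. traceShadowRay is a hypothetical stand-in
// for the BVH traversal the RT cores accelerate (reached via DXR/Vulkan).
#include <cuda_runtime.h>

__device__ bool traceShadowRay(float3 origin, float3 toLight)
{
    // Placeholder: a real renderer traverses a BVH of scene geometry here,
    // which is exactly the fixed-function work Turing's RT cores handle.
    (void)origin; (void)toLight;
    return false;  // assume unoccluded in this sketch
}

__global__ void resolve(const float3* rasterColour, const float3* worldPos,
                        float3 lightDir, float3* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 c = rasterColour[i];                    // colour from the raster pass
    if (traceShadowRay(worldPos[i], lightDir)) {   // one ray: is the light blocked?
        c.x *= 0.2f; c.y *= 0.2f; c.z *= 0.2f;     // in shadow, so darken
    }
    out[i] = c;
}
```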
The combination of beefed-up shader cores, Tensor cores and RT cores offers the potential for stunning results if used correctly - take a look at the Star Wars demo as proof - and that's the point of Turing: to elevate in-game, real-time-generated imagery to the next level whilst maintaining decent frame rates.
There's a lot to like about Turing from a pure architecture point of view. The question then becomes how many game engines will take full advantage of what it has to offer. Nvidia is building that support ecosystem right now, and we'd expect it to be months before the trio of CUDA, Tensor and RT cores is used properly. And used properly they must be if the various Turing GPUs are to make sense, because, as the Infiltrator demo shows, top-spec Turing can be twice as fast as the GTX 1080 Ti - shader and Tensor cores combining well - whilst producing arguably better image quality.
It's a gamble, to be sure, because if swathes of the Turing silicon - the deep-learning and ray-tracing hardware - are not used as Nvidia intends for the gaming market, or the changes in the SMs don't pan out as they appear to on paper, then these GeForce RTX cards are overspecified and needlessly costly to produce. Support for DLSS and RTX has to come thick and fast.
Turing, then, feels like a good-enough architecture for today with room to evolve for tomorrow. It's better than Pascal in every way, clearly, yet just how good it will be is in the hands of games developers who need to fully leverage its monster, forward-looking specification.
The bottom line is that Nvidia has laid down the framework for its near-term vision of computer graphics, and it wants traditional rasterisation, deep learning and ray tracing to play equal parts in it. As always, Father Time - and upcoming performance analysis - will tell if this is the right way.