Nvidia Turing Architecture Examined And Explained

GeForce RTX 2080 Ti and RTX 2080 Speeds And Feeds

There's been plenty of talk about the Turing architecture and, at times, reference to TU102 and TU104. These codenames refer to the specific silicon implementations that the various GeForce RTX 20-series cards use.

GeForce RTX 2080 Ti

Here is that TU102 full-config block diagram again. As noted earlier but worth repeating, it's a biggie. The Nvidia GeForce RTX 2080 Ti is not a full implementation of this die, just as GeForce GTX 1080 Ti was not the full GP102. There are at least two reasons why this is the case. The first is that Nvidia typically reserves the full-fat die for the Titan-class cards - Titan Xt, perhaps - where it can charge a much higher premium. The second, more obvious reason is that it wants to reserve the full-on TU102 for the vastly more expensive Quadro RTX 6000.

Still, GeForce RTX 2080 Ti is certainly no shrinking violet. It uses 68 of the maximum 72 SMs, meaning 4,352 shaders. Because the Tensor cores, RT cores, geometry units, texture units and ROPs are all tied together by fixed ratios, RTX 2080 Ti drops them to 544, 68, 34, 272, and 88, respectively. There's also the associated reduction in total cache, register files, and, crucially, memory-bus width, which drops from 384 bits to 352. At the same speed, RTX 2080 Ti is about 95 per cent of a full-on TU102. If those numbers give you a headache, it's easier to imagine RTX 2080 Ti as having four of those SM blocks deactivated.
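To make the ratio argument concrete, here is a minimal Python sketch that derives the unit counts from an SM count and a memory-bus width. The per-SM ratios are not stated outright in this article; they are inferred from the figures quoted here (4,352 shaders across 68 SMs, 88 ROPs on a 352-bit bus, and so on), so treat the constants and the function name `turing_units` as our assumptions rather than an official Nvidia formula.

```python
# Ratios inferred from the figures quoted in this article, e.g.
# 4,352 shaders / 68 SMs = 64. They are assumptions, not official constants.
SHADERS_PER_SM = 64
TENSOR_CORES_PER_SM = 8
RT_CORES_PER_SM = 1
TEXTURE_UNITS_PER_SM = 4
SMS_PER_GEOMETRY_UNIT = 2        # 34 geometry units for 68 SMs
ROPS_PER_32BIT_CONTROLLER = 8    # 88 ROPs on a 352-bit bus


def turing_units(sm_count, bus_width_bits):
    """Return the unit counts implied by an SM count and memory-bus width."""
    controllers = bus_width_bits // 32
    return {
        "shaders": sm_count * SHADERS_PER_SM,
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
        "rt_cores": sm_count * RT_CORES_PER_SM,
        "geometry_units": sm_count // SMS_PER_GEOMETRY_UNIT,
        "texture_units": sm_count * TEXTURE_UNITS_PER_SM,
        "rops": controllers * ROPS_PER_32BIT_CONTROLLER,
    }


rtx_2080_ti = turing_units(sm_count=68, bus_width_bits=352)
full_tu102 = turing_units(sm_count=72, bus_width_bits=384)

print(rtx_2080_ti)
print(f"Share of full TU102 SMs: {68 / 72:.1%}")
```

Feeding in 46 SMs and a 256-bit bus gives 2,944 shaders, 368 Tensor cores, 184 texture units and 64 ROPs, which lines up with the RTX 2080 figures in the spec table.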

GeForce RTX 2080

The RTX 2080, meanwhile, uses the TU104 die that we have also spoken about. In its complete form, shown above, it uses 13.6bn transistors, is built on the same 12nm process, and has 48 SMs that are identical to those in TU102. Of course, being a smaller die means there's less cache, only a 256-bit memory interface, and up to 8GB of GDDR6 operating at that same 14Gbps.

As you might have guessed, the RTX 2080 isn't the full implementation, either, as it carries 46 SMs instead of 48. There is the same commensurate drop in Tensor and RT cores, because they live inside the SMs, but Nvidia keeps the full 256-bit memory bus, full 64 ROPs and the same 4MB of L2 cache as the full TU104, depicted above.

We're glad to see that Nvidia hasn't reduced the memory speed, as it remains at 14Gbps and harnesses the same GDDR6 memory. It gets confusing to know exactly what is going on because so many numbers are floating around, and we haven't even talked about frequencies yet, so let's jot them down into a table that also has the last-gen GeForce GTX 1080 Ti and GTX 1080.

	GeForce RTX 2080 Ti (FE)	GeForce GTX 1080 Ti	GeForce RTX 2080 (FE)	GeForce GTX 1080
Launch date	September 2018	March 2017	September 2018	May 2016
Codename	TU102	GP102	TU104	GP104
Architecture	Turing	Pascal	Turing	Pascal
Process (nm)	12	16	12	16
Transistors (bn)	18.6	12	13.6	7.2
Die Size (mm²)	754	471	545	314
Core Clock (MHz)	1,350	1,480	1,515	1,607
Boost Clock (MHz)	1,545/1,635	1,582	1,710/1,800	1,733
Shaders	4,352	3,584	2,944	2,560
GFLOPS	13,448/14,231	11,340	10,598	8,873
Tensor Cores	544	-	368	-
RT Cores	68	-	46	-
Memory Size	11GB	11GB	8GB	8GB
Memory Bus	352-bit	352-bit	256-bit	256-bit
Memory Type	GDDR6	GDDR5X	GDDR6	GDDR5X
Memory Clock	14Gbps	11Gbps	14Gbps	10Gbps
Memory Bandwidth (GB/s)	616	484	448	320
ROPs	88	88	64	64
Texture Units	272	224	184	160
L2 cache (KB)	5,632	2,816	4,096	2,048
Power Connector	8-pin + 8-pin	8-pin + 6-pin	8-pin + 6-pin	8-pin
TDP (watts)	250/260	250	215/225	180
Current MSRP	$999/$1,199	$699	$699/$799	$499
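
The GFLOPS and Memory Bandwidth rows can be reproduced from other rows in the table. The Python sketch below assumes the usual conventions: peak FP32 throughput counts two operations (one fused multiply-add) per shader per clock at the boost frequency, and bandwidth in GB/s is the bus width multiplied by the per-pin data rate, divided by eight. The function names are ours, and the Turing entries use the Founders Edition boost clocks.

```python
def peak_fp32_gflops(shaders, boost_mhz):
    # Two FP32 operations (one fused multiply-add) per shader per clock.
    return shaders * 2 * boost_mhz / 1000


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    # Bits per second across the whole bus, divided by 8 to get bytes.
    return bus_width_bits * data_rate_gbps / 8


# (shaders, boost MHz, bus width in bits, memory data rate in Gbps),
# all taken from the table above.
cards = {
    "RTX 2080 Ti (FE)": (4352, 1635, 352, 14),
    "GTX 1080 Ti": (3584, 1582, 352, 11),
    "RTX 2080 (FE)": (2944, 1800, 256, 14),
    "GTX 1080": (2560, 1733, 256, 10),
}

for name, (shaders, boost, bus, rate) in cards.items():
    gflops = peak_fp32_gflops(shaders, boost)
    bandwidth = memory_bandwidth_gbs(bus, rate)
    print(f"{name}: {gflops:,.0f} GFLOPS, {bandwidth:.0f} GB/s")
```

These are theoretical peaks only; actual boost clocks vary with temperature and power headroom, so sustained throughput will differ.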

Specs Comparo

Nvidia has two sets of specifications for both new cards. There's the base spec that a number of add-in card partners will adhere to, then there are the Founders Edition cards that, for the first time, offer a higher peak boost clock - an extra 90MHz for both the RTX 2080 Ti (1,635MHz vs. 1,545MHz) and the RTX 2080 (1,800MHz vs. 1,710MHz). Nvidia reckons this makes sense because it has improved the cooling on the FE design significantly.

If you have managed to get this far, well done, though you will appreciate that peak specs don't do the new RTX cards full justice. Having more shaders is one thing, but the spec table cannot accommodate performance improvements from, say, the refined SM unit, RT cores, Tensor cores, etc.

Even so, there's a fair bit more FP32 TFLOPS on the table when compared to their model-equivalent 10-series FE cards. RTX 2080 Ti FE, for example, has 25 per cent more pure shader power; RTX 2080 FE has around 20 per cent. Nominal bandwidth is up 27 per cent and 40 per cent, respectively, while power has increased to take into account the extra performance.
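
For anyone wanting to check those percentages, here is a small Python sketch that works out the generation-on-generation uplift from the table's GFLOPS, bandwidth and TDP figures. It uses the Founders Edition numbers for the Turing cards; the helper name `uplift_percent` is ours.

```python
def uplift_percent(new, old):
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100


# (RTX FE figure, model-equivalent GTX figure), taken from the spec table.
comparisons = {
    "2080 Ti vs 1080 Ti, GFLOPS": (14231, 11340),
    "2080 vs 1080, GFLOPS": (10598, 8873),
    "2080 Ti vs 1080 Ti, bandwidth (GB/s)": (616, 484),
    "2080 vs 1080, bandwidth (GB/s)": (448, 320),
    "2080 Ti vs 1080 Ti, TDP (W)": (260, 250),
    "2080 vs 1080, TDP (W)": (225, 180),
}

for label, (new, old) in comparisons.items():
    print(f"{label}: +{uplift_percent(new, old):.1f}%")
```

The shader-power and bandwidth results round to the figures quoted above. On the same FE numbers, TDP rises by 4 per cent for the Ti pairing and 25 per cent for the RTX 2080 over the GTX 1080.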

One would expect the RTX 2080 Ti to be in a performance league of its own even if the Tensor and RT cores are sat idle; it's simply a lot more powerful. We'd expect it to be at least 50 per cent faster, on average, than the GeForce GTX 1080 Ti at 4K. The RTX 2080, meanwhile, ought to be 10-15 per cent faster than the Ti champ of the last generation.

Specs tell you one thing; real-world performance can be something else, and we'll know exactly how these two new cards fare in current games in the next few days.
