GeForce RTX 2080 Ti and RTX 2080 Speeds And Feeds
There's been bountiful talk about the Turing architecture and, at times, reference to the TU102 and TU104. These codenames refer to the specific silicon implementation that various GeForce RTX 20-series cards use.
GeForce RTX 2080 Ti
Here is that TU102 full-config block diagram again. As noted earlier but worth repeating, it's a biggie. The Nvidia GeForce RTX 2080 Ti is not a full implementation of this die, just as GeForce GTX 1080 Ti was not the full GP102. There are at least two reasons why this is the case. The first is that Nvidia typically reserves the full-fat die for the Titan-class cards - Titan Xt, perhaps, where it can charge a much higher premium. The second, more obvious answer is that it wants to reserve the full-on TU102 for the vastly more expensive Quadro RTX 6000.
Still, GeForce RTX 2080 Ti is certainly no shrinking violet. It uses 68 of the maximum 72 SMs, meaning 4,352 shaders. With the Tensor cores, RT cores, geometry units, texture units and ROPs all tied together from a ratio perspective, RTX 2080 Ti drops them to 544, 68, 34, 272, and 88, respectively. There's also the associated reduction in total cache, register files and, crucially, memory-bus width, which drops from 384 bits to 352. At the same speed, RTX 2080 Ti has about 94 per cent of the shader resources of a full-on TU102. If those numbers give you a headache, it's easier to imagine RTX 2080 Ti as a TU102 with four of its SM blocks deactivated.
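For those who prefer to see the arithmetic, here is a minimal Python sketch of how those unit counts fall out of the SM count and memory-bus width. The ratios are our reading of Nvidia's published Turing specifications rather than anything stated above (64 shaders, 8 Tensor cores, 1 RT core and 4 texture units per SM; one geometry unit per pair of SMs; 8 ROPs and 512KB of L2 per 32-bit memory controller), and the function name `turing_units` is purely illustrative.

```python
def turing_units(sms: int, bus_width_bits: int) -> dict:
    """Derive headline unit counts for a Turing GPU configuration.

    Assumed ratios (not taken from this article): 64 shaders, 8 Tensor
    cores, 1 RT core and 4 texture units per SM; 1 geometry unit per
    2 SMs; 8 ROPs and 512KB of L2 cache per 32-bit memory controller.
    """
    controllers = bus_width_bits // 32  # one controller per 32 bits of bus
    return {
        "shaders": sms * 64,
        "tensor_cores": sms * 8,
        "rt_cores": sms,
        "geometry_units": sms // 2,
        "texture_units": sms * 4,
        "rops": controllers * 8,
        "l2_cache_kb": controllers * 512,
    }


rtx_2080_ti = turing_units(sms=68, bus_width_bits=352)
full_tu102 = turing_units(sms=72, bus_width_bits=384)

for name, count in rtx_2080_ti.items():
    print(f"{name}: {count} (full TU102: {full_tu102[name]})")

# Share of the full die's shaders that RTX 2080 Ti keeps enabled
shader_share = rtx_2080_ti["shaders"] / full_tu102["shaders"]
print(f"shader share: {shader_share:.1%}")
```

Under these assumptions the ROP and L2 figures track the number of active memory controllers rather than the SM count, which is why trimming the bus from 384 to 352 bits costs eight ROPs.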
GeForce RTX 2080
The RTX 2080, meanwhile, uses the TU104 die that we have also spoken about. In its complete form, shown above, it uses 13.6bn transistors, is built on the same 12nm process, and has 48 SMs that are identical to those in TU102. Of course, being a smaller die means there's less cache, only a 256-bit memory interface, and up to 8GB of GDDR6 operating at that same 14Gbps.
As you might have guessed, the RTX 2080 isn't the full implementation, either, as it carries 46 SMs instead of 48. Though there is that same commensurate drop because the SMs carry the associated Tensor and RT cores, Nvidia keeps the full 256-bit memory bus, full 64 ROPs and the same 4MB of L2 cache as the full TU104, depicted above.
We're glad to see that Nvidia hasn't reduced the memory speed, as it remains at 14Gbps and harnesses the same GDDR6 memory. It gets confusing to know exactly what is going on with so many numbers floating around, and we haven't even talked about frequencies yet, so let's jot them down in a table that also includes the last-gen GeForce GTX 1080 Ti and GTX 1080.
| | GeForce RTX 2080 Ti (FE) | GeForce GTX 1080 Ti | GeForce RTX 2080 (FE) | GeForce GTX 1080 |
|---|---|---|---|---|
| Launch date | September 2018 | March 2017 | September 2018 | May 2016 |
| Codename | TU102 | GP102 | TU104 | GP104 |
| Architecture | Turing | Pascal | Turing | Pascal |
| Process (nm) | 12 | 16 | 12 | 16 |
| Transistors (bn) | 18.6 | 12 | 13.6 | 7.2 |
| Die Size (mm²) | 754 | 471 | 545 | 314 |
| Core Clock (MHz) | 1,350 | 1,480 | 1,515 | 1,607 |
| Boost Clock (MHz) | 1,545/1,635 | 1,582 | 1,710/1,800 | 1,733 |
| Shaders | 4,352 | 3,584 | 2,944 | 2,560 |
| GFLOPS | 13,448/14,231 | 11,340 | 10,598 | 8,873 |
| Tensor Cores | 544 | - | 368 | - |
| RT Cores | 68 | - | 46 | - |
| Memory Size | 11GB | 11GB | 8GB | 8GB |
| Memory Bus | 352-bit | 352-bit | 256-bit | 256-bit |
| Memory Type | GDDR6 | GDDR5X | GDDR6 | GDDR5X |
| Memory Clock | 14Gbps | 11Gbps | 14Gbps | 10Gbps |
| Memory Bandwidth (GB/s) | 616 | 484 | 448 | 320 |
| ROPs | 88 | 88 | 64 | 64 |
| Texture Units | 272 | 224 | 184 | 160 |
| L2 cache (KB) | 5,632 | 2,816 | 4,096 | 2,048 |
| Power Connector | 8-pin + 8-pin | 8-pin + 6-pin | 8-pin + 6-pin | 8-pin |
| TDP (watts) | 250/260 | 250 | 215/225 | 180 |
| Current MSRP | $999/$1,199 | $699 | $699/$799 | $499 |
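The GFLOPS and Memory Bandwidth rows are derived rather than measured, and a short Python sketch shows where they come from. It assumes the usual conventions: each shader retires one fused multiply-add (two FP32 operations) per clock at the quoted boost speed, and bandwidth is the bus width in bytes multiplied by the per-pin data rate. The function names are our own, and the Turing cards use their Founders Edition boost clocks.

```python
def fp32_gflops(shaders: int, boost_mhz: int) -> float:
    """Peak FP32 throughput, assuming 2 operations (one FMA) per shader per clock."""
    return shaders * 2 * boost_mhz / 1000


def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: int) -> float:
    """Peak memory bandwidth: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


# (shaders, boost MHz, bus width in bits, memory data rate in Gbps)
cards = {
    "GeForce RTX 2080 Ti (FE)": (4352, 1635, 352, 14),
    "GeForce GTX 1080 Ti": (3584, 1582, 352, 11),
    "GeForce RTX 2080 (FE)": (2944, 1800, 256, 14),
    "GeForce GTX 1080": (2560, 1733, 256, 10),
}

for name, (shaders, boost, bus, rate) in cards.items():
    gflops = fp32_gflops(shaders, boost)
    bandwidth = bandwidth_gb_s(bus, rate)
    print(f"{name}: {gflops:,.0f} GFLOPS, {bandwidth:.0f} GB/s")
```

Swapping in the reference boost clock of 1,545MHz for the RTX 2080 Ti gives the lower of the two GFLOPS figures listed for that card.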
Specs Comparo
Nvidia has two sets of specifications for both new cards. There's the base spec that a number of add-in card partners will adhere to, then there are the Founders Edition cards that, for the first time, offer a higher peak boost clock - an extra 90MHz for both the RTX 2080 Ti and the RTX 2080. Nvidia reckons this makes sense because it has improved the cooling on the FE design significantly.
If you have managed to get this far, well done, though you will appreciate that peak specs don't do the new RTX cards full justice. Having more shaders is one thing, but the spec table cannot accommodate performance improvements from, say, the refined SM unit, RT cores, Tensor cores, etc.
Even so, there's a fair bit more FP32 TFLOPS on the table when compared to their model-equivalent 10-series FE cards. RTX 2080 Ti FE, for example, has 25 per cent more pure shader power; RTX 2080 FE has around 20 per cent. Nominal bandwidth is up 27 per cent and 40 per cent, respectively, while power has increased to take into account the extra performance.
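Those percentages can be reproduced from the spec-table figures with a quick Python sketch. The helper name `uplift_pct` is ours, and the inputs are the Founders Edition values for the Turing cards set against the 10-series numbers.

```python
def uplift_pct(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new / old - 1) * 100


# (new card, old card, metric) -> (new value, old value), from the spec table
comparisons = {
    ("RTX 2080 Ti FE", "GTX 1080 Ti", "GFLOPS"): (14231, 11340),
    ("RTX 2080 FE", "GTX 1080", "GFLOPS"): (10598, 8873),
    ("RTX 2080 Ti FE", "GTX 1080 Ti", "bandwidth"): (616, 484),
    ("RTX 2080 FE", "GTX 1080", "bandwidth"): (448, 320),
    ("RTX 2080 Ti FE", "GTX 1080 Ti", "TDP"): (260, 250),
    ("RTX 2080 FE", "GTX 1080", "TDP"): (225, 180),
}

for (new_card, old_card, metric), (new, old) in comparisons.items():
    print(f"{new_card} vs {old_card}, {metric}: +{uplift_pct(new, old):.1f}%")
```

On these numbers the RTX 2080 FE's board power rises by a quarter over the GTX 1080, roughly in step with its shader uplift, whereas the RTX 2080 Ti FE adds only 10W to the GTX 1080 Ti's 250W.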
One would expect the RTX 2080 Ti to be in a performance league of its own even if the Tensor and RT cores are sat idle; it's simply a lot more powerful. We'd expect it to be at least 50 per cent faster, on average, than the GeForce GTX 1080 Ti at 4K. The RTX 2080, meanwhile, ought to be 10-15 per cent faster than the Ti champ of the last generation.
Specs tell you one thing; real-world performance can be something else, and we'll know exactly how these two new cards fare in current games in the next few days.