Maxwell The Tease
Nvidia is today launching two high-end desktop graphics cards known as the GeForce GTX 980 and GeForce GTX 970. The duo are not the first to be based on the latest Maxwell architecture - that honour goes to the GTX 750 Ti and select mobile GPUs - but they do, for now, represent the best that the energy-efficient architecture has to offer.
Some background at the outset helps put the new Maxwell GPUs in context. Nvidia's codenames provide more than just a reference for GeForce cards. The designation is telling insofar as it describes how the company views retail cards based upon it. For example, the x10 GPUs are the absolute best implementation of a particular architecture, while the x04 GPUs are released first, to test the waters, and are tuned for a smaller, less expensive die and lower energy consumption.
Remember the GK104-based GeForce GTX 680? This first performance Kepler GPU was designed to fend off the AMD Radeon threat without having to extend the architecture, and it was undeniably competent for its time. The real enthusiast Kepler card, the GK110-based GTX 780 (Ti), rolled into town 18 months later. Packing twice the transistors of the GK104 GPU and cranking all the dials to 10, it remains, arguably, the best sub-£500 consumer graphics card today.
This brief analysis of Nvidia's graphics positioning sheds light on the GM204 GPU powering both new cards. The strategy here is much like the GeForce GTX 680's, where the enthusiast is teased with a new architecture whose potential has yet to be fulfilled. The bigger, faster, meatier GM210 is being kept under wraps for a while yet, and there are at least two reasons for that.
Though no-one will go on record as saying this, our time with Nvidia last week indicates the green team feels under no pressure to release a faster, bigger-die GPU right now. AMD's Radeon R9 290(X) is taken care of by the GeForce GTX 780 (Ti), the Radeon R9 295X2 is considered too niche, and there isn't any brand-new architecture in the offing from AMD in the near future. That's Nvidia's closed-door thinking, anyway.
The second reason centres on fabrication process. Nvidia and AMD fully expected to be on a 20nm node for high-end GPUs by Q3 2014. Process difficulties at manufacturing partner TSMC and Apple's landgrab of the 20nm process for the innards powering the new iPhone have stalled the move for GPUs. Nvidia would rather debut the GM210 on a smaller process, thus saving valuable die space, so we'll likely see it available as soon as TSMC can provide enough wafers.
In view of this, Nvidia's obvious move is to launch the GM204 first, tapping into the efficiencies of the architecture to produce class-matching performance with, crucially, a GPU that's much cheaper to produce than the GK110-powered GeForce GTX 780 Ti.
Long story short, GM204, aka GeForce GTX 980 and GTX 970, isn't going to provide a quantum leap in performance over existing high-end cards, for the reasons provided above. It should, however, offer solid performance with exceptional energy efficiency.
Let's keep up with tradition by rolling out the Table of Doom™ as a high-level means of explaining how the two new GeForces stack up against the establishment.
| | Nvidia GeForce GTX 980 (4GB) | Nvidia GeForce GTX 970 (4GB) | Nvidia GeForce GTX 780 (3GB) | Nvidia GeForce GTX 680 (2GB) | AMD Radeon R9 290X (4GB) | AMD Radeon R9 290 (4GB) |
| --- | --- | --- | --- | --- | --- | --- |
| Launch date | September 2014 | September 2014 | May 2013 | March 2012 | October 2013 | November 2013 |
| Codename | GM204 | GM204 | GK110 | GK104 | Hawaii | Hawaii |
| DX API | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 | 11.2 |
| Process (nm) | 28 | 28 | 28 | 28 | 28 | 28 |
| Transistors (millions) | 5,200 | 5,200 | 7,100 | 3,540 | 6,200 | 6,200 |
| Approx Die Size (mm²) | 398 | 398 | 551 | 294 | 438 | 438 |
| Full implementation of die | Yes | No | No | Yes | Yes | No |
| SM Units | 16 | 13 | 12 | 8 | NA | NA |
| Processors | 2,048 | 1,664 | 2,304 | 1,536 | 2,816 | 2,560 |
| Texture Units | 128 | 104 | 192 | 128 | 176 | 160 |
| ROP Units | 64 | 64 | 48 | 32 | 64 | 64 |
| Peak GPU Clock/Boost (MHz) | 1,216 | 1,178 | 900 | 1,058 | 1,000 | 947 |
| Peak GFLOPS (SP) | 4,981 | 3,920 | 4,147 | 3,250 | 5,632 | 4,849 |
| Peak GFLOPS (DP) | 156 | 122 | 173 | 135 | 704 | 606 |
| Memory Clock (MHz) | 7,012 | 7,012 | 6,008 | 6,008 | 5,000 | 5,000 |
| Memory Bus (bits) | 256 | 256 | 384 | 256 | 512 | 512 |
| Max bandwidth (GB/s) | 224 | 224 | 288 | 192 | 320 | 320 |
| Power Connectors | 6+6-pin | 6+6-pin | 8+6-pin | 6+6-pin | 8+6-pin | 8+6-pin |
| TDP (watts) | 165 | 145 | 250 | 195 | 250 | 250 |
| GFLOPS per watt | 30.19 | 27.03 | 16.59 | 16.66 | 22.52 | 19.40 |
| Current price (Newegg) | $549 | $329 | $420 | NA | $460 | $370 |
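The derived rows in the table (peak single-precision GFLOPS, maximum bandwidth and GFLOPS per watt) can be cross-checked from the raw specifications. The sketch below is a minimal Python illustration, assuming two floating-point operations per shader per clock (one fused multiply-add) and treating the memory clock as the effective data rate. The `Card` class and its method names are hypothetical, introduced only for this example. Under those assumptions the results agree with the table's figures to within small rounding differences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Raw specifications taken from the table (hypothetical helper type)."""
    name: str
    shaders: int      # "Processors" row
    boost_mhz: int    # "Peak GPU Clock/Boost (MHz)" row
    mem_mhz: int      # "Memory Clock (MHz)" row, assumed to be the effective rate
    bus_bits: int     # "Memory Bus (bits)" row
    tdp_watts: int    # "TDP (watts)" row

    def peak_gflops_sp(self) -> float:
        # Assumption: 2 FLOPs per shader per clock (one fused multiply-add).
        return self.shaders * 2 * self.boost_mhz / 1000

    def bandwidth_gbs(self) -> float:
        # Effective memory clock (MHz) x bus width (bytes), converted to GB/s.
        return self.mem_mhz * self.bus_bits / 8 / 1000

    def gflops_per_watt(self) -> float:
        return self.peak_gflops_sp() / self.tdp_watts


CARDS = [
    Card("GTX 980", 2048, 1216, 7012, 256, 165),
    Card("GTX 970", 1664, 1178, 7012, 256, 145),
    Card("GTX 780", 2304, 900, 6008, 384, 250),
    Card("GTX 680", 1536, 1058, 6008, 256, 195),
    Card("R9 290X", 2816, 1000, 5000, 512, 250),
    Card("R9 290", 2560, 947, 5000, 512, 250),
]

if __name__ == "__main__":
    for card in CARDS:
        print(
            f"{card.name:8s} "
            f"{card.peak_gflops_sp():7.0f} GFLOPS  "
            f"{card.bandwidth_gbs():4.0f} GB/s  "
            f"{card.gflops_per_watt():6.2f} GFLOPS/W"
        )
```

Peak figures of this kind are theoretical ceilings, which is why they cannot by themselves explain the real-world gap between Maxwell and Kepler.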
GeForce GTX 980: 1+1=3?
The headline GPU is the GeForce GTX 980, and it is based on a full implementation of the GM204 die. Manufactured on the same 28nm process available to both AMD and Nvidia for the last three years, the GTX 980 GPU is 28 per cent and nine per cent smaller than the GTX 780 and Radeon R9 290X, respectively. Smaller dies usually pave the way for cheaper cards.
Compared with the GTX 780, its predecessor in name, the new GPU packs in 11 per cent fewer shading cores, 33 per cent fewer texturing units and a 33 per cent narrower memory bus, and it reduces power consumption by 34 per cent. Yet this GTX 980 is supposed to be quicker than the GTX 780 by around 25 per cent, if Nvidia's internal numbers are to be believed. What's going on, you may ask, when the figures simply don't add up? Maxwell is a GPU where 1+1 doesn't always equal 2.
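The on-paper deficits can be recomputed directly from the specification table. The following is a small Python sketch, with the hypothetical helper `pct_change` and the dictionary names introduced only for illustration; the input values are the GTX 980, GTX 780 and R9 290X figures from the table.

```python
def pct_change(new: float, old: float) -> float:
    """Percentage change from `old` to `new`; negative means a reduction."""
    return (new - old) / old * 100


# Values copied from the specification table.
GTX_980 = {"shaders": 2048, "texture_units": 128, "bus_bits": 256,
           "tdp_watts": 165, "die_mm2": 398}
GTX_780 = {"shaders": 2304, "texture_units": 192, "bus_bits": 384,
           "tdp_watts": 250, "die_mm2": 551}
R9_290X_DIE_MM2 = 438

if __name__ == "__main__":
    for key in GTX_980:
        delta = pct_change(GTX_980[key], GTX_780[key])
        print(f"GTX 980 vs GTX 780, {key}: {delta:+.1f}%")
    die_delta = pct_change(GTX_980["die_mm2"], R9_290X_DIE_MM2)
    print(f"GTX 980 vs R9 290X, die_mm2: {die_delta:+.1f}%")
```

Every resource line comes out negative, which is what makes Nvidia's claimed performance advantage over the GTX 780 look paradoxical on paper.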
The full GM204 implementation
High-level Overview
Before we show you how Maxwell does a whole lot more with fewer on-paper resources - a very fine trick - let's take a look at the 10,000-foot overview. This is GTX 980 in all its pomp. Four graphics processing clusters (GPC) look remarkably similar to those found on Kepler-based GPUs, but the devil is in the details.
The first key differentiator between Maxwell and Kepler is the redesign of the GPC. Maxwell's GPC, in GM204 form, houses four SMM units instead of the three SMX units in Kepler's, though the shader/ALU count is similar because Maxwell drops the number of cores per unit from 192 to 128. Crunch the basic numbers and Maxwell's GPC has 512 cores while Kepler's can harness 576. If you've failed basic maths and need a calculator, four GPCs with 512 cores apiece combine to give the GTX 980 its 2,048 processors/shaders.
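For anyone who does want the calculator, the core-count arithmetic can be written out as a short Python sketch. The constant and function names are hypothetical, chosen for this example; the unit counts come from the text and the specification table.

```python
CORES_PER_SMM = 128   # Maxwell streaming multiprocessor
CORES_PER_SMX = 192   # Kepler streaming multiprocessor
SMM_PER_GPC = 4       # Maxwell GPC, in GM204 form
SMX_PER_GPC = 3       # Kepler GPC
GPCS_IN_GM204 = 4


def total_cores(sm_units: int, cores_per_sm: int) -> int:
    """Shader count for a given number of streaming multiprocessors."""
    return sm_units * cores_per_sm


maxwell_gpc_cores = total_cores(SMM_PER_GPC, CORES_PER_SMM)
kepler_gpc_cores = total_cores(SMX_PER_GPC, CORES_PER_SMX)
gtx_980_cores = GPCS_IN_GM204 * maxwell_gpc_cores
gtx_970_cores = total_cores(13, CORES_PER_SMM)  # 13 SMM units enabled

if __name__ == "__main__":
    print("Cores per Maxwell GPC:", maxwell_gpc_cores)
    print("Cores per Kepler GPC:", kepler_gpc_cores)
    print("GTX 980 cores:", gtx_980_cores)
    print("GTX 970 cores:", gtx_970_cores)
```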
The question then arises of why Nvidia made this change. It's all about improving per-SMM efficiency. You see, the Kepler architecture was introduced well over two years ago with the arrival of the GeForce GTX 680. The intervening time has shown Nvidia a number of ways to improve the design from a utilisation point of view - Maxwell, in a nutshell, is a major architectural overhaul of the Kepler base blueprint.