NVIDIA Tesla K20 and K20X GPU accelerators officially released

by Alistair Lowe on 13 November 2012, 10:45

Quick Link: HEXUS.net/qabo3n

We detailed NVIDIA's GK110 Kepler architecture last week and, only yesterday, revealed that the firm's mystery Tesla K20X was responsible for placing Cray's Titan supercomputer at the top of the supercomputing charts.

Today, NVIDIA has at last officially announced its two new GK110 Tesla cards, the Tesla K20X and Tesla K20, and it's now possible to purchase systems based on these units from OEMs and resellers.

Features | Tesla K20X | Tesla K20 | Tesla K10 | Tesla M2090 | Tesla M2075
Number and type of GPU | 1 Kepler GK110 | 1 Kepler GK110 | 2 Kepler GK104s | 1 Fermi GPU | 1 Fermi GPU
Peak double precision floating point performance | 1.31 Tflops | 1.17 Tflops | 190 Gflops (95 Gflops per GPU) | 665 Gflops | 515 Gflops
Peak single precision floating point performance | 3.95 Tflops | 3.52 Tflops | 4577 Gflops (2288 Gflops per GPU) | 1331 Gflops | 1030 Gflops
Memory bandwidth (ECC off) | 250 GB/sec | 208 GB/sec | 320 GB/sec (160 GB/sec per GPU) | 177 GB/sec | 150 GB/sec
Memory size (GDDR5) | 6 GB | 5 GB | 8 GB (4 GB per GPU) | 6 GB | 6 GB
CUDA cores | 2688 | 2496 | 3072 (1536 per GPU) | 512 | 448

GPU computing applications:
Tesla K20X and K20: Seismic processing, CFD, CAE, Financial computing, Computational chemistry and Physics, Data analytics, Satellite imaging, Weather modeling
Tesla K10: Seismic processing, signal and image processing, video analytics
Tesla M2090 and M2075: Seismic processing, CFD, CAE, Financial computing, Computational chemistry and Physics, Data analytics, Satellite imaging, Weather modeling

Even more of a beast than the Tesla K20, the K20X features an amazing 2,688 CUDA cores on a single die (we assume 16 SMX units), 6GB of RAM and an increased memory throughput. At 1.31 Teraflops of double-precision floating-point performance, the Kepler-based Tesla is twice as powerful in raw figures as its Fermi counterpart and even more so in reality, all whilst utilising less power.
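The "twice as powerful" comparison can be checked against the peak figures in the table. The following is a minimal sketch that takes the Tesla M2090 as the Fermi counterpart, which is an assumption, since the article does not name the card it has in mind. The dictionary keys and the `speedup` helper are illustrative names only.

```python
# Peak figures copied from the comparison table (Gflops and GB/sec).
# Treating the M2090 as "the Fermi counterpart" is an assumption.
K20X = {"double_gflops": 1310.0, "single_gflops": 3950.0, "bandwidth_gbs": 250.0}
M2090 = {"double_gflops": 665.0, "single_gflops": 1331.0, "bandwidth_gbs": 177.0}


def speedup(new: dict, old: dict) -> dict:
    """Return the new/old ratio for every metric present in both cards."""
    return {key: new[key] / old[key] for key in new if key in old}


ratios = speedup(K20X, M2090)
for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.2f}x")
```

On these peak numbers the double-precision ratio comes out just under 2x, while the single-precision ratio is closer to 3x. Peak figures say nothing about sustained performance in real workloads.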

Unlike the K10, the K20 range features a complete internal and external ECC pipeline, along with support for Dynamic Parallelism and Hyper-Q functionality. Use of ECC does, however, come at a cost of 12.5 per cent of memory capacity and a little performance.
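To put the 12.5 per cent figure in concrete terms, here is a small sketch of the capacity left for user data with ECC enabled. It assumes the overhead applies as a flat fraction of the total memory listed in the table; `usable_memory_gb` is a hypothetical helper, not part of any NVIDIA tool.

```python
ECC_OVERHEAD = 0.125  # 12.5 per cent of capacity, as quoted in the article


def usable_memory_gb(total_gb: float, ecc_enabled: bool) -> float:
    """Capacity left for user data, assuming a flat fractional overhead."""
    return total_gb * (1.0 - ECC_OVERHEAD) if ecc_enabled else total_gb


for card, total in (("Tesla K20X", 6.0), ("Tesla K20", 5.0)):
    print(f"{card}: {usable_memory_gb(total, True):.3f} GB usable with ECC on")
```

Under that assumption, the 6 GB K20X would expose 5.25 GB and the 5 GB K20 would expose 4.375 GB once ECC is switched on.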

These are exciting times for GPGPU compute and, with the increasing presence of graphics cores in mainstream processors, we wonder at what point writing GPU code will become a common part of every programmer's daily routine.



HEXUS Forums :: 18 Comments

the K20X features an amazing 2,688 CUDA cores on a single die (we assume 16 SMX units)

If so that would make 168 cores per SMX, giving K20 14.857… SMX ;) More likely the top end part is 14 SMX (== 192 cores per SMX), which gives K20 13 SMX. AFAIK the full die was meant to be 16 SMX, so these are heavily disabled parts. It does wrap up a lot of the conflicting specs we saw a couple of weeks ago though.

I strongly suspect that the K20X won't actually be seen outside Titan: if they're having to chop out 3 SMX to get a saleable K20, I suspect 18,866 good K20X cards is all they've been able to make ;)
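The SMX arithmetic in the comment above is easy to check. This is a rough sketch that asks whether each card's core count divides evenly into SMX units, first at the 192 cores per SMX the comment assumes and then at the 168 that a 16-SMX K20X would imply. `smx_count` is an illustrative helper name.

```python
CORES_PER_SMX = 192  # Kepler SMX width assumed in the comment above


def smx_count(cuda_cores: int, cores_per_smx: int = CORES_PER_SMX):
    """Return the SMX count if the cores divide evenly, otherwise None."""
    units, remainder = divmod(cuda_cores, cores_per_smx)
    return units if remainder == 0 else None


for card, cores in (("Tesla K20X", 2688), ("Tesla K20", 2496)):
    print(card, "->", smx_count(cores), "SMX at 192 cores each;",
          smx_count(cores, 2688 // 16), "SMX at 168 cores each")
```

Only the 192-core width gives a whole number of SMX units for both cards, which is the point the comment makes.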
hehe - We always read a lot more about Nvidia yield issues compared to others, so there's some truth in that. I wonder what the issues are and how they work with TSMC at resolving them?

I say that because in the SoC marketplace, if they get too ‘clever’ with the GPU design on the replacement for Tegra3, I doubt there's room for yield issues…. and these issues seem to have started before Fermi and continued into Kepler.

No denying these GPGPUs are mighty, and given the ‘kudos’ of being selected for the most powerful supercomputer there's even more to them than that…..2688 cores ffs!!!

Because of this there are such high hopes for the GTX780; let's hope people won't be disappointed in one way or another.

I have to say, pricing on graphics cards is massively OTT right now. If these prices are maintained then it will slow down people's upgrade path. I like shinies and I always enjoyed upgrading, but I don't game at 2560x1440 or have a multi-monitor set-up, so my GTX570 has more than enough life left in it… I can't justify the current prices for a minor speed bump.
Let's not get ahead of ourselves. This means very little for the consumer enthusiast market as far as I can tell, as the yields would be so poor that the cards will be mighty expensive and only sold in relatively small quantities. AMD is certainly the best bet for GPGPU in the consumer market: its cards are priced similarly to Nvidia's small-die “gaming only” cards and offer similar performance, and in the GPGPU space they decimate Nvidia.

Nvidia have a long way to go in getting those chips down to acceptable yields to combat AMD's current strength.
Hicks12
Let's not get ahead of ourselves. This means very little for the consumer enthusiast market as far as I can tell, as the yields would be so poor that the cards will be mighty expensive and only sold in relatively small quantities. AMD is certainly the best bet for GPGPU in the consumer market: its cards are priced similarly to Nvidia's small-die “gaming only” cards and offer similar performance, and in the GPGPU space they decimate Nvidia.

Nvidia have a long way to go in getting those chips down to acceptable yields to combat AMD's current strength.


Both the K20 and K20X have the core and the shaders running at between 700MHz and 750MHz. The consumer cards have the core and shaders running at over 900MHz, with boost probably running them at anything between 1100MHz and 1300MHz, dependent on the model, conditions and production variations in the GPUs themselves.

The K20X does have 2688 shaders as opposed to the 1536 the GTX680 has, and has more memory bandwidth too. However, the 50% to 70% real-world increase in clock speeds of the GTX680 negates the shader advantage IMHO. It is probably the improvement in memory bandwidth and the increase in ROPs which would be the main reason for any performance increase over the GTX680.
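The shader-count versus clock-speed trade-off can be roughed out with the usual peak-throughput formula: cores × 2 operations per clock (one fused multiply-add) × clock speed. The sketch below back-calculates the Tesla clocks from the article's single-precision figures and compares the K20X against a GTX680 at the boost clocks suggested in the comment. Those GTX680 clocks are the commenter's estimates, not official figures, and the function names are illustrative.

```python
FLOPS_PER_CORE_PER_CLOCK = 2  # one fused multiply-add counted as two operations


def implied_clock_mhz(peak_gflops: float, cuda_cores: int) -> float:
    """Clock needed to reach a given peak single-precision figure."""
    return peak_gflops * 1000.0 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK)


def peak_single_gflops(cuda_cores: int, clock_mhz: float) -> float:
    """Theoretical peak single-precision throughput in Gflops."""
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_mhz / 1000.0


k20x_clock = implied_clock_mhz(3950.0, 2688)  # 3.95 Tflops, 2688 cores
k20_clock = implied_clock_mhz(3520.0, 2496)   # 3.52 Tflops, 2496 cores
print(f"K20X ~{k20x_clock:.0f} MHz, K20 ~{k20_clock:.0f} MHz")

for gtx680_clock in (1100.0, 1300.0):  # boost range suggested in the comment
    ratio = 3950.0 / peak_single_gflops(1536, gtx680_clock)
    print(f"GTX680 at {gtx680_clock:.0f} MHz: K20X peak is {ratio:.2f}x")
```

Under these assumptions the implied Tesla clocks fall inside the 700 to 750 MHz range quoted above, and the K20X's peak single-precision lead over the GTX680 shrinks to roughly parity at the top of the suggested boost range. Peak arithmetic ignores memory bandwidth and ROPs, so it only bounds the comparison.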
scaryjim
If so that would make 168 cores per SMX, giving K20 14.857… SMX ;) More likely the top end part is 14 SMX (== 192 cores per SMX), which gives K20 13 SMX. AFAIK the full die was meant to be 16 SMX, so these are heavily disabled parts. It does wrap up a lot of the conflicting specs we saw a couple of weeks ago though.

I strongly suspect that the K20X won't actually be seen outside Titan: if they're having to chop out 3 SMX to get a saleable K20, I suspect 18,866 good K20X cards is all they've been able to make ;)

From AT:
Interestingly NVIDIA tells us that their yields are terrific – a statement backed up in their latest financial statement – so the problem NVIDIA is facing appears to be demand and allocation rather than manufacturing.

Of course the yield is most likely in relation to the 14-cluster parts downwards, but maybe that was their initial target anyway?

Worth a read: http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last