Review: BFG (NVIDIA) GeForce GTX 280: does it rock our world?

by Tarinder Sandhu on 16 June 2008, 14:01

Tags: GeForce GTX 280, NVIDIA (NASDAQ:NVDA), BFG Technologies

Tell me how it stacks up

Trotting out the comparison table

Graphics cards NVIDIA GeForce GTX 280 1024 NVIDIA GeForce GTX 260 896 NVIDIA GeForce 9800 GX2 1024 NVIDIA GeForce 9800 GTX 512 NVIDIA GeForce 8800 GTS 512 NVIDIA GeForce 8800 Ultra 768 ATI Radeon HD 3870 X2 1024 ATI Radeon HD 3870 512
PCIe PCIe 2.0 PCIe 1.x PCIe 2.0
GPU clock 602MHz 576MHz 600MHz 675MHz 650MHz 612MHz 825MHz 775MHz
Shader clock 1,296MHz 1,242MHz 1,500MHz 1,688MHz 1,625MHz 1,500MHz 825MHz 775MHz
Memory clock (effective) 2,214MHz 1,998MHz 2,000MHz 2,200MHz 1,940MHz 2,160MHz 1,802MHz 2,250MHz
Memory interface, size, and implementation 512-bit, 1,024MiB, GDDR3 448-bit, 896MiB, GDDR3 2x 256-bit, 1,024MiB, GDDR3 256-bit, 512MiB, GDDR3 384-bit, 768MiB, GDDR3 2x 256-bit, 1,024MiB, GDDR3 256-bit, 512MiB, GDDR4
Memory bandwidth 141.7GiB/sec 111.90GiB/sec 128GiB/sec (card) 70.40GiB/sec 62.1GiB/sec 103.68GiB/sec 115.33GiB/sec (card) 72.8GiB/sec
Manufacturing process TSMC, 65nm TSMC, 90nm TSMC, 55nm
Transistor count 1,408M 1,408M 1,508M 754M 681M 1,300M 666M
Die size Unknown (big) Unknown (big) 2x 296mm² 330mm² 484mm² 2x 192mm² 192mm²
DirectX Shader Model DX10, 4.0 DX10.1, 4.1
Vertex, fragment, geometry shading (shared) 240 FP32 scalar ALUs, MADD dual-issue (unified) 192 FP32 scalar ALUs, MADD dual-issue (unified) 256 FP32 scalar ALUs, MADD dual-issue (unified) 128 FP32 scalar ALUs, MADD dual-issue (unified) 128 FP32 scalar ALUs, MADD dual-issue (unified) 640 FP32 scalar ALUs, MADD dual-issue (unified) 320 FP32 scalar ALUs, MADD dual-issue (unified)
Peak GFLOP/s 933* 715* 768/1152* 432/648* 416/624* 384/576* 1,056 496
Data sampling and filtering 80ppc address and 80ppc bilinear (8-bit integer)/40ppc FP16 filtering, max 16xAF 64ppc address and 64ppc bilinear (8-bit integer)/32ppc FP16 filtering, max 16xAF 128ppc address and 128ppc bilinear INT8/64ppc FP16 filtering, max 16xAF 64ppc address and 64ppc bilinear INT8/32ppc FP16 filtering, max 16xAF 64ppc address and 64ppc bilinear INT8/32ppc FP16 filtering, max 16xAF 32ppc address and 32ppc bilinear INT8/32ppc FP16 filtering, max 16xAF 32ppc address and 32ppc bilinear INT8/FP16 filtering, max 16xAF 16ppc address and 16ppc bilinear INT8/FP16 filtering, max 16xAF
Peak fillrate Gpixels/s 19.264 16.128 19.2 10.8 10.4 14.688 26.4 12.4
Peak Gtexel/s (bilinear) 48.16 36.864 76.8 43.2 41.6 19.584 26.4 12.4
Peak Gtexel/s (FP16, bilinear) 24.08 18.432 38.4 21.6 20.8 19.584 26.4 12.4
ROPs 32 28 32 16 16 24 32 16
Peak TDP (claimed) 236 182 196 156 140 175 196 110
Power connectors (default clock) 8-pin + 6-pin 6-pin + 6-pin 8-pin + 6-pin 6-pin + 6-pin 6-pin 6-pin + 6-pin 8-pin + 6-pin 6-pin
Multi-GPU SLI - three-board SLI - three-board SLI - two-board SLI - three-board SLI - two-board SLI - three-board CrossFire - two-board CrossFire - four-board
Outputs 2 x dual-link DVI w/HDCP, HDMI, mini-DIN 2 x dual-link DVI w/HDCP, mini-DIN 2 x dual-link DVI w/HDCP, HDMI 2 x dual-link DVI w/HDCP, HDMI (native, on GPU) 2 x dual-link DVI w/HDCP, mini-DIN 2 x dual-link DVI w/HDCP (discrete ASIC), mini-DIN 2 x dual-link DVI (HDMI) w/HDCP, mini-DIN (VIVO)
Hardware-assisted video-decoding engine NVIDIA's PureVideo HD - full H.264 decode and partial VC-1 decode NVIDIA PureVideo HD 1st gen AMD UVD - full H.264 and VC-1 decode
Reference cooler dual-slot dual-slot dual-slot dual-slot dual-slot dual-slot dual-slot dual-slot
Retail price (default-clocked model) £449 £299** £299 £185 £145 £299 (hard to find) £239 £89


* calculated on a three-FLOPs-per-clock basis. Other GeForces are shown with both two- and three-FLOPs throughput.
** predicted pricing

GeForce GTX 260: you've not told me about that yet

The 'budget' next-generation offering is the GTX 260. Based on GTX 280 (duh!), it's a case of tried-and-trusted snipping of various counts. The table shows that it's clocked lower on all fronts and has fewer stream processors - 192 versus 240 - spread over eight clusters versus 10. It also has one ROP partition removed, resulting in seven blocks of four, or 28 ROPs in total.

Following on from the ROP lop-off, GTX 260 has a seven-channel memory interface, with each channel 64 bits wide. Add it together and you have a 448-bit interface with 2GHz-rated memory. Calculator-time tells you that's potential bandwidth in the region of 112GiB/s (448/8 x 1,998).
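For anyone who wants to redo the calculator-time, here is a minimal Python sketch of how the GTX 260's headline numbers fall out of the cluster and partition counts. The class and constant names are our own invention for illustration, and the 24-processors-per-cluster figure is simply inferred from 240 processors over 10 clusters; none of this is NVIDIA-supplied code.

```python
from dataclasses import dataclass

# Assumed per-unit figures, inferred from the numbers quoted in the article.
SPS_PER_CLUSTER = 24      # 240 stream processors / 10 clusters on GTX 280
ROPS_PER_PARTITION = 4    # "seven blocks of four, or 28"
BITS_PER_CHANNEL = 64     # one 64-bit memory channel per ROP partition


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical container for the counts discussed in the text."""
    clusters: int
    rop_partitions: int
    effective_mem_clock_mhz: int

    @property
    def stream_processors(self) -> int:
        return self.clusters * SPS_PER_CLUSTER

    @property
    def rops(self) -> int:
        return self.rop_partitions * ROPS_PER_PARTITION

    @property
    def bus_width_bits(self) -> int:
        return self.rop_partitions * BITS_PER_CHANNEL

    @property
    def bandwidth_per_sec(self) -> float:
        # Bytes per transfer x effective MHz, scaled by 1,000 to match the
        # table's headline figure (the table labels the unit GiB/sec).
        return self.bus_width_bits / 8 * self.effective_mem_clock_mhz / 1000


gtx280 = GpuConfig(clusters=10, rop_partitions=8, effective_mem_clock_mhz=2214)
gtx260 = GpuConfig(clusters=8, rop_partitions=7, effective_mem_clock_mhz=1998)

for name, gpu in (("GTX 280", gtx280), ("GTX 260", gtx260)):
    print(f"{name}: {gpu.stream_processors} SPs, {gpu.rops} ROPs, "
          f"{gpu.bus_width_bits}-bit, {gpu.bandwidth_per_sec:.1f}/sec")
```

Note that the sum is a plain decimal one (bytes multiplied by megahertz), so the result lines up with the table's figures only because the table does the same arithmetic.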

The financial implication is that it will be cheaper, by around £150, we reckon.

The meat on the bones

The GPU clocks of 602/576MHz tie in with what NVIDIA has pushed out in the last 18 months. Knowing that the GTX 280 can bilinear-filter 80ppc and bilinear FP16 (16-bit floating point) filter 40ppc, the peak Gtexel/s throughput is higher than that of any single GPU that has gone before.

However, the twin-GPU GeForce 9800 GX2 handsomely beats out GTX 280 in both bilinear INT8 and FP16 throughput. Similarly, the twin-GPU Radeon HD 3870 X2 FP16 throughput is higher, too.

Even so, a healthy dose of ROPs also keeps the GTX 280's fillrate the highest of any single GPU.
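The fillrate and texturing comparisons above are all the same sum: units per clock multiplied by the core clock. A rough Python sketch, using only figures from the table (the function and dictionary names are ours, and the twin-GPU entries are whole-card totals, as the table lists them):

```python
def peak_rate_g(units_per_clock: int, core_clock_mhz: int) -> float:
    """Peak rate in giga-operations per second: units per clock x MHz / 1,000."""
    return units_per_clock * core_clock_mhz / 1000


# name: (core clock MHz, ROPs, INT8 bilinear ppc, FP16 bilinear ppc)
CARDS = {
    "GTX 280": (602, 32, 80, 40),
    "9800 GTX": (675, 16, 64, 32),
    "9800 GX2": (600, 32, 128, 64),    # card totals for two GPUs
    "HD 3870 X2": (825, 32, 32, 32),   # card totals for two GPUs
}


def rates(name: str) -> dict:
    """Return peak pixel fillrate and bilinear texel rates for one card."""
    clock, rops, int8_ppc, fp16_ppc = CARDS[name]
    return {
        "gpixels": peak_rate_g(rops, clock),
        "gtexels_int8": peak_rate_g(int8_ppc, clock),
        "gtexels_fp16": peak_rate_g(fp16_ppc, clock),
    }


for name in CARDS:
    r = rates(name)
    print(f"{name}: {r['gpixels']:.3f} Gpixels/s, "
          f"{r['gtexels_int8']:.2f} Gtexel/s INT8, "
          f"{r['gtexels_fp16']:.2f} Gtexel/s FP16")
```

These are theoretical peaks only; they say nothing about how often a real game keeps every unit busy.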

On the down side, the shader clock is reduced from the 1,500MHz+ that we're accustomed to seeing of late. Assuming that we count the design as issuing three FLOPs per clock, the peak GFLOPS rate is still good, helped, no doubt, by the voluminous processors at hand. There's plenty of shading horsepower that's allied to impressive fillrate and acceptable 16-bit floating-point bilinear filtering, then.
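The peak GFLOPS figures in the table can be reproduced with a one-line formula: ALU count, multiplied by FLOPs issued per clock, multiplied by the shader clock. The sketch below is our own illustration of that sum (the function name is made up), with the two- versus three-FLOPs choice left as a parameter because it is a counting assumption rather than a measured fact.

```python
def peak_gflops(alus: int, shader_clock_mhz: int, flops_per_clock: int) -> float:
    """Theoretical peak GFLOPS: ALUs x FLOPs per clock x shader MHz / 1,000."""
    return alus * flops_per_clock * shader_clock_mhz / 1000


# (ALUs, shader clock in MHz), taken from the comparison table
GTX_280 = (240, 1296)
GTX_260 = (192, 1242)
GF_9800_GTX = (128, 1688)

for name, (alus, clock) in (("GTX 280", GTX_280),
                            ("GTX 260", GTX_260),
                            ("9800 GTX", GF_9800_GTX)):
    two = peak_gflops(alus, clock, flops_per_clock=2)
    three = peak_gflops(alus, clock, flops_per_clock=3)
    print(f"{name}: {two:.0f} GFLOPS (two-FLOPs) / {three:.0f} GFLOPS (three-FLOPs)")
```

On the three-FLOPs count the GTX 280 works out at roughly 933 GFLOPS, so the lower shader clock is more than offset by having 240 ALUs rather than 128.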

GTX 280 uses a 512-bit memory interface that's paired with high-speed GDDR3 RAM operating at an effective 2,214MHz. Adding it up, GTX 280 offers over 140GiB/s of juicy bandwidth. As we noted above, GTX 260's bandwidth is reduced on two fronts: interface size and DRAM speed. This results in around 112GiB/s. Both figures are higher than on any single GPU released in volume.

Power and form

The big, meaty GPUs in the GTX 200-series are based on a 65nm manufacturing process, unchanged from the newer 8-series and all 9-series GPUs.

That means heat, and lots of it. We've alluded to a maximum TDP of 236W for the range-topping model. The GTX 260 fares a little better, drawing up to 182W - but even that is a high number.

We're getting into the realms where air cooling simply won't cut it. A smaller manufacturing process reduces the TDP, but it also shrinks the die, so there's still significant wattage to shift per square mm. Obligatory liquid-based cooling isn't that far off, we reckon.

NVIDIA has designed a revised dual-slot cooler to stop the beasties getting too warm but we'd have preferred to have seen the GPUs on the half-node 55nm process. Doubtless that will come later in the year.

Outputs

Nothing much has changed here, either. Unlike ATI with its 3000-series GPUs, there's no native HDMI or DisplayPort provision, and both will be added, by AIBs, via separate ASICs. The video-processing engine remains the same as 9-series, too.

Price and launch SKUs

All the architecture talk in the world can be rendered irrelevant by pricing. At time of writing, UK pricing for default-clocked cards gravitated towards £449 (ouch!), while US pricing is around $649 and Euro pricing €499.

As demand tends to outstrip supply in the first week of a new architecture's release, we expect the pricing to drop around 10 per cent within a month. Still, it's hugely expensive for a card with a single GPU.

Looking at the immediate past, ATI has forsaken performance leadership and concentrated on the value perspective. NVIDIA's big and hot GPU almost needs to be expensive, to counter the die cost and engineering that's gone into it.

What NVIDIA will be aiming to do, and most likely will do, is derive lower-cost SKUs by whipping out the old architecture-busting hammer - and get them to retail quickly before ATI is able to respond with its next-generation mid-range part, RV770.

Basic summary

The specifications tell us that GeForce GTX 280 will be the fastest single-GPU card around: no question, really. It has more shading, fillrate, multitexturing, and memory bandwidth than any other single GPU shown in the table above.

What's interesting is that it may not be faster than the fudged-together GeForce 9800 GX2 in certain circumstances that rely on heavy texturing, and that card is, in effect, based on a couple of GPUs that were available around 18 months ago.

NVIDIA is keen to push the GPGPU or parallel computing architecture facets of the GTX 280/260, but don't let that fool you; all GeForce 8-series and 9-series cards can run CUDA and accelerate non-gaming tasks.

Bigger, faster, wider, more power-hungry, NVIDIA's taken the brute strength approach with the newest iteration of GPUs. We had hoped for something a little more elegant, however.

On to the tootin', rootin' card now.