Worth the Wait: NVIDIA's Kepler GeForce GTX 680 is New Graphics Market King

By sophiesummers on 5:13 PM

New GPU is more powerful, but also quieter, cooler; beats AMD's similar offering in price

In January, Advanced Micro Devices Inc. (AMD) shipped the world's first 28 nm graphics processing unit, Tahiti.  Leveraging AMD's long-awaited new architecture, Graphics Core Next, the ensuing Radeon HD 7950/70 card snatched the performance crown away from rival NVIDIA Corp. (NVDA).

In the months that followed, AMD fleshed out its lineup with four more cards, the Radeon HD 7750/70 and the Radeon HD 7850/70.  While pricing was a bit high, in all but the Radeon HD 7700 series the AMD card was the best buy because NVIDIA's 28 nm counterpunch, Kepler, was missing in action.


I. The New Gaming King
Missing in action, that is, until now.  A week after a Kepler-powered ultrabook popped up, NVIDIA has pulled the wraps off of its flagship desktop Kepler graphics card, the GeForce GTX 680.

Almost everything in the GK104 architecture chip has been improved.  The die is a petite 294 mm², with 3.5b transistors onboard, versus AMD's 365 mm², 4.3b transistor Tahiti.  Likewise, NVIDIA not only one-ups AMD in core clock speed (1008 MHz on the GTX 680 vs. 925 MHz on the Radeon HD 7970), but it also installs a promising new dynamic clocking system, which allows smartphone-esque throttling up or down, based on performance demands.

In "unlocked" card models, NVIDIA expects the card to dip as low as 325 MHz at idle, allowing massive power savings.  On the opposite end of the spectrum, under extremely demanding workloads, unlocked cards can dynamically clock up over the 1.1 GHz barrier, all automatically.

NVIDIA's frame buffer (memory) is a bit smaller -- 2 GB of GDDR5 vs. 3 GB of GDDR5 in the Radeon HD 7970, and the bus is narrower -- 256-bit vs. 384-bit.  Despite NVIDIA holding a slight edge in memory clock (6.008 GHz vs. 5.5 GHz), memory throughput will likely favor AMD.
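To see why the wider bus outweighs the faster memory clock, here is a rough back-of-the-envelope sketch.  It assumes the quoted memory clocks are effective per-pin data rates and that peak bandwidth is simply data rate times bus width; the function name is made up for illustration, and real-world throughput will be lower than these theoretical peaks.

```python
def peak_bandwidth_gb_s(effective_clock_ghz: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumes the clock is the effective data rate per pin (an assumption,
    not something the article states) and 8 bits per byte.
    """
    return effective_clock_ghz * bus_width_bits / 8


# Figures taken from the paragraph above.
gtx_680 = peak_bandwidth_gb_s(6.008, 256)
hd_7970 = peak_bandwidth_gb_s(5.5, 384)

print(f"GTX 680:        {gtx_680:.1f} GB/s")
print(f"Radeon HD 7970: {hd_7970:.1f} GB/s")
print(f"AMD advantage:  {hd_7970 / gtx_680 - 1:.0%}")
```

Under those assumptions the Radeon HD 7970 comes out well ahead on paper, even though its memory is clocked lower.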

Gaming-wise, AnandTech's testing shows the GTX 680 to be faster in almost all games, though the AMD flagship manages to eke out a win in some tests.  In power and heat NVIDIA has dramatically improved over the 500 series, but it only earns a tie with AMD.  However, it is much quieter than AMD's cards.


II. GPU Computing -- Some Steps Forward, Some Spinning of the Wheels
The new card mostly impresses when it comes to GPU computing.

The card streamlines the Fermi architecture, eliminating the high-performance but divergent higher shader clock.  In its place, it uses the core clock uniformly across all of its computing functional units.  To compensate, most of the components of its functional units have doubled -- such as the number of CUDA cores, load/store units, and special function units.  For example, the CUDA core count in a block within a functional unit doubles from 16 to 32.  As a result, NVIDIA is able to keep pace on a functional unit level even while eliminating its higher-performance shader clock.

To move things forward, NVIDIA then doubles the number of "blocks" of cores from 3 to 6 per functional unit, effectively doubling performance.  In total, 192 CUDA cores (6 blocks of 32) now lurk inside a GK104 streaming multiprocessor (SM), vs. 48 per SM (3 blocks of 16 cores) in the previous generation architecture.

SMs are grouped in blocks called GPCs.  There are twice as many GPCs (4) as in Fermi (2), but they each have half the number of SMs (2 vs. 4 in Fermi), so the SM count stays the same.
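The core-count arithmetic is easy to lose track of, so here is a small sketch that tallies it up.  It takes 32 cores per block, 6 blocks per SM, 2 SMs per GPC and 4 GPCs for GK104, against 16, 3, 4 and 2 for the previous generation.  The `GpuLayout` class and its field names are invented for illustration and do not correspond to any NVIDIA API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuLayout:
    """Hypothetical helper describing how CUDA cores are grouped on a chip."""
    cores_per_block: int
    blocks_per_sm: int
    sms_per_gpc: int
    gpcs: int

    @property
    def cores_per_sm(self) -> int:
        return self.cores_per_block * self.blocks_per_sm

    @property
    def sm_count(self) -> int:
        return self.sms_per_gpc * self.gpcs

    @property
    def total_cores(self) -> int:
        return self.cores_per_sm * self.sm_count


# Figures as described in the article.
gk104 = GpuLayout(cores_per_block=32, blocks_per_sm=6, sms_per_gpc=2, gpcs=4)
previous_gen = GpuLayout(cores_per_block=16, blocks_per_sm=3, sms_per_gpc=4, gpcs=2)

for name, chip in (("GK104", gk104), ("Previous gen", previous_gen)):
    print(f"{name}: {chip.cores_per_sm} cores/SM x {chip.sm_count} SMs "
          f"= {chip.total_cores} cores")
```

With the SM count unchanged, the entire increase in core count comes from the wider SMs.  Half of that increase offsets the loss of the faster shader clock, which is why the article describes the net effect as roughly a doubling.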

A couple of oddities remain.  First, NVIDIA declines to boost the shared memory space from 64 kB (a disappointment considering 192 cores are now sharing the resources previously shared by 48 cores).  Second, it offers 8 special CUDA cores per functional unit that run 64-bit floating point (FP64) at the full 1/1 rate relative to 32-bit floating point (FP32).  This is the first GPU computing chip to ever offer 1/1 FP64 vs. FP32; however, that achievement is dulled by the fact that there are only 8 of these cores per functional unit, meaning an effective speed of 1/4 FP64 per functional unit or 1/24 FP64 per SM.
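The two FP64 ratios quoted above appear to come from dividing the 8 FP64-capable cores by the 32 cores in a block and by the 192 cores in an SM, respectively.  That reading is an assumption about how the figures were derived, and the sketch below only reproduces the arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

FP64_CORES = 8        # full-rate FP64 cores, per the article
CORES_PER_BLOCK = 32  # FP32 CUDA cores in one block
CORES_PER_SM = 192    # FP32 CUDA cores in one SM (6 blocks of 32)

# Effective FP64 throughput relative to FP32, assuming each FP64 core
# matches one FP32 core clock-for-clock (the "1/1" rate).
fp64_vs_block = Fraction(FP64_CORES, CORES_PER_BLOCK)
fp64_vs_sm = Fraction(FP64_CORES, CORES_PER_SM)

print(f"FP64 rate relative to one block: {fp64_vs_block}")
print(f"FP64 rate relative to one SM:    {fp64_vs_sm}")
```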

Still, for all its gains in GPU computing, AnandTech's benchmarking shows it to only be roughly on par with AMD's flagship card, winning in some GPU compute benchmarks and losing in others.  Of course, a tie still works in NVIDIA's favor as it has arguably the best supported GPU programming API -- CUDA -- which is slightly easier to learn and master than OpenCL, thanks in part to the large amount of resources and support NVIDIA throws at developers.


III. Buy One if You Can
NVIDIA's card is available today for $500 USD.  NVIDIA is going to tell you that it's the fastest card on the market and toss out terms like "revolutionary".  The good news is that when it comes to gaming it is a solid card, though it's less of a revolution and more of a nice iterative bump.

Still, that bump is enough to make it the new king of the graphics market on the high end.

The choice is now easy for customers -- buy a GTX 680.  That's the good news.

The bad news is that the choice may not be that easy.  AnandTech writes that NVIDIA indicated that launch supplies may be slightly scarce.  Thus it's very possible that GTX 680s could be sold out, taking this option off the table temporarily.

This all gets back to the yield difficulties reportedly experienced by Taiwan Semiconductor Manufacturing Co., Ltd. (TPE:2330) on its new 28 nm node.  Like AMD, NVIDIA is likely aggressively binning the good chips coming off the line for use in its flagship cards, but the problem is that higher quality 28 nm silicon appears to be having very low yields.  As a result, expect supply of NVIDIA's unannounced lower-end Kepler derivatives to be a bit more liberal, but expect them to have lower clock speeds, similar to AMD's chips.

So get your hands on the GTX 680 if you can find one -- it's the best thing you can find -- for now -- until the rumored "Big Kepler" comes along.

View the original article here
