Shotgun processor wedding?
AMD's new platform has been informally dubbed either 4x4 or Quadfather, but it is now officially called Quad FX. The 4x4 name was derived from the ability to have four processor cores and four graphics cores in one system. To achieve this, AMD has made a curious hybrid of desktop and workstation technologies. Like the latest Opterons, the platform is based on a pair of 1,207-contact LGA sockets, although the name of these has changed from Socket F to Socket L1FX. But it doesn't use NVIDIA's nForce Professional 3000-series chipset, instead calling on a variant of NVIDIA's consumer-oriented 600 series. This isn't such a new idea, as there are dual Opteron Socket 940 workstation boards on the market which use NVIDIA's nForce 4, such as MSI's K8N Master2-FAR. The 600-series chipset also means Quad FX can use regular DDR2 memory rather than registered ECC DIMMs, opening the platform up to mainstream high-performance DIMMs and overclocking.
NVIDIA's nForce 680a SLI chipset is not really that far off the nForce 680i SLI recently released for Intel processors, either. The biggest difference revolves around Intel's use of a Front Side Bus versus AMD's incorporation of an on-die memory controller. Two distinct chips are used in nForce 680i SLI. The SPP integrates a memory controller and connects to the Intel CPU via a Front Side Bus. This hosts one of the PCI Express 16x slots and the auxiliary PCI Express slots. It then connects via HyperTransport to the MCP, which hosts the second PCI Express 16x slot, the third PCI Express 16x slot which only runs at 8x electrically, and all the peripheral connections such as PCI, SATA, USB, Ethernet LAN and integrated audio. This configuration is basically the same as the traditional Northbridge/Southbridge chipsets of the last decade. The SPP takes the role of the Northbridge and the MCP is essentially a Southbridge, except that the second and third graphics slots also hang off the MCP, as we just described.
Since AMD's processors don't use a Front Side Bus, have a memory controller built in, and already communicate with the outside world via HyperTransport, they don't need a Northbridge. All they require is an MCP for a basic system without SLI. For more PCI Express lanes, a couple of chips can be daisy-chained together, which is what a number of NVIDIA chipsets already do. The new nForce 680a SLI uses two chips as well, in this case two virtually identical MCP-like parts. Each drives two PCI Express 16x slots, one of which operates at 8x electrically, making a total of four graphics slots. Together they provide support for four Gigabit Ethernet connections, up to 20 USB 2.0 ports, and 12 SATA-2 connections running at 3Gbits/sec. One chip takes care of legacy PCI slots, whilst the other handles any non-graphics PCI Express.
However, if you look closely at the architectural diagram for nForce 680a SLI you will notice that rather than running its two chips serially, as with 680i SLI, each talks directly to one of the processors via its own dedicated HyperTransport link. This is one of the reasons why AMD has based Quad FX on its Opteron platform. The AMD64 core has been built with three HyperTransport links since launch, but in a consumer single-processor configuration only one is used. The trio of links only comes into its own in a multi-processor Opteron configuration.
Aside from two of those links allowing the two 680a chips (and their attached graphics cards) to be addressed independently, the third has special characteristics. It's called a coherent HyperTransport link because it can be used to transfer cache coherency information using the MOESI system which AMD operates in all its 64-bit processors. This is a very important feature, and is one of the capabilities which still gives AMD an advantage over Intel.
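To make the MOESI idea a little more concrete, here is a minimal sketch of how a single cache line might move between the five states (Modified, Owned, Exclusive, Shared, Invalid) when two caches read and write it. This is a simplified textbook model, not AMD's actual implementation: the `CacheLine` class, its method names and the "hit"/"cache"/"memory" return values are all hypothetical, and the real protocol has more transitions than are shown here.

```python
# Simplified textbook MOESI model for ONE cache line shared by several caches.
# Class and method names are illustrative assumptions, not AMD's design.
M, O, E, S, I = "M", "O", "E", "S", "I"


class CacheLine:
    def __init__(self, n_caches=2):
        # Every cache starts without a copy of the line.
        self.state = [I] * n_caches

    def read(self, i):
        """Cache i reads the line; returns where the data came from."""
        if self.state[i] != I:
            return "hit"
        source = "memory"
        for j, s in enumerate(self.state):
            if j == i:
                continue
            if s in (M, O):
                # A dirty copy must be supplied by the cache that owns it.
                self.state[j] = O
                source = "cache"
            elif s == E:
                # A clean exclusive copy is downgraded to shared.
                self.state[j] = S
        others_hold = any(s != I for j, s in enumerate(self.state) if j != i)
        self.state[i] = S if others_hold else E
        return source

    def write(self, i):
        """Cache i writes the line: all other copies are invalidated."""
        for j in range(len(self.state)):
            if j != i:
                self.state[j] = I
        self.state[i] = M


line = CacheLine()
print(line.read(0), line.state)   # first read is served from memory
line.write(0)
print(line.state)                 # writer now holds the only, dirty copy
print(line.read(1), line.state)   # second cache is served by the first
```

The Owned state is what lets the second read be satisfied cache-to-cache without first writing the dirty data back to main memory, which is the behaviour the coherent link exists to support.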
Since every AMD chip has its own memory controller, if you are running two of them you have two memory controllers. Like any self-respecting dual-Opteron motherboard, the Quad FX platform has two memory banks - one for each chip, as you can see above. Populate both, and you can theoretically take advantage of twice the memory bandwidth. Using the coherent HyperTransport link, one processor can access the memory controller on the other processor, and access data from its bank of DDR2. But it can also request data from the L1 and L2 caches on the other processor's two cores, which will be quicker than a call to main memory.
In contrast, Intel's Core 2 Quad incorporates two dual-core dies which can't access each other's Level 2 cache directly. Although each dual-core die has a shared Level 2 cache pool, the two separate dies can only exchange data slowly via main memory, despite sitting next to each other.
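The "twice the memory bandwidth" claim for the two memory banks can be checked with some quick arithmetic. The sketch below assumes dual-channel DDR2-800 with 64-bit channels on each socket. Those figures are illustrative assumptions, since the memory speed is not specified here, and the function name is hypothetical.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, channels, bus_width_bits=64):
    """Theoretical peak bandwidth of one memory controller, in GB/s (10^9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9


# Assumed configuration: dual-channel DDR2-800 per processor.
per_socket = peak_bandwidth_gb_s(800, channels=2)
both_sockets = 2 * per_socket

print(f"One memory controller:  {per_socket:.1f} GB/s")
print(f"Two memory controllers: {both_sockets:.1f} GB/s")
```

These are theoretical peaks only. Any request that has to cross the coherent HyperTransport link to reach the other processor's bank will see less than this.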
The dual independent memory controllers and memory banks of Quad FX turn it into a Non-Uniform Memory Architecture (NUMA) system. For this to work properly, however, your operating system needs to be NUMA-aware. On Windows XP SP2, a /PAE switch is required in the boot.ini for support. This prevents an application from loading data into the memory bank of one processor and then trying to access it from the other, which is clearly not going to give you the best performance. However, Windows XP is still considered a bit of a hack where NUMA is concerned, and only Windows Vista has full native support.
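The effect of NUMA-aware placement can be illustrated with a toy model. In the sketch below, the relative costs for local and remote accesses are made-up numbers chosen only to show the principle, and all the function names are hypothetical. It is not a description of how Windows actually schedules memory.

```python
# Toy model of memory placement on a two-node NUMA system.
LOCAL_COST = 1.0    # assumed relative cost of reading the processor's own bank
REMOTE_COST = 1.5   # assumed relative cost of a read over the coherent link


def access_cost(cpu_node, memory_node):
    """Cost of one access from a thread on cpu_node to data on memory_node."""
    return LOCAL_COST if cpu_node == memory_node else REMOTE_COST


def place_naive(thread_nodes):
    """Put every buffer in node 0's bank, wherever the thread runs."""
    return [0 for _ in thread_nodes]


def place_numa_aware(thread_nodes):
    """Put each buffer in the bank attached to the thread's own processor."""
    return list(thread_nodes)


def total_cost(thread_nodes, buffer_nodes):
    return sum(access_cost(t, b) for t, b in zip(thread_nodes, buffer_nodes))


# Four threads: two on each dual-core processor.
threads = [0, 0, 1, 1]
print("naive placement:     ", total_cost(threads, place_naive(threads)))
print("NUMA-aware placement:", total_cost(threads, place_numa_aware(threads)))
```

With these assumed costs, the naive placement makes half the threads pay the remote penalty on every access, while the NUMA-aware placement keeps every access local.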
This leaves a slight question mark over AMD's strategy, thanks to Microsoft's product positioning of Vista. The benefits for NUMA of running Vista won't be such an issue for enthusiasts, who will probably be leaping onto the new OS as early adopters anyway. More worrying is the fact that Windows Vista Home Premium only supports a single physical processor, so you will have to go for the much more expensive Ultimate edition. In contrast, Intel's Core 2 Quad will be fine with Home Premium, because its four cores still count as one physical processor for Microsoft's licensing scheme.
However, other cost worries about Quad FX seem unfounded. AMD has told us that the platform will not be more expensive than Intel's quad-core. Since nForce 680i SLI and 680a SLI are such similar chipsets, there won't be much difference in motherboard pricing. More importantly, AMD explained that the new Quad FX processors will be retailing in packages of matched pairs, the most expensive of which (the FX-74) will correspond in price to Intel's QX6700. So for around the same money you could either buy one Intel processor with four cores, or get two AMD ones with a pair of cores each.
In other words, it's all down to the performance. Here, Quad FX should at least put in a better showing than the top Socket AM2 part, the Athlon 64 FX-62. The latter runs at 2.8GHz, but the flagship Quad FX Athlon 64 FX-74 pushes the clocks to 3GHz, a rise of about seven per cent which should give it a little more of an edge. There are cheaper FX-70 and FX-72 variants which run at 2.6GHz and 2.8GHz respectively. But all are still 90nm parts, not the 65nm versions which are now starting to trickle out.