Armari RM-O64-1AE HPC Opteron Server
A UK based vendor called Armari have been excellent in working with us so that we can cover Opteron. They're a long time supplier of the highest performing x86 computing available at any given time to the UK; if you think of performance x86 systems in the UK, Armari will be on any list of vendors you come up with. They've been an Opteron launch partner since day one, with a wide range of Opteron systems available, including 1U, 2U and 4U rack mounted SMP systems for server use, along with uniprocessor and SMP workstations for high end desktop use.
It's their RM-O64-1AE system that we were lucky enough to obtain for evaluation. It's a self-contained 1U, dual processor Opteron system with an IDE disk subsystem, configured mainly for use in HPC (high performance computing) clusters. These 1U rack units are designed for clustered environments: think of a 40U rack holding 40 dual Opteron RM-O64-1AE systems, connected to a NAS disk store, all functioning essentially as a single 80 processor computer. That's the normal use for these boxes; effectively, they're ultra high performance clustered mini supercomputers.
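To give a flavour of how such a cluster gets used, here's a minimal sketch of the sort of message-passing program these nodes run, written in C against the MPI API. It's illustrative only; the process count and launch command are our assumptions, not anything Armari ship.

    /* Minimal MPI sketch: the cluster runs one copy of this program per
       processor, and the copies cooperate as a single job. Build with an
       MPI compiler wrapper, e.g. mpicc hello.c -o hello, then launch
       across the rack with something like mpirun -np 80 ./hello */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);               /* join the parallel job  */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this copy's ID         */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* total copies in the job */

        printf("Process %d of %d reporting in\n", rank, size);

        MPI_Finalize();                       /* leave the parallel job */
        return 0;
    }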
Let's take a look at the RM-O64-1AE in more detail. Here's the spec.
[*] 2 x AMD Opteron 244 processors (1MB L2, 1.8GHz)
[*] Rioworks HDAMA motherboard
[---] Dual AMD® Opteron™ uPGA Socket 940 CPUs
[---] Full 6.4GB/sec HyperTransport connections
[---] 4 x 184-pin DDR DIMM slots per processor (8GB DDR333 maximum per CPU, 16GB system maximum)
[---] Supports registered ECC DDR memory only
[---] AMD® 8111™ [HyperTransport I/O Hub]
[---] AMD® 8131™ [PCI-X Tunnel]
[---] Dual Broadcom 5702 Gigabit Ethernet controllers
[---] Promise PDC20319 Serial ATA controller
[---] ATI RageXL with 8MB video memory
[---] 2 PCI-X (64bit/100MHz) slots
[---] 2 PCI (64bit/66MHz) slots
[---] 2 PCI (32bit/33MHz) slots
[*] 8GB registered ECC DDR333 memory, 4 x 1GB per processor
[*] Western Digital 40GB ATA100 7200rpm WD400JB
[*] Rioworks R3122 Slimline CDROM
[*] Rioworks R3122 Slimline FDD
As you can see, it's a fully featured dual processor server configured for use in HPC clustered environments. The 40GB IDE disk might seem like a limitation for such a highly specified computer, but given the target use, where the main data store isn't local to the individual nodes in the cluster, a fast 40GB drive holding the operating system and a small amount of local data is ideal.
The 8131 PCI-X tunnel provides two PCI-X segments on two separate bridges, connected to processor 2. The CPU-facing side of the HyperTransport tunnel on the 8131 is a full 16-bit link, while the 'B' side of the tunnel is only 8 bits wide, effectively halving the bandwidth. Each PCI-X bridge on the 8131 supports 133, 100 and 66MHz operation in PCI-X mode, and 66 and 33MHz in PCI mode, with either 32-bit or 64-bit PCI modes available (PCI-X is always 64-bit).
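The halving is simple arithmetic. Here's a quick sketch of the sums, assuming the 800MHz double-pumped link clock that launch Opterons run their HyperTransport links at:

    /* Back-of-the-envelope HyperTransport link bandwidth. */
    #include <stdio.h>

    int main(void)
    {
        const double clock_mhz = 800.0;           /* link clock       */
        const double mt_s      = clock_mhz * 2.0; /* DDR: 1600 MT/s   */

        /* GB/s per direction = link width in bytes * transfer rate */
        double gb16 = (16 / 8) * mt_s / 1000.0;   /* 16-bit link */
        double gb8  = (8 / 8)  * mt_s / 1000.0;   /*  8-bit link */

        printf("16-bit link: %.1f GB/s each way, %.1f GB/s aggregate\n",
               gb16, gb16 * 2);
        printf(" 8-bit link: %.1f GB/s each way, %.1f GB/s aggregate\n",
               gb8, gb8 * 2);
        return 0;
    }

The 16-bit figure is where the 6.4GB/sec in the spec list above comes from; the 8-bit 'B' side manages half that.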
The 8111 I/O hub connects to the 8-bit 'B' side of the 8131. The native bus width of the 8111 is 8 bits, so no bandwidth is lost. The 8111 supports an 8-device 32-bit, 33MHz PCI 2.2 bus and provides two 100Mbit Ethernet MACs, 6 USB 2.0 ports and the other peripheral bus interfaces usually found on the southbridge of a regular x86 PC system. Think of the 8111 as a southbridge in the traditional sense: it has no outbound HyperTransport connection and was specifically designed to sit on the far side of a high performance I/O ASIC like the 8131 or the 8151 AGP tunnel. Effectively, it terminates the HyperTransport bus it sits on.
The Broadcom controllers sit on the 'A' segment of the 8131 tunnel along with the 2 PCI-X slots provided on the board, with the RageXL and Promise Serial ATA controller sitting on the PCI bus provided by the 'B' segment of the 8131. All correctly partitioned for maximum performance.
Processor 1 has no direct connection to the I/O bridges; it talks to the 8131 (and the devices behind it) and the 8111 via its HyperTransport link to processor 2. Also, on some Opteron motherboards only one processor has DIMM slots attached, forcing the other CPU to reach all memory through its neighbour's controller, but that's not the case with the HDAMA. Each processor has 4GB of attached memory in the shipping configuration from Armari, so neither CPU is stuck paying the penalty of having all its memory hang off the other processor. A NUMA-aware operating system keeps that remote-access penalty as low as possible anyway, and the test operating system was NUMA aware, but the HDAMA's layout means it rarely has to be paid at all.
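For the curious, here's a minimal sketch of how a NUMA-aware program on Linux can make that memory placement explicit, using the libnuma API (link with -lnuma); the node numbers and allocation size are illustrative:

    /* Sketch of explicit NUMA placement via libnuma. A NUMA-aware OS
       tries to keep a process's memory on the node whose CPU runs it;
       libnuma lets a program ask for that placement directly. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <numa.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support in this kernel\n");
            return 1;
        }

        size_t sz = 64 * 1024 * 1024;            /* 64MB, illustrative */

        numa_run_on_node(0);                     /* pin to node 0      */
        void *local  = numa_alloc_onnode(sz, 0); /* memory on our node */
        void *remote = numa_alloc_onnode(sz, 1); /* memory one HT hop
                                                    away, on the other
                                                    CPU's controller   */

        /* touching 'local' avoids the hop; touching 'remote' pays it */

        numa_free(local, sz);
        numa_free(remote, sz);
        return 0;
    }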
Being an Opteron system, each CPU has to have a minimum of 2 DIMMs to satisfy the requirements of the memory controller; the on-die controller is 128 bits wide, so it's fed by matched pairs of 64-bit DIMMs.
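A quick sketch of what that 128-bit pairing buys you, working out peak rates for the DDR333 fitted here:

    /* Peak memory bandwidth from a 128-bit DDR333 controller, i.e. two
       64-bit channels per CPU, each filled by one DIMM of the pair. */
    #include <stdio.h>

    int main(void)
    {
        const double transfers = 333e6;         /* DDR333: 333 MT/s    */
        const double channel   = 8 * transfers; /* 64-bit = 8 bytes/MT */

        printf("per channel : %.1f GB/s\n", channel / 1e9);
        printf("per CPU     : %.1f GB/s (two channels)\n", 2 * channel / 1e9);
        printf("whole system: %.1f GB/s (two CPUs)\n", 4 * channel / 1e9);
        return 0;
    }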
All in all, it was one of the most powerful shipping Opteron systems you could buy at the time we had it for testing (the 244 was the fastest dual processor Opteron available, and 8GB of memory was more than enough for our purposes).
What does all that look like wrapped up in a 1U rackmount chassis?
Here's a small gallery of extra shots for the inquisitive:
DC blowers for CPU and memory cooling
CPU heatsinks with cardboard shrouds to aid airflow
Naked copper cooling
PhoenixBIOS showing 2 x 1.8GHz Opteron and 8GB of installed memory