LMBench Performance
The current release of LMBench, 3.0, measures 32-bit instruction latency, which essentially benchmarks the CPU and nothing else. While these tests are useful, they only show the performance of a small part of a complete system. I've consciously stayed away from writing this article that way, wishing instead to comment on Opteron as a platform, so I won't go into much depth in the commentary. Integer tests first.
With the Intel CPU's ALUs running at twice the core clock speed, I was expecting it to do a fair bit better than the Opteron in terms of raw instruction performance. That doesn't seem to be the case: even on the longest latency instruction, it is less than 10% faster than the Opteron. That's a 1.8GHz integer unit up against a 6.12GHz integer unit. Maybe I'm interpreting the data wrongly, or LMBench isn't set up to measure it properly, but it's pretty neck and neck.
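To put the two clock figures in perspective, here is a minimal sketch of the arithmetic, assuming latencies are reported in nanoseconds. The helper name `latency_cycles` and the 1.0 ns example latency are illustrative assumptions, not LMBench output; only the 1.8GHz and 6.12GHz clocks come from the text above.

```python
def latency_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert a latency in nanoseconds to clock cycles at the given clock.

    1 GHz is one cycle per nanosecond, so cycles = ns * GHz.
    """
    return latency_ns * clock_ghz


# Clock speeds quoted in the article.
OPTERON_GHZ = 1.8
XEON_ALU_GHZ = 6.12  # double-pumped ALU clock

# Hypothetical latency, chosen only for illustration (not a measured result).
EXAMPLE_LATENCY_NS = 1.0

for name, ghz in (("Opteron", OPTERON_GHZ), ("Xeon ALU", XEON_ALU_GHZ)):
    cycles = latency_cycles(EXAMPLE_LATENCY_NS, ghz)
    print(f"{name}: {EXAMPLE_LATENCY_NS} ns = {cycles:.2f} cycles at {ghz} GHz")
```

In other words, if both chips post the same wall-clock latency for an instruction, the Opteron is getting there in 3.4 times fewer cycles of its integer clock, which is one way to read a neck-and-neck result.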
The single precision float test is more of the same, but this time the obviously strong Opteron FPU takes home more cookies. Maybe LMBench is correct on the integer results after all? Double precision (64-bit) floating point now.
It's essentially the same graph as the float test. It's also important to note that even though we are operating on 64-bit data, the Opteron is doing it in 32-bit compatibility mode. We know the restrictions that mode places on the CPU (it becomes a 1MB K7), so it's no surprise that the graph is the same as the single precision float test.
On to the conclusions.