A client is considering options for a serious deployment of CPU-intensive work. The options being considered include cloud computing (Amazon EC2 [1]), virtual machines (Slicehost [2] and Linode [3]), and purchasing servers to install in racks at various locations. I can’t disclose the criteria that will determine when each of those three options will be used (I expect that we will end up using all of them). But my research on the prices of various servers will hopefully be useful to someone.
For the server vendor I chose Dell. I believe that HP offers slightly better quality hardware than Dell, but HP servers cost more and HP is more difficult to deal with (I can’t even get a price online). For this project I will be using a bunch of redundant servers (a similar concept to the Google server array), so I’m not going to be overly bothered about losing a server occasionally. Therefore the slight reliability benefit that HP offers does not make up for the expense.
Dell has some 1RU servers that have two CPU sockets and allow eight CPU cores. It seems that the best value that Dell offers for a server without RAID (the entire server is redundant) is a PowerEdge SC1435 that has two Opteron 2352 quad-core CPUs running at 2.1GHz, 4G of RAM, a 1TB SATA disk, and a Broadcom PCIe Gig-e card for $3,816.50. That machine gives an option of 2.3GHz CPUs for an extra $621.50. I am not sure that increasing the clock speed by almost 10% for a 16% increase in system price is a good idea.
The second best option was a PowerEdge 1950 III that has two Xeon E5420 2.5GHz quad-core CPUs with 12M of cache, 4G of RAM and a 1TB SATA disk for $4,302.30. The Intel option has 3 years of support included while the AMD option included 1 year of support and needed at least an extra $990 for 3 years of support. So it seems that if 3 years of support is desired then the Intel based server becomes significantly cheaper and is probably a better option.
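The arithmetic behind those comparisons can be sketched as follows. This is only a rough check using the prices quoted above; the variable and function names are my own, and it says nothing about how the Opteron and Xeon compare in real performance per clock.

```python
# Prices as quoted in the post; names are illustrative only.
AMD_BASE = 3816.50         # SC1435, 2 x Opteron 2352 at 2.1GHz, 1 year of support
AMD_FASTER_CPU = 621.50    # extra cost of the 2.3GHz CPU option
AMD_3YR_SUPPORT = 990.00   # minimum extra cost for 3 years of support
INTEL_BASE = 4302.30       # PowerEdge 1950 III, 2 x Xeon E5420, 3 years of support


def percent_increase(old, new):
    """Return the increase from old to new as a percentage of old."""
    return (new - old) / old * 100.0


# Clock speed gain vs price gain for the faster Opteron option.
clock_gain = percent_increase(2.1, 2.3)
price_gain = percent_increase(AMD_BASE, AMD_BASE + AMD_FASTER_CPU)

# Cost of each base system once 3 years of support is included.
amd_3yr = AMD_BASE + AMD_3YR_SUPPORT
intel_saving = amd_3yr - INTEL_BASE

print(f"Clock speed increase: {clock_gain:.1f}%")      # 9.5%
print(f"System price increase: {price_gain:.1f}%")     # 16.3%
print(f"AMD with 3 years of support: ${amd_3yr:.2f}")  # $4806.50
print(f"Intel is cheaper by: ${intel_saving:.2f}")     # $504.20
```

So with 3 years of support the Intel system comes out roughly $500 cheaper, before considering the faster clock.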
Dell’s 2RU and 4RU servers are of no interest if you want CPU performance. The 2RU servers only support two processors and the 4RU servers only support four processors. So it’s a ratio of 2 processors per RU for 1RU servers vs one processor per RU for 2RU and 4RU servers, and the 2RU and 4RU servers are a lot more expensive too.
I am investigating the Dell blade server. Blade servers are great for CPU density and good for management. The Dell blade enclosure M1000e takes 10RU of space and supports 16 half-height blades or 8 full-height blades. The Dell M905 full-height blade supports four AMD quad-core processors, for a total of 128 cores in 10RU. There are also half-height blades that support two quad-core processors, which gives the same CPU density.
So in terms of CPU density it’s an average of 12.8 cores per RU for the blade server vs 8 cores per RU for 1RU servers. While I haven’t got a complete price yet, it seems that four CPUs suitable for the M905 will cost about as much as four 1RU servers. So the 1RU systems are definitely better value for money than the blade server. The difference is the management cost. N servers that have two CPUs will be more work than N/2 servers that have four CPUs, but on the other hand blade servers require some specialised skills to run them (which I don’t have) and that might also cause problems. I don’t think that blades will be part of this project.
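The density figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes every CPU is quad-core, as in the configurations discussed; the function name and parameters are my own.

```python
CORES_PER_CPU = 4  # all configurations discussed use quad-core CPUs


def cores_per_ru(servers, cpus_per_server, rack_units):
    """Return CPU cores per rack unit for a group of identical servers."""
    return servers * cpus_per_server * CORES_PER_CPU / rack_units


one_ru = cores_per_ru(servers=1, cpus_per_server=2, rack_units=1)
two_ru = cores_per_ru(servers=1, cpus_per_server=2, rack_units=2)
four_ru = cores_per_ru(servers=1, cpus_per_server=4, rack_units=4)
# M1000e enclosure: 10RU holding 8 full-height or 16 half-height blades.
blade_full = cores_per_ru(servers=8, cpus_per_server=4, rack_units=10)
blade_half = cores_per_ru(servers=16, cpus_per_server=2, rack_units=10)

print(f"1RU server:         {one_ru} cores/RU")      # 8.0
print(f"2RU server:         {two_ru} cores/RU")      # 4.0
print(f"4RU server:         {four_ru} cores/RU")     # 4.0
print(f"Full-height blades: {blade_full} cores/RU")  # 12.8
print(f"Half-height blades: {blade_half} cores/RU")  # 12.8
```

In the same 10RU that an enclosure occupies, ten 1RU servers would provide 80 cores against 128 for the blades.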

“I am not sure that increasing the clock speed by almost 10% for a 16% increase in system price is a good idea.”
That probably depends on your operating costs and how long the project will run.
If you can wait just a while longer, you may want to consider the new Dell PowerEdge R710. It’s a 2RU Server with 2 Processor Sockets, and the new DRAC6 seems more useful in general and more Linux-friendly in particular than the DRAC5 which comes with e.g. the 1950III.
Sun have some fairly dense servers – the X4100M2 and X4140 have 2 quad core Opterons in 1RU, the X4150 has 2 quad core Xeons in 1RU, the X4440 has 4 quad core Opterons in 2RU, the X4450 has 4 quad core Xeons in 2RU and the X4600M2 has 8 quad core Opterons in 4RU. Prices are on the website, and you can also try up to 10 servers free for 60 days then buy at 20% off.
Are you really never hitting memory? If you *are* hitting memory, be sure to consider the memory bus available to each core. A frightful number of HPC applications see better performance with dual-core rather than quad-core. They tend to be somewhat synchronized and hit memory at about the same time.
Don’t underestimate the manageability advantages of blades. Having worked with blade servers and non-blade servers, the blade servers proved far easier to remotely fix when we broke them and to remotely debug our software on. (This occurred in the context of a lab full of machines on which we did kernel development and thus frequently installed new kernels.)
Some additional notes about specific hardware you mentioned:
Avoid Broadcom NICs like the plague. I’ve run into numerous reliability problems with them.
Consider the Intel option for more than just the 3-year warranty. First of all, as that warranty implies, Intel hardware often proves more reliable. I don’t just mean their processors; Intel’s boards, chipsets, and so on typically implement specifications more closely without cutting corners. (Random concrete example I discovered recently: the SATA spec supports port multipliers, but pretty much no on-board SATA chipsets work with them other than Intel’s.) Second, for the price difference you will get significantly more of a performance benefit than the clock-frequency difference suggests; the Core architecture really rocks. Third, power usage matters: you (or the people you colocate your server with) can’t always fill a rack with 1U servers due to limitations on how much power each server can draw.
Regarding prices, I can’t seem to figure out how you managed to get the prices you did; I tried configuring a system to match the specifications you gave for the Intel system, and it came out significantly lower, in the ~$2400 range. Mind posting a detailed list of all the specifications you used?
I, on the other hand, haven’t seen many problems with the onboard Broadcom NICs on PowerEdge servers, and we have a lot of them.
Avoid the SC1435s. They don’t have dual power supplies or the ability to add a DRAC (which may not matter if you are using serial console on all of them).
I prefer HP equipment for numerous reasons, and because we used to get competitive pricing from our supplier I would always buy them over Dell.
The only downside was having to assemble the HP servers ourselves.
James: Good idea, I will check out what Sun has to offer.
anon: Interesting idea. I’m really not sure how memory intensive the programs are. How would you measure this (apart from just benchmarking on a bunch of different CPUs)? Maybe I could test on a system that gives better memory performance when DIMMs are paired and see how unpaired memory affects the application performance.
Anonymous: You have a good point about management, but it seems that you needed it more than me. I’m working on a server appliance that will be redundant. If one breaks it can be removed and replaced.
Your point about expected reliability is more compelling for me. I’m happy to haul servers out of the DC before diagnosing the problem, but I don’t want to do it every day when I have many servers.
Good point about power usage. But it seems that the vendors don’t give adequate information to determine the power use. When comparing some of the Dell systems I can see that the SC1435 has a 600W PSU and the PowerEdge 1950 has a 670W PSU. So it seems likely that the SC1435 will use less power.
Regarding prices, I forgot to mention that I used Australian dollars… The difference between your result and mine seems to approximate the difference between US and Australian dollars with some shipping expenses thrown in.
anon: Dual PSUs don’t matter that much in this case. I’d rather have two complete servers than one server with two of each part…
packeteer: I agree that HP make good gear. I just find it difficult to deal with them. For small sites (one or two servers) I buy refurbished HP servers at auction: cheap, fast, and easy.
If HP really wants my business they will create a decent web site with some prices.
I can only add 2 cents here.
If the ability to manage and monitor the server at the hardware level matters, an HP solution is significantly ahead and very simple to install and administer. Whichever application monitoring solution you need, the SIM stuff pretty much integrates with most of them.
Get a proper HP account manager, either direct or via a partner, and depending on your needs, relationship, etc., HP can be very competitive. Dell is likely cheaper, but as you mentioned the HP kit is better quality. This then gets back to the life span of the application, warranty support, etc., if this is meaningful.