Tower Servers and Resizable BAR


A feature on modern PCIe implementations is “Resizable BAR”, AKA “REBAR”. Instead of a PCIe device getting only 256MB of address space through which its memory is mapped, the device can ask for more. With some hardware the limit is 4G, while other combinations of motherboard and expansion card support 64bit addressing, which allows the entire memory space of a GPU to be mapped in one region. Directly mapping all the memory will be faster no matter how things work, but a combination of algorithms optimised for a flat memory layout and overheads from remapping can cause 90% of performance to be lost without REBAR support. Some GPUs (or maybe the software driving them) will even refuse to work without it.
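One rough way to see how much of a card's memory is currently mapped is to look at the BAR sizes the kernel reports. The following is a minimal sketch, assuming a Linux system where each device has a /sys/bus/pci/devices/<address>/resource file with one "start end flags" line in hex per region and the first six lines being the BARs. The function names bar_sizes and largest_bar_mib are my own for illustration and are not part of any tool.

```python
from pathlib import Path


def bar_sizes(resource_text):
    """Return {bar_index: size_in_bytes} for the populated BARs.

    Assumes the sysfs "resource" layout: one "start end flags" line in
    hex per region, with the first six lines corresponding to BAR0-BAR5.
    """
    sizes = {}
    for index, line in enumerate(resource_text.splitlines()[:6]):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that doesn't look like a resource line
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused BAR slot
        sizes[index] = end - start + 1
    return sizes


def largest_bar_mib(resource_text):
    """Size of the largest populated BAR in MiB, or 0 if there is none."""
    sizes = bar_sizes(resource_text)
    return max(sizes.values(), default=0) // (1024 * 1024)


if __name__ == "__main__":
    # Print the largest BAR of every PCI device the kernel exposes.
    for dev in sorted(Path("/sys/bus/pci/devices").glob("*")):
        try:
            text = (dev / "resource").read_text()
        except OSError:
            continue
        print(dev.name, largest_bar_mib(text), "MiB")
```

If a GPU with many gigabytes of memory shows a largest BAR of only 256 MiB then the full memory is probably not mapped in one region. That is only a hint, it doesn't tell you whether the limitation is in the card, the BIOS, or the driver.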

I believe that almost all hardware supporting DDR4 will support REBAR at a hardware level, but in many cases the BIOS doesn’t support it. There are people who have reflashed a system BIOS to add REBAR support and there are options to use a modified UEFI boot loader to replace the code that is used for mapping the GPU memory.

The systems I like to use are server grade tower systems with registered ECC RAM. After a few years they become quite cheap and still give decent performance while supporting large amounts of RAM. But many such systems that could support REBAR don’t, presumably because the vendor doesn’t have a great interest in supporting new uses of old hardware.

Comparing the Name Brand Servers

The HP Z640 and Z840 systems I’m running date from 2014 and give good performance with replacement CPUs that are cheap on ebay, but they don’t support REBAR without a flashed BIOS. The next release of those HP systems is the HP Z6 and Z8 Gen 4 from 2017, which have BIOS support for enabling REBAR.

The Lenovo Thinkstation Px20 systems (P520, P920, etc) don’t support REBAR, which is especially disappointing as they were on sale from 2017 to 2022 and have decently fast CPUs. The replacements for the Px20 systems are the ones that are still on sale now. They seem likely to have REBAR support but won’t be affordable on ebay.

The Dell PowerEdge T440 and R740 systems (and presumably all their servers from 2017) don’t support REBAR. I found no Google hits about REBAR problems on the T550 and R750 systems from 2021, so presumably the lack of complaints means that Dell servers from that era support it. But the T350 servers are junk and only take slow CPUs, and the T550 systems are brutally expensive. The Precision 5520 systems don’t support it and newer Precision workstations will be expensive.

It seems that HP is best for this.

Which HP Workstation

The Z2 G4 only supports 64G of RAM so isn’t worth considering.

The Z4 G4 is low end and comes in two variants. The one with i5/i7/i9 CPUs doesn’t support ECC RAM so isn’t suitable for me, and that probably covers most Z4 G4 systems on the market. The upside is that apparently 2*6pin PCIe power cables are standard so any size GPU should work, and there are 8 DIMM slots supporting up to 512G of RAM. There are 3 options for PSU: 490W for 0 GPUs, 750W for 2 (small) GPUs, and 1000W for up to 4 GPUs.
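When filtering second hand listings it helps to turn those PSU options into a quick check. This is a minimal sketch that only encodes the three figures above. The names PSU_OPTIONS and smallest_psu are hypothetical, and it ignores the caveat that the 750W option is for small GPUs.

```python
# (watts, maximum number of GPUs) for the three Z4 G4 PSU options listed
# above; the 750W figure is for small GPUs, which this table doesn't model.
PSU_OPTIONS = [(490, 0), (750, 2), (1000, 4)]


def smallest_psu(gpu_count):
    """Return the smallest listed PSU wattage for gpu_count GPUs, or None."""
    if gpu_count < 0:
        raise ValueError("gpu_count must not be negative")
    for watts, max_gpus in PSU_OPTIONS:
        if gpu_count <= max_gpus:
            return watts
    return None  # more GPUs than any listed PSU option allows for


if __name__ == "__main__":
    for count in range(6):
        print(count, "GPUs ->", smallest_psu(count))
```

So a listing with the 490W PSU is only useful if I’m prepared to also replace the PSU.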

The Z6 G4 has an option for a second CPU that almost no-one selects, and providing for it reduces the space for RAM so there are only 6 DIMM slots. But as there is no option for a Z6 without ECC RAM every one on offer will be good.

The Z8 G4 is a nice dual socket system that I would not use for a serious GPU after my experience of my Z840 having a motherboard problem from a big GPU.

The Z4 G4 is going for about $500 on ebay with the 750W PSU, which is more than I want to pay but not a lot more. In 6 months they could be going for $350 or so. There are hardly any Z6 G4 systems on offer and they are all well over $1000 so I’m not considering them.

Conclusion

I need to poll the second hand sites for Z4 G4 systems and find one going cheap. One of those could be a good ML test machine for a while and then become a workstation once the faster CPUs (which are currently around $900) become cheap.
