Servers and Lockdown

OS security features and server class systems surely belong together. If a program is important enough to buy expensive servers to run it, then it’s important enough that you want to have all the OS security features enabled. For such an important program you will also want to have all possible monitoring systems running so you can predict hardware failures etc. Therefore you would expect that you could buy a server, set up the vendor’s management software, configure your Linux kernel with security features such as “lockdown” (an LSM that restricts access to /dev/mem, the iopl() system call, and other dangerous things [1]), and have it run nicely! You will be disappointed if you try doing that on an HP or Dell server though.
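Before blaming the vendor tools it helps to confirm which lockdown mode the kernel is actually in. Below is a minimal sketch that assumes securityfs is mounted at the usual /sys/kernel/security location and that the lockdown file has the documented format, with the active mode in square brackets (for example “none [integrity] confidentiality”). The function names are my own for illustration.

```python
from pathlib import Path

# Usual securityfs location on kernels built with the lockdown LSM (assumed mounted).
LOCKDOWN_PATH = Path("/sys/kernel/security/lockdown")


def parse_lockdown(text):
    """Return the active mode from text like 'none [integrity] confidentiality'."""
    for word in text.split():
        # The kernel marks the active mode by wrapping it in square brackets.
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_lockdown(path=LOCKDOWN_PATH):
    """Return the active lockdown mode, or None if it cannot be determined."""
    try:
        return parse_lockdown(Path(path).read_text())
    except OSError:
        # File absent (no lockdown LSM, securityfs not mounted) or unreadable.
        return None
```

A result of None only means the state could not be read, it does not prove that lockdown is disabled.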

HP Problems

[370742.622525] Lockdown: hpasmlited: raw io port access is restricted; see man kernel_lockdown.7

The above message is logged when trying to INSTALL (not even run) the hp-health package from the official HP repository (as documented in my previous blog post about the HP ML-110 Gen9 [2]) with “lockdown=integrity” (the less restrictive lockdown option). Now the HP package in question is in their repository for Debian/Stretch (released in 2017) and the Lockdown LSM was documented by LWN as being released in 2019. So not supporting a Debian/Bullseye feature in Debian/Stretch packages isn’t inherently a bad thing, apart from the fact that they haven’t released a new version of that package since. The Stretch package that I am testing now was released in 2019. Also, it has been regarded as best practice to have device drivers for this sort of thing since long before 2017.

# hplog -v

ERROR: Could not open /dev/cpqhealth/cdt.
Please make sure the Health Monitor is started.

Attempting to run the “hplog -v” command (to view the HP hardware log) gives the above error. Strace reveals that it could and did open /dev/cpqhealth/cdt but had problems talking to something (presumably the Health Monitor daemon) over a Unix domain socket. It would be nice if they could at least get the error message right!

Dell Problems

[   13.811165] Lockdown: smbios-sys-info: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
[   13.820935] Lockdown: smbios-sys-info: raw io port access is restricted; see man kernel_lockdown.7
[   18.118118] Lockdown: dchcfg: raw io port access is restricted; see man kernel_lockdown.7
[   18.127621] Lockdown: dchcfg: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
[   19.371391] Lockdown: dsm_sa_datamgrd: raw io port access is restricted; see man kernel_lockdown.7
[   19.382147] Lockdown: dsm_sa_datamgrd: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7

Above is a sample of the messages when booting a Dell PowerEdge R710 with “lockdown=integrity” with the srvadmin-omacore package installed from the official Dell repository that I describe in my blog post about the Dell PowerEdge R710 [3]. Now that repository is for Ubuntu/Xenial which was released in 2016, but again it was best practice to have device drivers for this many years ago. Also the newest Debian based releases that Dell apparently supports are Ubuntu/Xenial and Debian/Jessie, which were released in 2016 and 2015 respectively.
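With several daemons tripping over lockdown it is easier to see the pattern if the kernel messages are grouped by program. The following is a rough sketch that assumes the message format shown above (“Lockdown: PROGRAM: WHAT is restricted”); the regular expression and function name are mine, and other kernel versions may word the messages differently.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
# [   18.118118] Lockdown: dchcfg: raw io port access is restricted; see man kernel_lockdown.7
LOCKDOWN_RE = re.compile(r"Lockdown: (?P<prog>[^:]+): (?P<what>.+?) is restricted")


def summarize_lockdown(log_text):
    """Map each program name to the set of operations that lockdown denied it."""
    denied = defaultdict(set)
    for line in log_text.splitlines():
        match = LOCKDOWN_RE.search(line)
        if match:
            denied[match.group("prog")].add(match.group("what"))
    return dict(denied)


# Two of the lines quoted above, used as sample input.
sample = """\
[   18.118118] Lockdown: dchcfg: raw io port access is restricted; see man kernel_lockdown.7
[   18.127621] Lockdown: dchcfg: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
"""
for prog, ops in sorted(summarize_lockdown(sample).items()):
    print(prog, sorted(ops))
```

Feeding it the full output of dmesg instead of the sample gives one entry per offending program, which makes it obvious when every vendor daemon wants both raw IO ports and /dev/mem.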

# omreport system esmlog
Error! No Embedded System Management (ESM) log found on this system.

Above is the result when I try to view the ESM log (the Dell hardware log).

How Long Should Server Support Last?

The Wikipedia List of Dell PowerEdge Servers shows that the R710 is a Generation 11 system. Generation 11 was first released in 2010 and Generation 12 was first released in 2012. Generation 13 was the latest hardware Dell sold in 2015 when they apparently ceased providing newer OS support for Generation 11. Dell currently sells Generation 15 systems and provides more recent support for Generation 14 and Generation 15. I think it’s reasonable to debate whether Dell should support servers for 4 generations. But given that a major selling point of server class systems is that they have long term support, I think it would make sense to give better support for this and not drop support when it’s only 2 versions from the latest release! The support for Dell Generation 11 hardware only seems to have lasted for 3 years after Generation 12 was first released. Also it appears that software support for Dell Generation 13 ceased before Generation 14 was released. That sucks for the people who bought Generation 13 when they were new!

HP is currently selling “Gen 10” servers which were first released at the end of 2017. So it appears that HP stopped properly supporting Gen 9 servers as soon as Gen 10 servers were released!

One thing to note about these support times: when the new generation of hardware was officially released, the previous generation was still on sale. So while HP Gen 10 servers officially came out in 2017, that doesn’t necessarily mean that someone who wanted to buy an ML-110 Gen10 could actually have done so.

For comparison, Red Hat Enterprise Linux has been supported for 4-6 years for every release they made since 2005 and Ubuntu has always had 5 years of LTS support for servers.

How To Do It Properly

The correct way of interfacing with hardware is via a device driver that is supported in the kernel.org tree. That means it goes through the usual kernel source code quality checks, which are really good at finding bugs, and it gives users an assurance that the code won’t cause security problems. Generally nothing about the code from Dell or HP gives me confidence that it should be directly accessing /dev/kmem or raw IO ports without risk of problems.
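As a small illustration of what a kernel-provided interface looks like from user space, the kernel’s DMI support exports system identification strings under /sys/class/dmi/id, so reading them needs neither /dev/mem nor raw IO ports. This is only a sketch of the general approach: it covers identification strings and not the hardware logs discussed above, it is not what the vendor tools do, and the function name and choice of fields are my own.

```python
from pathlib import Path

# Directory where the kernel's DMI code exports identification strings.
DMI_DIR = Path("/sys/class/dmi/id")
# A few world-readable attributes; others (e.g. serial numbers) may need root.
FIELDS = ("sys_vendor", "product_name", "bios_version")


def read_dmi(base=DMI_DIR, fields=FIELDS):
    """Return a dict of DMI attributes read from sysfs, None where unavailable."""
    info = {}
    for name in fields:
        try:
            info[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Attribute missing on this system or not readable by this user.
            info[name] = None
    return info
```

Because the kernel does the privileged access and only exports plain text files, this sort of code keeps working regardless of the lockdown setting.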

Once a driver is in the kernel.org tree it will usually stay there forever and not require further effort from the people who submit it. Then it just works for everyone and tends to work with any other kernel features that people use, like LSMs.

If the vendors released the source code to the management programs then it would save them even more effort, as the programs could be maintained by the community.
