5

Wayland

The Wayland protocol [1] is designed to be more secure than X, when X was designed there wasn’t much thought given to the possibility of programs with different access levels displaying on the same desktop. The Xephyr nested X server [2] is good for running an entire session from a remote untrusted host on a local display but isn’t suitable for multiple applications in the same session.

GNOME supported Wayland by default in Debian since the Bullseye release and for KDE support you can install the plasma-workspace-wayland which gives you an option for the session type of KDE Plasma Wayland when you login. For systems which don’t use the KDE Plasma workspace but which have some KDE apps you should install the package qtwayland5 to allow the KDE apps to use the Wayland protocol. See the KDE page of the Debian Wiki [3] for more information.

The Debian Wiki page on Wayland has more useful information [4]. Apparently you have to use gdm instead of sddm to get Wayland for the login prompt.

To get screen sharing working on Wayland (and also to get a system that doesn’t give out error messages) you need to install the pipewire package (see the Pipewire project page for more information [6]).

Daniel Stone gave a great LCA talk about Wayland in 2013 [5].

I have just converted two of my systems to Wayland. It’s pretty uneventful, things seem to work the same way as before. It might be theoretically faster but in practice Xorg was fast enough that there’s not much possibility to appear faster. My aim is to work on Linux desktop security to try and get process isolation similar to what Android does on the PC desktop and on Debian based phones such as the Librem 5. Allowing some protection against graphics based attacks is only the first step towards that goal, but it’s an important step. More blog posts on related topics will follow.

Update: One thing I forgot to mention is that MAC systems need policy changes for Wayland. There are direct changes (allowing background daemons for GPU access to talk to a Wayland server running in a user context instead of an X server in a system context) and indirect changes (having the display server and window manager merged).

3

Some Ideas for Debian Security Improvements

Debian security is pretty good, but there’s always scope for improvement. Here are some ideas that I think could be used to improve things.

  1. A security “wizard”, basically a set of scripts with support for plugins that will investigate your system and look for things that can be improved. It could give suggestions on LSMs that could be used, sysctl settings, lists of daemons running as root that possibly don’t need root privs, etc. Plugins could be for different daemons, so there could be a plugin for Apache that looks for potential issues with Apache configuration. It wouldn’t be possible to cover everything, but it would be possible to cover many common cases.
    It appears that we used to have a “harden” package to do some of these things which disappeared. It appears that the only remnant of that is the hardening-runtime package.
  2. Kali Linux [1] is a distribution designed for penetration testing, I recently tried out many of it’s features and I was very impressed. While I don’t think that the aim should be to copy all Kali features into Debian there are probably some that are worthy of inclusion. Most Kali features run well in a VM, but the Wifi penetration testing tools need access to the hardware, so they could be a good candidate for inclusion in Debian (license permitting).
  3. We have a Securing Debian Manual [2] that is really good. It’s a little out of date and needs some contributions, it also needs to be better known.
  4. The Security Management page of the Debian wiki [3] has links to a number of pages about improving system security. It needs some updates, it doesn’t have a link to a page about SE Linux so there’s some work for me to do there.
  5. Can training help people? I would be happy to run some Debian SE Linux training sessions over Matrix or Jitsi. We can probably find people to offer training on other aspects of Linux security that are implemented in Debian if there is an audience. I don’t think that I and other DDs (Debian Developers) can train everyone, but we could train people who then go on to run other training sessions and make the session notes etc available under the GPL.
    There would also be some benefits to training other DDs as probably no-one has a good overview of all the security features that are supported.

Any other ideas? Feel free to comment here or start a thread on a public mailing list. If you start a mailing list discussion please email me or comment here with the URL if it’s a list that I’m not on so I can track it via the archives. This post was inspired by a discussion on a private list of a related topic. I think it’s better to have a public discussion instead.

Servers and Lockdown

OS security features and server class systems are things that surely belong together. If a program is important enough to buy expensive servers to run it then it’s important enough that you want to have all the OS security features enabled. For such an important program you will also want to have all possible monitoring systems running so you can predict hardware failures etc. Therefore you would expect that you could buy a server, setup the vendor’s management software, configure your Linux kernel with security features such as “lockdown” (a LSM that restricts access to /dev/mem, the iopl() system call, and other dangerous things [1]), and have it run nicely! You will be disappointed if you try doing that on a HP or Dell server though.

HP Problems

[370742.622525] Lockdown: hpasmlited: raw io port access is restricted; see man kernel_lockdown.7

The above message is logged when trying to INSTALL (not even run) the hp-health package from the official HP repository (as documented in my previous blog post about the HP ML-110 Gen9 [2]) with “lockdown=integrity” (the less restrictive lockdown option). Now the HP package in question is in their repository for Debian/Stretch (released in 2017) and the Lockdown LSM was documented by LWN as being released in 2019, so not supporting a Debian/Bullseye feature in Debian/Stretch packages isn’t inherently a bad thing apart from the fact that they haven’t released a new version of that package since. The Stretch package that I am testing now was released in 2019. Also it’s been regarded as best practice to have device drivers for this sort of thing since long before 2017.

# hplog -v

ERROR: Could not open /dev/cpqhealth/cdt.
Please make sure the Health Monitor is started.

Attempting to run the “hplog -v” command (to view the HP hardware log) gives the above error. Strace reveals that it could and did open /dev/cpqhealth/cdt but had problems talking to something (presumably the Health Monitor daemon) over a Unix domain socket. It would be nice if they could at least get the error message right!

Dell Problems

[   13.811165] Lockdown: smbios-sys-info: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
[   13.820935] Lockdown: smbios-sys-info: raw io port access is restricted; see man kernel_lockdown.7
[   18.118118] Lockdown: dchcfg: raw io port access is restricted; see man kernel_lockdown.7
[   18.127621] Lockdown: dchcfg: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
[   19.371391] Lockdown: dsm_sa_datamgrd: raw io port access is restricted; see man kernel_lockdown.7
[   19.382147] Lockdown: dsm_sa_datamgrd: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7

Above is a sample of the messages when booting a Dell PowerEdge R710 with “lockdown=integrity” with the srvadmin-omacore package installed from the official Dell repository I describe in my blog post about the Dell PowerEdge R710 [3]. Now that repository is for Ubuntu/Xenial which was released in 2015, but again it was best practice to have device drivers for this many years ago. Also the newest Debian based releases that Dell apparently supports are Ubuntu/Xenial and Debian/Jessie which were both released in 2015.

# omreport system esmlog
Error! No Embedded System Management (ESM) log found on this system.

Above is the result when I try to view the ESM log (the Dell hardware log).

How Long Should Server Support Last?

The Wikipedia List of Dell PowerEdge Servers shows that the R710 is a Generation 11 system. Generation 11 was first released in 2010 and Generation 12 was first released in 2012. Generation 13 was the latest hardware Dell sold in 2015 when they apparently ceased providing newer OS support for Generation 11. Dell currently sells Generation 15 systems and provides more recent support for Generation 14 and Generation 15. I think it’s reasonable to debate whether Dell should support servers for 4 generations. But given that a major selling point of server class systems is that they have long term support I think it would make sense to give better support for this and not drop support when it’s only 2 versions from the latest release! The support for Dell Generation 11 hardware only seems to have lasted for 3 years after Generation 12 was first released. Also it appears that software support for Dell Generation 13 ceased before Generation 14 was released, that sucks for the people who bought Generation 13 when they were new!

HP is currently selling “Gen 10” servers which were first released at the end of 2017. So it appears that HP stopped properly supporting Gen 9 servers as soon as Gen 10 servers were released!

One thing to note about these support times, when the new generation of hardware was officially released the previous generation was still on sale. So while HP Gen 10 servers officially came out in 2017 that doesn’t necessarily mean that someone who wanted to buy a ML-110 Gen10 could actually have done so.

For comparison Red Hat Enterprise Linux has been supported for 4-6 years for every release they made since 2005 and Ubuntu has always had a 5 year LTS support for servers.

How To Do It Properly

The correct way of interfacing with hardware is via a device driver that is supported in the kernel.org tree. That means it goes through the usual kernel source code quality checks which are really good at finding bugs and gives users an assurance that the code won’t cause security problems. Generally nothing about the code from Dell or HP gives me confidence that it should be directly accessing /dev/kmem or raw IO ports without risk of problems.

Once a driver is in the kernel.org tree it will usually stay there forever and not require further effort from the people who submit it. Then it just works for everyone and tends to work with any other kernel features that people use, like LSMs.

If they released the source code to the management programs then it would save them even more effort as they could be maintained by the community.

More EVM

This is another post about EVM/IMA which has it’s main purpose providing useful web search results for problems. However if reading it on a planet feed inspires someone to play with EVM/IMA then that’s good too, it’s interesting technology.

When using EVM/IMA in the Linux kernel if dmesg has errors like “op=appraise_data cause=missing-HMAC” the “missing-HMAC” means that the error code in the kernel source is INTEGRITY_NOLABEL which has a comment “No security.evm xattr“. You can check for the xattr on a file with the following command (this example has the security.evm xattr):

# getfattr -d -m - /etc/fstab 
getfattr: Removing leading '/' from absolute path names
# file: etc/fstab
security.evm=0sAwICqGOsfwCAvgE9y9OP74QxJ/I+3eOSF2n2dM51St98z/7LYHFd9rfGTvssvhTSYL9G8cTdRAH8ozggJu7VCzggW1REoTjnLcPeuMJsrMbW3DwVrB6ldDmJzyenLMjnIHmRDDeK309aRbLVn2ueJZ07aMDcSr+sxhOOAQ/GIW4SW8L1AKpKn4g=
security.ima=0sAT+Eivfxl+7FYI+Hr9K4sE6IieZ+
security.selinux="system_u:object_r:etc_t:s0"

If dmesg has errors like “op=appraise_data cause=invalid-HMAC” the “invalid-HMAC” means that the error code in the kernel source is INTEGRITY_FAIL which has a comment “Invalid HMAC/signature“.

These errors are from the evm_verifyxattr() function in Linux kernel 5.11.14.

The error “evm: HMAC key is not set” means that the evm key is not initialised, this means the key needs to be loaded into the kernel and EVM is initialised by the command “echo 1 > /sys/kernel/security/evm” (or possibly some equivalent from a utility like evmctl). When the key is loaded the kernel gives the message “evm: key initialized” and after that /sys/kernel/security/evm is read-only. If there is something wrong with the key the kernel gives the message “evm: key initialization failed“, it seems that the way to determine if your key is good is to try writing 1 to /sys/kernel/security/evm and see what happens. After that the command “cat /sys/kernel/security/evm” should return “3”.

The Gentoo wiki has good documentation on how to create and load the keys which has to be done before initialising EVM [1]. I’ll write more about that in another post.

IMA/EVM Certificates

I’ve been experimenting with IMA/EVM. Here is the Sourceforge page for the upstream project [1]. The aim of that project is to check hashes and maybe public key signatures on files before performing read/exec type operations on them. It can be used as the next logical step from booting a signed kernel with TPM. I am a long way from getting that sort of thing going, just getting the kernel to boot and load keys is my current challenge and isn’t helped due to the lack of documentation on error messages. This blog post started as a way of documenting the error messages so future people who google errors can get a useful result. I am not trying to document everything, just help people get through some of the first problems.

I am using Debian for my work, but some of this will apply to other distributions (particularly the kernel error messages). The Debian distribution has the ima-evm-utils but no other support for IMA/EVM. To get this going in Debian you need to compile your own kernel with IMA support and then boot it with kernel command-line options to enable IMA, in recent kernels that includes “lsm=integrity” as a mandatory requirement to prevent a kernel Oops after mounting the initrd (there is already a patch to fix this).

If you want to just use IMA (not get involved in development) then a good option would be to use RHEL (here is their documentation) [2] or SUSE (here is their documentation) [3]. Note that both RHEL and SUSE use older kernels so their documentation WILL lead you astray if you try and use the latest kernel.org kernel.

The Debian initrd

I created a script named /etc/initramfs-tools/hooks/keys with the following contents to copy the key(s) from /etc/keys to the initrd where the kernel will load it/them. The kernel configuration determines whether x509_evm.der or x509_ima.der (or maybe both) is loaded. I haven’t yet worked out which key is needed when.

#!/bin/bash

mkdir -p ${DESTDIR}/etc/keys
cp /etc/keys/* ${DESTDIR}/etc/keys

Making the Keys

#!/bin/sh

GENKEY=ima.genkey

cat << __EOF__ >$GENKEY
[ req ]
default_bits = 1024
distinguished_name = req_distinguished_name
prompt = no
string_mask = utf8only
x509_extensions = v3_usr

[ req_distinguished_name ]
O = `hostname`
CN = `whoami` signing key
emailAddress = `whoami`@`hostname`

[ v3_usr ]
basicConstraints=critical,CA:FALSE
#basicConstraints=CA:FALSE
keyUsage=digitalSignature
#keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid
#authorityKeyIdentifier=keyid,issuer
__EOF__

openssl req -new -nodes -utf8 -sha1 -days 365 -batch -config $GENKEY \
                -out csr_ima.pem -keyout privkey_ima.pem
openssl x509 -req -in csr_ima.pem -days 365 -extfile $GENKEY -extensions v3_usr \
                -CA ~/kern/linux-5.11.14/certs/signing_key.pem -CAkey ~/kern/linux-5.11.14/certs/signing_key.pem -CAcreateserial \
                -outform DER -out x509_evm.der

To get the below result I used the above script to generate a key, it is the /usr/share/doc/ima-evm-utils/examples/ima-genkey.sh script from the ima-evm-utils package but changed to use the key generated from kernel compilation to sign it. You can copy the files in the certs directory from one kernel build tree to another to have the same certificate and use the same initrd configuration. After generating the key I copied x509_evm.der to /etc/keys on the target host and built the initrd before rebooting.

[    1.050321] integrity: Loading X.509 certificate: /etc/keys/x509_evm.der
[    1.092560] integrity: Loaded X.509 cert 'xev: etbe signing key: 99d4fa9051e2c178017180df5fcc6e5dbd8bb606'

Errors

Here are some of the kernel error messages I received along with my best interpretation of what they mean.

[ 1.062031] integrity: Loading X.509 certificate: /etc/keys/x509_ima.der
[ 1.063689] integrity: Problem loading X.509 certificate -74

Error -74 means -EBADMSG, which means there’s something wrong with the certificate file. I have got that from /etc/keys/x509_ima.der not being in der format and I have got it from a der file that contained a key pair that wasn’t signed.

[    1.049170] integrity: Loading X.509 certificate: /etc/keys/x509_ima.der
[    1.093092] integrity: Problem loading X.509 certificate -126

Error -126 means -ENOKEY, so the key wasn’t in the file or the key wasn’t signed by the kernel signing key.

[    1.074759] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)

Error -2 means -ENOENT, so the file wasn’t found on the initrd. Note that it does NOT look at the root filesystem.

References

1

Yama

I’ve just setup the Yama LSM module on some of my Linux systems. Yama controls ptrace which is the debugging and tracing API for Unix systems. The aim is to prevent a compromised process from using ptrace to compromise other processes and cause more damage. In most cases a process which can ptrace another process which usually means having capability SYS_PTRACE (IE being root) or having the same UID as the target process can interfere with that process in other ways such as modifying it’s configuration and data files. But even so I think it has the potential for making things more difficult for attackers without making the system more difficult to use.

If you put “kernel.yama.ptrace_scope = 1” in sysctl.conf (or write “1” to /proc/sys/kernel/yama/ptrace_scope) then a user process can only trace it’s child processes. This means that “strace -p” and “gdb -p” will fail when run as non-root but apart from that everything else will work. Generally “strace -p” (tracing the system calls of another process) is of most use to the sysadmin who can do it as root. The command “gdb -p” and variants of it are commonly used by developers so yama wouldn’t be a good thing on a system that is primarily used for software development.

Another option is “kernel.yama.ptrace_scope = 3” which means no-one can trace and it can’t be disabled without a reboot. This could be a good option for production servers that have no need for software development. It wouldn’t work well for a small server where the sysadmin needs to debug everything, but when dozens or hundreds of servers have their configuration rolled out via a provisioning tool this would be a good setting to include.

See Documentation/admin-guide/LSM/Yama.rst in the kernel source for the details.

When running with capability SYS_PTRACE (IE root shell) you can ptrace anything else and if necessary disable Yama by writing “0” to /proc/sys/kernel/yama/ptrace_scope .

I am enabling mode 1 on all my systems because I think it will make things harder for attackers while not making things more difficult for me.

Also note that SE Linux restricts SYS_PTRACE and also restricts cross-domain ptrace access, so the combination with Yama makes things extra difficult for an attacker.

Yama is enabled in the Debian kernels by default so it’s very easy to setup for Debian users, just edit /etc/sysctl.d/whatever.conf and it will be enabled on boot.

4

Deleted Mapped Files

On a Linux system if you upgrade a shared object that is in use any programs that have it mapped will list it as “(deleted)” in the /proc/PID/maps file for the process in question. When you have a system tracking the stable branch of a distribution it’s expected that most times a shared object is upgraded it will be due to a security issue. When that happens the reasonable options are to either restart all programs that use the shared object or to compare the attack surface of such programs to the nature of the security issue. In most cases restarting all programs that use the shared object is by far the easiest and least inconvenient option.

Generally shared objects are used a lot in a typical Linux system, this can be good for performance (more cache efficiency and less RAM use) and is also good for security as buggy code can be replaced for the entire system by replacing a single shared object. Sometimes it’s obvious which processes will be using a shared object (EG your web server using a PHP shared object) but other times many processes that you don’t expect will use it.

I recently wrote “deleted-mapped.monitor” for my etbemon project [1]. This checks for shared objects that are mapped and deleted and gives separate warning messages for root and non-root processes. If you have the unattended-upgrades package installed then your system can install security updates without your interaction and then the monitoring system will inform you if things need to be restarted.

The Debian package debian-goodies has a program checkrestart that will tell you what commands to use to restart daemons that have deleted shared objects mapped.

Now to solve the problem of security updates on a Debian system you can use unattended-upgrades to apply updates, deleted-mapped.monitor in etbemon to inform you that programs need to be restarted, and checkrestart to tell you the commands you need to run to restart the daemons in question.

If anyone writes a blog post about how to do this on a non-Debian system please put the URL in a comment.

While writing the deleted-mapped.monitor I learned about the following common uses of deleted mapped files:

  • /memfd: is for memfd https://dvdhrm.wordpress.com/tag/memfd/ [2]
  • /[aio] is for asynchronous IO I guess, haven’t found good docs on it yet.
  • /home is used for a lot of harmless mapping and deleting.
  • /run/user is used for systemd dconf stuff.
  • /dev/zero is different for each map and thus looks deleted.
  • /tmp/ is used by Python (and probably other programs) creates temporary files there for mapping.
  • /var/lib is used for lots of temporary files.
  • /i915 is used by some X apps on systems with Intel video, I don’t know why.
4

Passwords Used by Daemons

There’s a lot of advice about how to create and manage user passwords, and some of it is even good. But there doesn’t seem to be much advice about passwords for daemons, scripts, and other system processes.

I’m writing this post with some rough ideas about the topic, please let me know if you have any better ideas. Also I’m considering passwords and keys in a fairly broad sense, a private key for a HTTPS certificate has more in common with a password to access another server than most other data that a server might use. This also applies to SSH host secret keys, keys that are in ssh authorized_keys files, and other services too.

Passwords in Memory

When SSL support for Apache was first released the standard practice was to have the SSL private key encrypted and require the sysadmin enter a password to start the daemon. This practice has mostly gone away, I would hope that would be due to people realising that it offers little value but it’s more likely that it’s just because it’s really annoying and doesn’t scale for cloud deployments.

If there was a benefit to having the password only in RAM (IE no readable file on disk) then there are options such as granting read access to the private key file only during startup. I have seen a web page recommending running “chmod 0” on the private key file after the daemon starts up.

I don’t believe that there is a real benefit to having a password only existing in RAM. Many exploits target the address space of the server process, Heartbleed is one well known bug that is still shipping in new products today which reads server memory for encryption keys. If you run a program that is vulnerable to Heartbleed then it’s SSL private key (and probably a lot of other application data) are vulnerable to attackers regardless of whether you needed to enter a password at daemon startup.

If you have an application or daemon that might need a password at any time then there’s usually no way of securely storing that password such that a compromise of that application or daemon can’t get the password. In theory you could have a proxy for the service in question which runs as a different user and manages the passwords.

Password Lifecycle

Ideally you would be able to replace passwords at any time. Any time a password is suspected to have been leaked then it should be replaced. That requires that you know where the password is used (both which applications and which configuration files used by those applications) and that you are able to change all programs that use it in a reasonable amount of time.

The first thing to do to achieve this is to have one password per application not one per use. For example if you have a database storing accounts used for a mail server then you would be tempted to have an outbound mail server such as Postfix and an IMAP server such as Dovecot both use the same password to access the database. The correct thing to do is to have one database account for the Dovecot and another for Postfix so if you need to change the password for one of them you don’t need to change passwords in two locations and restart two daemons at the same time. Another good option is to have Postfix talk to Dovecot for authenticating outbound mail, that means you only have a single configuration location for storing the password and also means that a security flaw in Postfix (or more likely a misconfiguration) couldn’t give access to the database server.

Passwords Used By Web Services

It’s very common to run web sites on Apache backed by database servers, so common that the acronym LAMP is widely used for Linux, Apache, Mysql, and PHP. In a typical LAMP installation you have multiple web sites running as the same user which by default can read each other’s configuration files. There are some solutions to this.

There is an Apache module mod_apparmor to use the Apparmor security system [1]. This allows changing to a specified Apparmor “hat” based on the URI or a specified hat for the virtual server. Each Apparmor hat is granted access to different files and therefore files that contain passwords for MySQL (or any other service) can be restricted on a per vhost basis. This only works with the prefork MPM.

There is also an Apache module mpm-itk which runs each vhost under a specified UID and GID [2]. This also allows protecting sites on the same server from each other. The ITK MPM is also based on the prefork MPM.

I’ve been thinking of writing a SE Linux MPM for Apache to do similar things. It would have to be based on prefork too. Maybe a change to mpm-itk to support SE Linux context as well as UID and GID.

Managing It All

Once the passwords are separated such that each service runs with minimum privileges you need to track and manage it all. At the simplest that needs a document listing where all of the passwords are used and how to change them. If you use a configuration management tool then that could manage the passwords. Here’s a list of tools to manage service passwords in tools like Ansible [3].

BTRFS and SE Linux

I’ve had problems with systems running SE Linux on BTRFS losing the XATTRs used for storing the SE Linux file labels after a power outage.

Here is the link to the patch that fixes this [1]. Thanks to Hans van Kranenburg and Holger Hoffstätte for the information about this patch which was already included in kernel 4.16.11. That was uploaded to Debian on the 27th of May and got into testing about the time that my message about this issue got to the SE Linux list (which was a couple of days before I sent it to the BTRFS developers).

The kernel from Debian/Stable still has the issue. So using a testing kernel might be a good option to deal with this problem at the moment.

Below is the information on reproducing this problem. It may be useful for people who want to reproduce similar problems. Also all sysadmins should know about “reboot -nffd”, if something really goes wrong with your kernel you may need to do that immediately to prevent corrupted data being written to your disks.

The command “reboot -nffd” (kernel reboot without flushing kernel buffers or writing status) when run on a BTRFS system with SE Linux will often result in /var/log/audit/audit.log being unlabeled. It also results in some systemd-journald files like /var/log/journal/c195779d29154ed8bcb4e8444c4a1728/system.journal being unlabeled but that is rarer. I think that the same
problem afflicts both systemd-journald and auditd but it’s a race condition that on my systems (both production and test) is more likely to affect auditd.

root@stretch:/# xattr -l /var/log/audit/audit.log 
security.selinux: 
0000   73 79 73 74 65 6D 5F 75 3A 6F 62 6A 65 63 74 5F    system_u:object_ 
0010   72 3A 61 75 64 69 74 64 5F 6C 6F 67 5F 74 3A 73    r:auditd_log_t:s 
0020   30 00                                              0.

SE Linux uses the xattr “security.selinux”, you can see what it’s doing with xattr(1) but generally using “ls -Z” is easiest.

If this issue just affected “reboot -nffd” then a solution might be to just not run that command. However this affects systems after a power outage.

I have reproduced this bug with kernel 4.9.0-6-amd64 (the latest security update for Debian/Stretch which is the latest supported release of Debian). I have also reproduced it in an identical manner with kernel 4.16.0-1-amd64 (the latest from Debian/Unstable). For testing I reproduced this with a 4G filesystem in a VM, but in production it has happened on BTRFS RAID-1 arrays, both SSD and HDD.

#!/bin/bash 
set -e 
COUNT=$(ps aux|grep [s]bin/auditd|wc -l) 
date 
if [ "$COUNT" = "1" ]; then 
 echo "all good" 
else 
 echo "failed" 
 exit 1 
fi

Firstly the above is the script /usr/local/sbin/testit, I test for auditd running because it aborts if the context on it’s log file is wrong. When SE Linux is in enforcing mode an incorrect/missing label on the audit.log file causes auditd to abort.

root@stretch:~# ls -liZ /var/log/audit/audit.log 
37952 -rw-------. 1 root root system_u:object_r:auditd_log_t:s0 4385230 Jun  1 
12:23 /var/log/audit/audit.log

Above is before I do the tests.

while ssh stretch /usr/local/sbin/testit ; do 
 ssh stretch "reboot -nffd" > /dev/null 2>&1 & 
 sleep 20 
done

Above is the shell code I run to do the tests. Note that the VM in question runs on SSD storage which is why it can consistently boot in less than 20 seconds.

Fri  1 Jun 12:26:13 UTC 2018 
all good 
Fri  1 Jun 12:26:33 UTC 2018 
failed

Above is the output from the shell code in question. After the first reboot it fails. The probability of failure on my test system is greater than 50%.

root@stretch:~# ls -liZ /var/log/audit/audit.log  
37952 -rw-------. 1 root root system_u:object_r:unlabeled_t:s0 4396803 Jun  1 12:26 /var/log/audit/audit.log

Now the result. Note that the Inode has not changed. I could understand a newly created file missing an xattr, but this is an existing file which shouldn’t have had it’s xattr changed. But somehow it gets corrupted.

The first possibility I considered was that SE Linux code might be at fault. I asked on the SE Linux mailing list (I haven’t been involved in SE Linux kernel code for about 15 years) and was informed that this isn’t likely at
all. There have been no problems like this reported with other filesystems.

2

Compromised Guest Account

Some of the workstations I run are sometimes used by multiple people. Having multiple people share an account is bad for security so having a guest account for guest access is convenient.

If a system doesn’t allow logins over the Internet then a strong password is not needed for the guest account.

If such a system later allows logins over the Internet then hostile parties can try to guess the password. This happens even if you don’t use the default port for ssh.

This recently happened to a system I run. The attacker logged in as guest, changed the password, and installed a cron job to run every minute and restart their blockchain mining program if it had been stopped.

In 2007 a bug was filed against the Debian package openssh-server requesting that the AllowUsers be added to the default /etc/ssh/sshd_config file [1]. If that bug hadn’t been marked as “wishlist” and left alone for 11 years then I would probably have set it to only allow ssh connections to the one account that I desired which always had a strong password.

I’ve been a sysadmin for about 25 years (since before ssh was invented). I have been a Debian Developer for almost 20 years, including working on security related code. The fact that I stuffed up in regard to this issue suggests that there are probably many other people making similar mistakes, and probably most of them aren’t monitoring things like system load average and temperature which can lead to the discovery of such attacks.