Process Monitoring

Since forking the Mon project to etbemon [1] I’ve been spending a lot of time working on the monitor scripts. Actually monitoring something is usually quite easy, deciding what to monitor tends to be the hard part. The process monitoring script ps.monitor is the one I’m about to redesign.

Here are some of my ideas for monitoring processes. Please comment if you have any suggestions for how to do things better.

For people who don’t use mon, the monitor scripts return 0 if everything is OK and 1 if there’s a problem along with using stdout to display an error message. While I’m not aware of anyone hooking mon scripts into a different monitoring system that’s going to be easy to do. One thing I plan to work on in the future is interoperability between mon and other systems such as Nagios.

Basic Monitoring

ps.monitor tor:1-1 master:1-2 auditd:1-1 cron:1-5 rsyslogd:1-1 dbus-daemon:1- sshd:1- watchdog:1-2

I’m currently planning some sort of rewrite of the process monitoring script. The current functionality is to have a list of process names on the command line with minimum and maximum numbers for the instances of the process in question. The above is a sample of the configuration of the monitor. There are some limitations to this, the “master” process in this instance refers to the main process of Postfix, but other daemons use the same process name (it’s one of those names that’s wrong because it’s so obvious). One obvious solution to this is to give the option of specifying the full path so that /usr/lib/postfix/sbin/master can be differentiated from all the other programs named master.
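
As a minimal sketch of the name:min-max convention described above, the following shell function checks a process count against one argument. The function name check_count and the message wording are made up for illustration; this is not the code of ps.monitor itself. It follows the mon convention of printing a message and returning 1 on a problem.

```shell
# check_count SPEC COUNT
# SPEC has the form name:min-max, where max may be empty (no upper limit).
# Prints a message and returns 1 when COUNT is out of range, else returns 0.
check_count() {
  spec=$1
  count=$2
  name=${spec%%:*}      # text before the first colon
  range=${spec#*:}      # text after the first colon
  min=${range%%-*}      # lower bound
  max=${range#*-}       # upper bound, empty means unlimited
  if [ "$count" -lt "$min" ]; then
    echo "$name: $count running, expected at least $min"
    return 1
  fi
  if [ -n "$max" ] && [ "$count" -gt "$max" ]; then
    echo "$name: $count running, expected at most $max"
    return 1
  fi
  return 0
}
```

On a system with procps it could be called as check_count tor:1-1 "$(pgrep -c -x tor)".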

The next issue is processes that may run on behalf of multiple users. With sshd there is a single process to accept new connections running as root and a process running under the UID of each logged in user. So the number of sshd processes running as root will be one greater than the number of root login sessions. This means that if a sysadmin logs in directly as root via ssh (which is controversial and not the topic of this post – merely something that people do which I have to support) and the master process then crashes (or the sysadmin stops it either accidentally or deliberately) there won’t be an alert about the missing process. Of course the correct thing to do is to have a monitor talk to port 22 and look for the string “SSH-2.0-OpenSSH_”. Sometimes there are multiple instances of a daemon running under different UIDs that need to be monitored separately. So obviously we need the ability to monitor processes by UID.
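
Counting processes per user could look something like the sketch below. It assumes the input is one "user command" pair per line, such as procps produces with ps -e -o user= -o comm=. The function name count_user_comm is hypothetical.

```shell
# count_user_comm USER COMM
# Reads "user command" pairs on stdin, one process per line, and prints how
# many lines match both the user and the command name exactly.
count_user_comm() {
  awk -v user="$1" -v comm="$2" \
    '$1 == user && $2 == comm { n++ } END { print n + 0 }'
}
```

For example ps -e -o user= -o comm= | count_user_comm root sshd gives the number of sshd processes running as root, which still has the problem of root login sessions described above. Note that ps may print long user names in a different form, so matching on numeric UID may be more robust.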

In many cases process monitoring can be replaced by monitoring of service ports. So if something is listening on port 25 then it probably means that the Postfix “master” process is running regardless of what other “master” processes there are. But for my use I find it handy to have multiple monitors, if I get a Jabber message about being unable to send mail to a server immediately followed by a Jabber message from that server saying that “master” isn’t running I don’t need to fully wake up to know where the problem is.

SE Linux

One feature that I want is monitoring SE Linux contexts of processes in the same way as monitoring UIDs. While I’m not interested in writing tests for other security systems I would be happy to include code that other people write. So whatever I do I want to make it flexible enough to work with multiple security systems.

Transient Processes

Most daemons have a second process of the same name running during the startup process. This means if you monitor for exactly 1 instance of a process you may get an alert about 2 processes running when “logrotate” or something similar restarts the daemon. Also you may get an alert about 0 instances if the check happens to run at exactly the wrong time during the restart. My current way of dealing with this on my servers is to not alert until the second failure event with the “alertafter 2” directive. The “failure_interval” directive allows specifying the time between checks when the monitor is in a failed state, setting that to a low value means that waiting for a second failure result doesn’t delay the notification much.

To deal with this I’ve been thinking of making the ps.monitor script automatically check again after a specified delay. I think that solving the problem with a single parameter to the monitor script is better than using 2 configuration directives to mon to work around it.
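
The idea of checking again after a delay can be sketched as a small wrapper. This is only an illustration of the approach, the name check_with_recheck is made up and a real implementation would live inside the monitor script.

```shell
# check_with_recheck DELAY COMMAND [ARGS...]
# Runs COMMAND; if it fails, waits DELAY seconds and runs it once more.
# The result of the second run decides the exit status, so a failure that
# only lasts for the duration of a daemon restart is not reported.
# Output of a failed first run is discarded.
check_with_recheck() {
  delay=$1
  shift
  if "$@" >/dev/null; then
    return 0
  fi
  sleep "$delay"
  "$@"
}
```

The delay needs to be longer than a typical daemon restart but short compared to the interval between checks.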


Mon currently has a loadavg.monitor script to check the load average. But that won’t catch the case of a single process using too much CPU time but not enough to raise the system load average. Also it won’t catch the case of a CPU hungry process going quiet (EG when the SETI at Home server goes down) while another process goes into an infinite loop. One way of addressing this would be to have the ps.monitor script have yet another configuration option to monitor CPU use, but this might get confusing. Another option would be to have a separate script that alerts on any process that uses more than a specified percentage of CPU time over its lifetime or over the last few seconds unless it’s in a whitelist of processes and users who are exempt from such checks. Probably every regular user would be exempt from such checks because you never know when they will run a file compression program. Also there would be a short list of daemons (like BOINC) and system processes (like gzip, which is run from several cron jobs) that are excluded.
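
A rough sketch of the separate-script option follows. It assumes input lines of the form "user pcpu command", as produced by ps -e -o user= -o pcpu= -o comm= (where procps reports pcpu over the lifetime of the process). The function name cpu_hogs and the single combined whitelist of users and commands are simplifications for illustration.

```shell
# cpu_hogs LIMIT WHITELIST
# Reads "user pcpu command" lines on stdin and prints every process whose
# CPU percentage is above LIMIT, unless its user or command name appears in
# WHITELIST (a space separated list of exempt names).
cpu_hogs() {
  awk -v limit="$1" -v exempt="$2" '
    BEGIN { n = split(exempt, names, " "); for (i = 1; i <= n; i++) skip[names[i]] = 1 }
    ($2 + 0) > (limit + 0) && !($1 in skip) && !($3 in skip) { print }
  '
}
```

A monitor could run ps -e -o user= -o pcpu= -o comm= | cpu_hogs 50 "boinc gzip" and alert if there is any output.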

Monitoring for Exclusion

A common programming mistake is to call setuid() before setgid() which means that the program doesn’t have permission to call setgid(). If return codes aren’t checked (and people who make such rookie mistakes tend not to check return codes) then the process keeps elevated permissions. Checking for processes running as GID 0 but not UID 0 would be handy. As an aside a quick examination of a Debian/Testing workstation didn’t show any obvious way that a process with GID 0 could gain elevated privileges, but that could change with one chmod 770 command.
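
The GID 0 check is simple to sketch. The function below assumes "uid gid pid command" lines as printed by ps -e -o uid= -o gid= -o pid= -o comm= (with procps these are the effective IDs). The name gid0_not_uid0 is made up.

```shell
# gid0_not_uid0
# Reads "uid gid pid command" lines on stdin and prints the processes that
# run with group ID 0 while their user ID is not 0.
gid0_not_uid0() {
  awk '$2 == 0 && $1 != 0 { print }'
}
```

Any output from ps -e -o uid= -o gid= -o pid= -o comm= | gid0_not_uid0 would be worth an alert.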

On a SE Linux system there should be only one process running with the domain init_t. Currently that doesn’t happen in Stretch systems running daemons such as mysqld and tor due to policy not matching the recent functionality of systemd as requested by daemon service files. Such issues will keep occurring so we need automated tests for them.
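
A test for the init_t case might look like the sketch below. It assumes "context pid command" lines as printed by ps -e -o label= -o pid= -o comm= and contexts of the usual user:role:type:level form. The function name count_domain is hypothetical.

```shell
# count_domain DOMAIN
# Reads "context pid command" lines on stdin and prints the number of
# processes whose SE Linux context has DOMAIN as its type field
# (the third colon separated field of user:role:type:level).
count_domain() {
  awk -v domain="$1" '{ split($1, f, ":"); if (f[3] == domain) n++ } END { print n + 0 }'
}
```

A result other than 1 from ps -e -o label= -o pid= -o comm= | count_domain init_t indicates a daemon running in the wrong domain.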

Automated tests for configuration errors that might impact system security is a bigger issue, I’ll probably write a separate blog post about it.

Converting Mbox to Maildir

MBox is the original and ancient format for storing mail on Unix systems, it consists of a single file per user under /var/spool/mail that has messages concatenated. Obviously performance is very poor when deleting messages from a large mail store as the entire file has to be rewritten. Maildir was invented for Qmail by Dan Bernstein and has a single message per file giving fast deletes among other performance benefits. An ongoing issue over the last 20 years has been converting Mbox systems to Maildir. The various ways of getting IMAP to work with Mbox only made this more complex.

The Dovecot Wiki has a good page about converting Mbox to Maildir [1]. If you want to keep the same message UIDs and the same path separation characters then it will be a complex task. But if you just want to copy a small number of Mbox accounts to an existing server then it’s a bit simpler.

Dovecot has a script to convert folders [2].

cd /var/spool/mail
mkdir -p /mailstore/
for U in * ; do
  ~/ -s $(pwd)/$U -d /mailstore/$U
done

To convert the inboxes shell code like the above is needed. If the users don’t have IMAP folders (EG they are just POP users or use local Unix MUAs) then that’s all you need to do.

cd /home
for DIR in */mail ; do
  U=$(echo $DIR| cut -f1 -d/)
  cd /home/$DIR
  for FOLDER in * ; do
    ~/ -s $(pwd)/$FOLDER -d /mailstore/$U/.$FOLDER
  done
  cp .subscriptions /mailstore/$U/subscriptions
done

Some shell code like the above will convert the IMAP folders to Maildir format. The end result is that the users will have to download all the mail again as their MUA will think that every message had been deleted and replaced. But as all servers with significant amounts of mail or important mail were probably converted to Maildir a decade ago this shouldn’t be a problem.

Observing Reliability

Last year I wrote about how great my latest Thinkpad is [1] in response to a discussion about whether a Thinkpad is still the “Rolls Royce” of laptops.

It was a few months after writing that post that I realised that I omitted an important point. After I had that laptop for about a year the DVD drive broke and made annoying clicking sounds all the time in addition to not working. I removed the DVD drive and the result was that the laptop was lighter and used less power without missing any feature that I desired. As I had installed Debian on that laptop by copying the hard drive from my previous laptop I had never used the DVD drive for any purpose. After a while I got used to my laptop being like that and the gaping hole in the side of the laptop where the DVD drive used to be didn’t even register to me. I would prefer it if Lenovo sold Thinkpads in the T series without DVD drives, but it seems that only the laptops with tiny screens are designed to lack DVD drives.

For my use of laptops this doesn’t change the conclusion of my previous post. Now the T420 has been in service for almost 4 years which makes the cost of ownership about $75 per year. $1.50 per week as a tax deductible business expense is very cheap for such a nice laptop. About a year ago I installed an SSD in that laptop, it cost me about $250 from memory and made it significantly faster while also reducing heat problems. The depreciation on the SSD about doubles the cost of ownership of the laptop, but it’s still cheaper than a mobile phone and thus not in the category of things that are expected to last for a long time – while also giving longer service than phones usually do.

One thing that’s interesting to consider is the fact that I forgot about the broken DVD drive when writing about this. I guess every review has an unspoken caveat of “this works well for me but might suck badly for your use case”. But I wonder how many other things that are noteworthy I’m forgetting to put in reviews because they just don’t impact my use. I don’t think that I am unusual in this regard, so reading multiple reviews is the sensible thing to do.

QEMU for ARM Processes

I’m currently doing some embedded work on ARM systems. Having a virtual ARM environment is of course helpful. For the i586 class embedded systems that I run it’s very easy to set up a virtual environment, I just have a chroot run from systemd-nspawn with the --personality=x86 option. I run it on my laptop for my own development and on a server my client owns so that they can deal with the “hit by a bus” scenario. I also occasionally run KVM virtual machines to test the boot image of i586 embedded systems (they use GRUB etc and are just like any other 32bit Intel system).

ARM systems have a different boot setup, there is a uBoot loader that is fairly tightly coupled with the kernel. ARM systems also tend to have more unusual hardware choices. While the i586 embedded systems I support turned out to work well with standard Debian kernels (even though the reference OS for the hardware has a custom kernel) the ARM systems need a special kernel. I spent a reasonable amount of time playing with QEMU and was unable to make it boot from a uBoot ARM image. The Google searches I performed didn’t turn up anything that helped me. If anyone has good references for getting QEMU to work for an ARM system image on an AMD64 platform then please let me know in the comments. While I am currently surviving without that facility it would be a handy thing to have if it was relatively easy to do (my client isn’t going to pay me to spend a week working on this and I’m not inclined to devote that much of my hobby time to it).

QEMU for Process Emulation

I’ve given up on emulating an entire system and now I’m using a chroot environment with systemd-nspawn.

The package qemu-user-static has statically linked programs for emulating various CPUs on a per-process basis. You can run this as “/usr/bin/qemu-arm-static ./statically-linked-arm-program”. The Debian package qemu-user-static uses the binfmt_misc support in the kernel to automatically run /usr/bin/qemu-arm-static when an ARM binary is executed. So if you have copied the image of an ARM system to /chroot/arm you can run commands like the following to enter the chroot:

cp /usr/bin/qemu-arm-static /chroot/arm/usr/bin/qemu-arm-static
chroot /chroot/arm bin/bash

Then you can create a full virtual environment with “/usr/bin/systemd-nspawn -D /chroot/arm” if you have systemd-container installed.
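
If the chroot fails with an exec format error it’s worth checking that the binfmt_misc entry is active. The following is a sketch of such a check. The kernel lists registered formats under /proc/sys/fs/binfmt_misc, but the entry name qemu-arm is an assumption about how the package registers itself, and the function name binfmt_enabled is made up.

```shell
# binfmt_enabled NAME [DIR]
# Succeeds if DIR/NAME exists and its first line is "enabled". DIR defaults
# to /proc/sys/fs/binfmt_misc, where the kernel lists registered formats.
binfmt_enabled() {
  entry=${2:-/proc/sys/fs/binfmt_misc}/$1
  [ -r "$entry" ] && [ "$(head -n 1 "$entry")" = "enabled" ]
}
```

For example binfmt_enabled qemu-arm || echo "ARM binaries will not be emulated".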

Selecting the CPU Type

There is a huge range of ARM CPUs with different capabilities. How this compares to the range of x86 and AMD64 CPUs depends on how you are counting (the i5 system I’m using now has 76 CPU capability flags). The default CPU type for qemu-arm-static is armv7l and I need to emulate a system with an armv5tejl. Setting the environment variable QEMU_CPU=pxa250 gives me armv5tel emulation.

The ARM Architecture Wikipedia page [2] says that in armv5tejl the T stands for Thumb instructions (which I don’t think Debian uses), the E stands for DSP enhancements (which probably isn’t relevant for me as I’m only doing integer maths), the J stands for supporting special Java instructions (which I definitely don’t need) and I’m still trying to work out what L means (comments appreciated).

So it seems clear that the armv5tel emulation provided by QEMU_CPU=pxa250 will do everything I need for building and testing ARM embedded software. The issue is how to enable it. For a user shell I can just put export QEMU_CPU=pxa250 in .login or something, but I want to emulate an entire system (cron jobs, ssh logins, etc).

I’ve filed Debian bug #870329 requesting a configuration file for this [1]. If I put such a configuration file in the chroot everything would work as desired.

To get things working in the meantime I wrote the below wrapper for /usr/bin/qemu-arm-static that calls /usr/bin/qemu-arm-static.orig (the renamed version of the original program). It’s ugly (I would use a config file if I needed to support more than one type of CPU) but it works.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

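int main(int argc, char **argv)
{
  /* Force the emulated CPU type before handing over to the real emulator. */
  if(setenv("QEMU_CPU", "pxa250", 1))
  {
    printf("Can't set $QEMU_CPU\n");
    return 1;
  }
  execv("/usr/bin/qemu-arm-static.orig", argv);
  /* execv() only returns on failure. */
  printf("Can't execute \"%s\" because of qemu failure\n", argv[0]);
  return 1;
}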

Running a Tor Relay

I previously wrote about running my SE Linux Play Machine over Tor [1] which involved configuring ssh to use Tor.

Since then I have installed a Tor hidden service for ssh on many systems I run for clients. The reason is that it is fairly common for them to allow a server to get a new IP address by DHCP or accidentally set their firewall to deny inbound connections. Without some sort of VPN this results in difficult phone calls talking non-technical people through the process of setting up a tunnel or discovering an IP address. While I can run my own VPN for them I don’t want their infrastructure tied to mine and they don’t want to pay for a 3rd party VPN service. Tor provides a free VPN service and works really well for this purpose.

As I believe in giving back to the community I decided to run my own Tor relay. I have no plans to ever run a Tor Exit Node because that involves more legal problems than I am willing or able to deal with. A good overview of how Tor works is the EFF page about it [2]. The main point of a “Middle Relay” (or just “Relay”) is that it only sends and receives encrypted data from other systems. As the Relay software (and the sysadmin if they choose to examine traffic) only sees encrypted data without any knowledge of the source or final destination the legal risk is negligible.

Running a Tor relay is quite easy to do. The Tor project has a document on running relays [3], which basically involves changing 4 lines in the torrc file and restarting Tor.

If you are running on Debian you should install the package tor-geoipdb to allow Tor to determine where connections come from (and to not whinge in the log files).

ORPort [IPV6ADDR]:9001

If you want to use IPv6 then you need a line like the above with IPV6ADDR replaced by the address you want to use. Currently Tor only supports IPv6 for connections between Tor servers and only for the data transfer not the directory services.

Data Transfer

I currently have 2 systems running as Tor relays, both of them are well connected in a European DC and they are each transferring about 10GB of data per day which isn’t a lot by server standards. I don’t know if there is a sufficient number of relays around the world that the share of the load is small or if there is some geographic dispersion algorithm which determined that there are too many relays in operation in that region.

Apache Mesos on Debian

I decided to try packaging Mesos for Debian/Stretch. I had a spare system with an i7-930 CPU, 48G of RAM, and SSDs to use for building. The i7-930 isn’t really fast by today’s standards, but 48G of RAM and SSD storage mean that overall it’s a decent build system – faster than most systems I run (for myself and for clients) and probably faster than most systems used by Debian Developers for build purposes.

There’s a github issue about the lack of an upstream package for Debian/Stretch [1]. That upstream issue could probably be worked around by adding Jessie sources to the APT sources.list file, but a package for Stretch is what is needed anyway.

Here is the documentation on building for Debian [2]. The list of packages it gives as build dependencies is incomplete, it also needs zlib1g-dev libapr1-dev libcurl4-nss-dev openjdk-8-jdk maven libsasl2-dev libsvn-dev. So BUILDING this software requires Java + Maven, Ruby, and Python along with autoconf, libtool, and all the usual Unix build tools. It also requires the FPM (Fucking Package Management) tool, I take the choice of name as an indication of the professionalism of the author.

Building the software on my i7 system took 79 minutes which includes 76 minutes of CPU time (I didn’t use the -j option to make). At the end of the build it turned out that I had mistakenly failed to install the Fucking Package Management “gem” and it aborted. At this stage I gave up on Mesos, the pain involved exceeds my interest in trying it out.

How to do it Better

One of the aims of Free Software is that bugs are more likely to get solved if many people look at them. There aren’t many people who will devote 76 minutes of CPU time on a moderately fast system to investigate a single bug. To deal with this software should be prepared as components. An example of this is the SE Linux project which has 13 source modules in the latest release [3]. Of those 13 only 5 are really required. So anyone who wants to start on SE Linux from source (without considering a distribution like Debian or Fedora that has it packaged) can build the 5 most important ones. Also anyone who has an issue with SE Linux on their system can find the one source package that is relevant and study it with a short compile time. As an aside I’ve been working on SE Linux since long before it was split into so many separate source packages and know the code well, but I still find the separation convenient – I rarely need to work on more than a small subset of the code at one time.

The requirement of Java, Ruby, and Python to build Mesos could be partly due to language interfaces to call Mesos interfaces from Ruby and Python. One solution to that is to have the C libraries and header files to call Mesos and have separate packages that depend on those libraries and headers to provide the bindings for other languages. Another solution is to have autoconf detect that some languages aren’t installed and just not try to compile bindings for them (this is one of the purposes of autoconf).

The use of a tool like Fucking Package Management means that you don’t get help from experts in the various distributions in making better packages. When there is a FOSS project with a debian subdirectory that makes barely functional packages then you will be likely to have an experienced Debian Developer offer a patch to improve it (I’ve offered patches for such things on many occasions). When there is a FOSS project that uses a tool that is never used by Debian developers (or developers of Fedora and other distributions) then the only patches you will get will be from inexperienced people.

A software build process should not download anything from the Internet. The source archive should contain everything that is needed and there should be dependencies for external software. Any downloads from the Internet need to be protected from MITM attacks which means that a responsible software developer has to read through the build system and make sure that appropriate PGP signature checks etc are performed. It could be that the files that the Mesos build downloaded from the Apache site had appropriate PGP checks performed – but it would take me extra time and effort to verify this and I can’t distribute software without being sure of this. Also reproducible builds are one of the latest things we aim for in the Debian project, this means we can’t just download files from web sites because the next build might get a different version.
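
If a build really has to fetch a file, one way of making sure that every build gets the same data is to pin a checksum in the source tree and refuse to continue on a mismatch. This is a weaker substitute for the PGP signature checks mentioned above, not an equivalent. The sketch below assumes the sha256sum program from GNU coreutils, and the function name verify_sha256 is made up.

```shell
# verify_sha256 FILE EXPECTED
# Succeeds only if the SHA-256 checksum of FILE equals EXPECTED, which
# should be a hex string stored in the source tree.
verify_sha256() {
  actual=$(sha256sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}
```

A build script would call it right after the download and abort if it fails.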

Finally the fpm (Fucking Package Management) tool is a Ruby Gem that has to be installed with the “gem install” command. Any time you specify a gem install command you should include the -v option to ensure that everyone is using the same version of that gem, otherwise there is no guarantee that people who follow your documentation will get the same results. Also a quick Google search didn’t indicate whether gem install checks PGP keys or verifies data integrity in other ways. If I’m going to compile software for other people to use I’m concerned about getting unexpected results with such things. A Google search indicates that Ruby people were worried about such things in 2013 but doesn’t indicate whether they solved the problem properly.

Forking Mon and DKIM with Mailing Lists

I have forked the “Mon” network/server monitoring system. Here is a link to the new project page [1]. There hasn’t been an upstream release since 2010 and I think we need more frequent releases than that. I plan to merge as many useful monitoring scripts as possible and support them well. All Perl scripts will use strict and use other best practices.

The first release of etbe-mon is essentially the same as the last release of the mon package in Debian. This is because I started work on the Debian package (almost all the systems I want to monitor run Debian) and as I had been accepted as a co-maintainer of the Debian package I put all my patches into Debian.

It’s probably not a common practice for someone to fork upstream of a package soon after becoming a comaintainer of the Debian package. But I believe that this is in the best interests of the users. I presume that there are other collections of patches out there and I hope to merge them so that everyone can get the benefits of features and bug fixes that have been separate due to a lack of upstream releases.

Last time I checked mon wasn’t in Fedora. I believe that mon has some unique features for simple monitoring that would be of benefit to Fedora users and would like to work with anyone who wants to maintain the package for Fedora. I am also interested in working with any other distributions of Linux and with non-Linux systems.

While setting up the mailing list for etbemon I wrote an article about DKIM and mailing lists (primarily Mailman) [2]. This explains how to set up Mailman for correct operation with DKIM and also why that seems to be the only viable option.

More KVM Modules Configuration

Last year I blogged about blacklisting a video driver so that KVM virtual machines didn’t go into graphics mode [1]. Now I’ve been working on some other things to make virtual machines run better.

I use the same initramfs for the physical hardware as for the virtual machines. So I need to remove modules that are needed for booting the physical hardware from the VMs as well as other modules that get dragged in by systemd and other things. One significant saving from this is that I use BTRFS for the physical machine and the BTRFS driver takes 1M of RAM!

The first thing I did to reduce the number of modules was to edit /etc/initramfs-tools/initramfs.conf and change “MODULES=most” to “MODULES=dep”. This significantly reduced the number of modules loaded and also stopped the initramfs from probing for a non-existent floppy drive which added about 20 seconds to the boot. Note that this will result in your initramfs not supporting different hardware. So if you plan to take a hard drive out of your desktop PC and install it in another PC this could be bad for you, but for servers it’s OK as that sort of upgrade is uncommon for servers and only done with some planning (such as creating an initramfs just for the migration).
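
For doing the same change on several machines, the edit can be scripted. The following is a sketch only, the function name set_modules_dep and the .orig backup name are made up.

```shell
# set_modules_dep FILE
# Rewrites a line of the form MODULES=<anything> in FILE to MODULES=dep,
# keeping a copy of the original as FILE.orig.
set_modules_dep() {
  cp "$1" "$1.orig" &&
  sed 's/^MODULES=.*/MODULES=dep/' "$1.orig" > "$1"
}
```

After set_modules_dep /etc/initramfs-tools/initramfs.conf the initramfs has to be rebuilt, on Debian with update-initramfs -u.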

I put the following rmmod commands in /etc/rc.local to remove modules that are automatically loaded:
rmmod btrfs
rmmod evdev
rmmod lrw
rmmod glue_helper
rmmod ablk_helper
rmmod aes_x86_64
rmmod ecb
rmmod xor
rmmod raid6_pq
rmmod cryptd
rmmod gf128mul
rmmod ata_generic
rmmod ata_piix
rmmod i2c_piix4
rmmod libata
rmmod scsi_mod
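
If rc.local is run with sh -e (which I believe is the case for the default Debian file) then a single rmmod failing on a module that isn’t loaded would abort the rest of the script. A sketch of a more defensive version follows. The function names are made up, and the order of the arguments still matters because a module can’t be removed while another module depends on it.

```shell
# is_loaded MODULE [FILE]
# Succeeds if MODULE is the first field of some line in FILE, which defaults
# to /proc/modules (one loaded module per line, name first).
is_loaded() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# remove_if_loaded MODULE...
# Calls rmmod only for modules that are currently loaded, in the order given.
remove_if_loaded() {
  for m in "$@"; do
    if is_loaded "$m"; then
      rmmod "$m"
    fi
  done
}
```

For example remove_if_loaded btrfs xor raid6_pq would replace three of the lines above.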

In /etc/modprobe.d/blacklist.conf I have the following lines to stop drivers being loaded. The first line is to stop the video mode being set and the rest are just to save space. One thing that inspired me to do this is that the parallel port driver gave a kernel error when it loaded and tried to access non-existent hardware.
blacklist bochs_drm
blacklist joydev
blacklist ppdev
blacklist sg
blacklist psmouse
blacklist pcspkr
blacklist sr_mod
blacklist acpi_cpufreq
blacklist cdrom
blacklist tpm
blacklist tpm_tis
blacklist floppy
blacklist parport_pc
blacklist serio_raw
blacklist button

On the physical machine I have the following in /etc/modprobe.d/blacklist.conf. Most of this is to prevent loading of filesystem drivers when making an initramfs. I do this because I know there’s never going to be any need for CDs, parallel devices, graphics, or strange block devices in a server room. I wouldn’t do any of this for a desktop workstation or laptop.
blacklist ppdev
blacklist parport_pc
blacklist cdrom
blacklist sr_mod
blacklist nouveau

blacklist ufs
blacklist qnx4
blacklist hfsplus
blacklist hfs
blacklist minix
blacklist ntfs
blacklist jfs
blacklist xfs

SE Linux in Debian/Stretch

Debian/Stretch has been frozen. Before the freeze I got almost all the bugs in policy fixed, both bugs reported in the Debian BTS and bugs that I know about. This is going to be one of the best Debian releases for SE Linux ever.

Systemd with SE Linux is working nicely. The support isn’t as good as I would like, there is still work to be done for systemd-nspawn. But it’s close enough that anyone who needs to use it can use audit2allow to generate the extra rules needed. Systemd-nspawn is not used by default and it’s not something that a new Linux user is going to use, I think that expert users who are capable of using such features are capable of doing the extra work to get them going.

In terms of systemd-nspawn and some other rough edges, the issue is the difference between writing policy for a single system vs writing policy that works for everyone. If you write policy for your own system you can allow access for a corner case without a lot of effort. But if I wrote policy to allow access for every corner case then they might add up to a combination that can be exploited. I don’t recommend blindly adding the output of audit2allow to your local policy (be particularly wary of access to shadow_t and write access to etc_t, lib_t, etc). But OTOH if you have a system that’s running in enforcing mode that happens to have one daemon with more access than is ideal then all the other daemons will still be restricted.
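
One way of not adding audit2allow output blindly is to filter it for the rules that most deserve a second look. The sketch below assumes the usual “allow source target:class permissions;” rule format. The function name risky_rules is made up, and the list of types and the check for only the write permission are deliberately minimal.

```shell
# risky_rules
# Reads audit2allow style "allow" rules on stdin and prints the ones that
# touch shadow_t at all, or that grant write access to etc_t or lib_t.
risky_rules() {
  awk '
    /^allow / && / shadow_t:/ { print; next }
    /^allow / && / (etc_t|lib_t):/ && /[{ ]write[ ;}]/ { print }
  '
}
```

Running audit2allow -a | risky_rules before building a local module shows the rules to think about. An empty result doesn’t mean the remaining rules are safe.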

As for previous releases I plan to keep releasing updates to policy packages in my own apt repository. I’m also considering releasing policy source to updates that can be applied on existing Stretch systems. So if you want to run the official Debian packages but need updates that came after Stretch then you can get them. Suggestions on how to distribute such policy source are welcome.

Please enjoy SE Linux on Stretch. It’s too late for most bug reports regarding Stretch as most of them won’t be sufficiently important to justify a Stretch update. The vast majority of SE Linux policy bugs are issues of denying wanted access not permitting unwanted access (so not a security issue) and can be easily fixed by local configuration, so it’s really difficult to make a case for an update to Stable. But feel free to send bug reports for Buster (Stretch+1).

Video Mode and KVM

I recently changed my KVM servers to use the kernel command-line parameter nomodeset for the virtual machine kernels so that they don’t try to go into graphics mode. I do this because I don’t have X11 or VNC enabled and I want a text console to use with the -curses option of KVM. Without nomodeset, KVM just says that it’s in 1024*768 graphics mode and doesn’t display the text.

Now my KVM server running Debian/Unstable has had its virtual machines start going into graphics mode in spite of the nomodeset parameter. It seems that an update to QEMU has added a new virtual display driver which recent kernels from Debian/Unstable support with the bochs_drm driver, and that driver apparently doesn’t respect nomodeset.

The solution is to create a file named /etc/modprobe.d/blacklist.conf with the contents “blacklist bochs_drm” and now my virtual machines have a usable plain-text console again! This blacklist method works for all video drivers, you can blacklist similar modules for the other virtual display hardware. But it would be nice if the one kernel option would cover them all.
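
When setting this up on many virtual machines it helps to make the change repeatable. The following sketch appends the blacklist line only if it isn’t already there. The function name blacklist_module is made up.

```shell
# blacklist_module MODULE [FILE]
# Appends "blacklist MODULE" to FILE unless that exact line is already
# present. FILE defaults to /etc/modprobe.d/blacklist.conf.
blacklist_module() {
  file=${2:-/etc/modprobe.d/blacklist.conf}
  if ! grep -qxF "blacklist $1" "$file" 2>/dev/null; then
    echo "blacklist $1" >> "$file"
  fi
}
```

Running blacklist_module bochs_drm twice leaves a single line in the file.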