
Security Flaws in Free Software

I just wrote about the system administration issues related to the recent Debian SSL/SSH security flaw [1]. The next thing we need to consider is how we can change things to reduce the incidence of such problems.

The problem we just had was caused by the most important part of the entropy supply for the random number generator being disabled, due to a mistake made while commenting out some code. The only entropy that remained was the PID of the process using the SSL library code, which gives us 15 bits of entropy. It seems to me that if we had zero bits of entropy the problem would have been discovered a lot sooner (almost certainly before the code was released in a stable version). Therefore it seems that using a second-rate source of entropy (which was never required) masked the fact that the primary source of entropy was not working. Would it make sense to have a practice of not using such second-rate sources of entropy, to reduce the risk of such problems going undetected for any length of time? Is this a general issue or just a corner case?

Joss makes some suggestions for process improvements [2]. He suggests that having a single .diff.gz file (the traditional method for maintaining Debian packages) that directly contains all patches can obscure some patches. The other extreme is a patch management system with several dozen small patches, where the reader has to try to discover what each of them does. For an example of this see the 43 patches which are included in the Debian PAM package for Etch. Note that the PAM system is comprised of many separate shared objects (modules), so the patching system lends itself to having one patch per module; thus 43 patches for PAM isn’t as difficult to manage as 43 patches would be for a complex package that is not comprised of multiple separate components. That said, I think that there is some potential for separating out patches. Having a distinction between different types of patches might help. For example we could have a patch for Makefiles etc (including autoconf etc), a patch for adding features, and a patch for fixing bugs. Then people reviewing the source for potential bugs could pay a lot of attention to bug fixes, a moderate amount of attention to new features, and casually skim the Makefile changes.

The problem began with this mailing list discussion [3]. Kurt’s first message starts with “When debbuging applications” and ends with “What do you people think about removing those 2 lines of code?”. The reply he received from Ulf (a member of the OpenSSL development team) is “If it helps with debugging, I’m in favor of removing them”. It seems to me that there might have been a miscommunication there; Ulf may have believed that the discussion only concerned a debugging build and not a build that would eventually end up on millions of machines.

It seems possible that the reaction would have been different if Kurt had mentioned that he wanted to have a single source tree for both debugging and regular use. It also seems likely that his proposed change would have received more inspection if he had clearly stated that he was going to include it in Debian, where it would be used by millions of people. When I am doing Debian development I generally don’t mention all the time “this code will be used by millions of people so it’s important that we get it right”, although I do sometimes make such statements if I feel that my questions are not getting the amount of consideration from upstream that a binary package destined for use by millions of people deserves. Maybe it would be a good practice to clarify such things in the case of important packages. For a package that is relevant to the security of the entire distribution (and possibly to other machines around the net – as demonstrated in this case) it doesn’t seem unreasonable to include a post-script mentioning the scope of the code use (it could be done with an alternate SIG if using a MUA that allows selecting from multiple SIGs in a convenient manner).

In the response from the OpenSSL upstream [4] it is claimed that the mailing list used was not the correct one. Branden points out that the openssl-team mailing list address seems not to be documented anywhere [5]. One thing to be learned from this is that distribution developers need to be proactive in making contact with upstream developers. You might think that building packages for a major distribution and asking questions about it on the mailing list would result in someone from the team noticing and mentioning any other things that you might need to do. But maybe it would make sense to send private mail to one of the core developers, introduce yourself, and ask for advice on the best way to manage communication to avoid this type of confusion.

I think that it is ideal for distribution developers to have the phone numbers of some of the upstream developers. If the upstream work is sponsored by the employer of one of the upstream developers then it seems reasonable to ask for their office phone number. Sometimes it’s easier to sort things out by phone than by email.

Gunnar Wolf describes how the way this bug was discovered and handled shows that the Debian processes work [6]. A similar bug in proprietary software would probably not be discovered nearly as quickly and would almost certainly not be fixed in such a responsible manner.

Update: According to the OpenSSL project about page [7], Ulf is actually not a “core” member, just a team member. I had used the term “core” in a slang manner based on the fact that Ulf has an official @openssl.org email address.


Debian SSH Problems

It has recently been announced that Debian had a serious bug in the OpenSSL code [1]. The most visible effect of this is compromised SSH keys – but it can also affect VPN and HTTPS keys. Erich Schubert was one of the first people to point out the true horror of the problem: only 2^15 different keys can be created [2]. It should not be difficult for an attacker to generate 2^15 host keys and try all combinations for decrypting a login session. It should also be possible to make up to 2^15 attempts to log in to a server remotely if an attacker believes that a vulnerable authorized key was being used – that would take less than an hour at a rate of 10 attempts per second (which is possible with modern net connections) and could be done in a day if the server was connected to the net by a modem.
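The arithmetic behind that claim is easy to check with a quick shell sketch:

```shell
# 15 bits of entropy means only 2^15 possible keys.
keys=$((1 << 15))
echo "$keys possible keys"    # 32768
# At 10 login attempts per second, trying every key takes under an hour:
echo "$((keys / 10 / 60)) minutes to try them all"    # 54
```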

John Goerzen has some insightful thoughts about the issue [3]. I recommend reading his post. One point he makes is that the person who made the mistake in question should not be lynched. One thing I think we should keep in mind is the fact that people tend to be more careful after they have made mistakes, I expect that anyone who makes a mistake in such a public way which impacts so many people will be very careful for a long time…

Steinar H. Gunderson analyses the maths in relation to DSA keys, it seems that if a DSA key is ever used with a bad RNG then it can be cracked by someone who sniffs the network [4]. It seems that it is safest to just not use DSA to avoid this risk. Another issue is that if a client supports multiple host keys (ssh version 2 can use three different key types, one for the ssh1 protocol, one for ssh2 with RSA, and one for ssh2 with DSA) then a man in the middle attack can be implemented by forcing a client to use a different key type – see Stealth’s article in Phrack for the details [5]. So it seems that we should remove support for anything other than SSHv2 with RSA keys.

To remove such support from the ssh server edit /etc/ssh/sshd_config and make sure it has a line with “Protocol 2”, and that the only HostKey line references an RSA key. To remove it from the ssh client (the important thing) edit /etc/ssh/ssh_config and make sure that it has something like the following:

Host *
Protocol 2
HostKeyAlgorithms ssh-rsa
ForwardX11 no
ForwardX11Trusted no

You can override this for different machines. So if you have a machine that uses DSA only then it would be easy to add a section:

Host strange-machine
Protocol 2
HostKeyAlgorithms ssh-dss

So changing the default configuration of the ssh client on all machines you manage in this way has the potential to dramatically reduce the incidence of MITM attacks on the less knowledgeable users.

When skilled users who do not have root access need to change things they can always edit the file ~/.ssh/config (which has the same syntax as /etc/ssh/ssh_config) or they can use command-line options to override it. The command ssh -o “HostKeyAlgorithms ssh-dss” user@server will force the use of a DSA host key even if the configuration file requests RSA.

Enrico Zini describes how to use ssh-keygen to get the fingerprint of the host key [6]. One thing I have learned from comments on this post is how to get a fingerprint from a known hosts file. A common situation is that machine A has a known hosts file with an entry for machine B. I want to get the right key in machine C and there is no way of directly communicating between machine A and machine C (EG they are in different locations with no network access). In that situation the command “ssh-keygen -l -f ~/.ssh/known_hosts” can be used to display all the fingerprints of hosts that you have connected to in the past, then it’s a simple matter of grepping the output.
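As a sketch (the host name machine-b is made up for the example), the procedure on machine A looks like this:

```shell
# List the fingerprints of all hosts recorded in the known_hosts file,
# then pick out the entry for the host we want to verify on machine C:
ssh-keygen -l -f ~/.ssh/known_hosts | grep machine-b
```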

Docunext has an interesting post about ways of mitigating such problems [7]. One thing that they suggest is using fail2ban to block IP addresses that appear to be trying brute-force attacks. It’s unfortunate that the version of fail2ban in Debian uses /tmp/fail2ban.sock as its Unix domain socket for talking to the server (the version in Unstable uses /var/run/fail2ban/fail2ban.sock). They also mention patching network drivers to add entropy to the kernel random number generator. One thing that seems interesting is the package randomsound (currently in Debian/Unstable) which takes ALSA sound input as a source of entropy; note that you don’t need to have any sound input device connected.

When considering fail2ban and similar things, it’s probably best to start by restricting the number of machines which can connect to your SSH server. Firstly if you put it on a non-default port then it’ll take some brute-force to find it. This will waste some of the attacker’s time and also make the less persistent attackers go elsewhere. One thing that I am considering is having a few unused ports configured such that any IP address which connects to them gets added to my NetFilter configuration – if you connect to such ports then you can’t connect to any other ports for a week (or until the list becomes too full). So if for example I had port N configured in such a manner and port N+100 used for ssh listening then it’s likely that someone who port-scans my server would be blocked before they even discovered the SSH server. Does anyone know of free software to do this?
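I haven’t found a packaged solution, but the NetFilter “recent” module can implement a rough version of the idea. A sketch follows – the trap port 2000 and the one-week timeout are my arbitrary choices, and note that the module’s address list holds a limited number of entries by default (which conveniently matches the “until the list becomes too full” behaviour):

```
# Drop everything from any address that hit the trap port within the last week
iptables -A INPUT -m recent --name trapped --rcheck --seconds 604800 -j DROP
# A connection attempt to the trap port records the source address and is dropped
iptables -A INPUT -p tcp --dport 2000 -m recent --name trapped --set -j DROP
```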

The next thing to consider is which IP addresses may connect. If you were to allow all the IP addresses from all the major ISPs in your country to connect to your server then it would still be a small fraction of the IP address space. Sure attackers could use machines that they already cracked in your country to launch their attacks, but they would have to guess that you had such a defense in place, and even so it would be an inconvenience for them. You don’t necessarily need to have a perfect defense, you only need to make the effort to reward ratio be worse for attacking you than for attacking someone else. Note that I am not advocating taking a minimalist approach to security, merely noting that even a small increment in the strength of your defenses can make a significant difference to the risk you face.

Update: based on comments I’m now considering knockd to open ports on demand. The upstream site for knockd is here [8], and some documentation on setting it up in Debian is here [9]. The concept of knockd is that you make connections to a series of ports which act as a password for changing the firewall rules. An attacker who doesn’t know those port numbers won’t be able to connect. Of course anyone who can sniff your network will discover the ports soon enough, but I guess you can always login and change the port numbers once knockd has let you in.
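A minimal /etc/knockd.conf along the lines of the upstream documentation might look like this (the port sequence and the iptables command are examples to adapt, not a recommendation):

```
[options]
        logfile = /var/log/knockd.log

[openSSH]
        sequence    = 7000,8000,9000
        seq_timeout = 5
        tcpflags    = syn
        command     = /sbin/iptables -I INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
```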

Also thanks to Helmut for advice on ssh-keygen.

Trust and My SE Linux Play Machine

Currently my SE Linux Play Machine [1] is running as a Xen DomU, so if someone cracks it they would also have to crack Xen to get access to directly change things on the hardware (EG modifying the boot process). As documented in my last post [2] a user of my Play Machine recently managed to change my password – just two days after the vmsplice() kernel security flaw had been discovered [3]. Any machine that offers shell access to remote users (or the ability to run CGI-BIN scripts or other programs that users can upload) is immediately vulnerable to such exploits. While SE Linux has blocked local kernel exploits in the past [4] there will always be the possibility of kernel exploits that SE Linux can’t block, or which can be re-written to work in a way that is not stopped by the SE Linux policy. So it’s best to assume that SE Linux systems are vulnerable to kernel exploits.

At the time that the vmsplice() exploit was announced there was a claim that it could be used to de-stabilise a Xen Dom0 when run within a DomU. It’s best to assume that any attack which can make some software perform in an unexpected manner can also be used to successfully attack it. So at the time I was working on the assumption that the Dom0 could have been exploited.

Therefore I reinstalled the entire machine. I first installed a new Dom0 (on which I decided to run Debian/Unstable) and then made a fresh install of Etch for the Play Machine. There is a possibility that an attacker could compromise the hardware (changing the BIOS or other similar attacks), but this seems unlikely – I doubt that someone would go to such effort to attack hardware that I use for demonstrating SE Linux and for SE Linux development (it has no data which is secret).

If someone attacks my Play Machine they would have to first get root on the DomU in question and then crack Xen to get access to the hardware. Then the machine is on a separate Ethernet segment which has less access to my internal network than the general Internet does (so they would not gain any real benefit).

One thing an attacker can do is launch a DoS attack on my machine. One summer a Play Machine overheated and died, and I suspect that the extra heat produced by a DoS attack contributed to that problem. But losing a low-end machine I bought second-hand is not a big deal.

When discussing the machine there are two common comments I get. One is a suggestion that I am putting myself at risk, I think that the risk of visiting random web sites is significantly greater. Another is a challenge to put the machine on my internal network if I really trust SE Linux, as noted I have made mistakes in the past and there have been Linux kernel bugs – but apart from that it’s always best to have multiple layers of protection.


SE Linux Play Machine and Passwords

My SE Linux Play Machine [1] has been online again since the 18th of March.

On Monday the 11th of Feb I took it offline after a user managed to change the password for my own account (their comment was “ohls -lsa! i can change passwordls -lsals -lsa HACKED!“). Part of the problem was the way /bin/passwd determines whether it should change a password.

The previous algorithm (and the one that is currently used in Debian/Etch) is that if the UID of the account that is having its password changed doesn’t match the UID of the process that ran /bin/passwd then an additional SE Linux check is performed (to see if it has permission to change other users’ passwords). The problem here is that my Play Machine has root (UID==0) as the guest account, and according to the /bin/passwd program there is no difference between the root account (for unprivileged users) and the bofh account (which I use and which also has UID==0). This means of course that users of the root account could change the password of my account. My solution to this was to run chcon on the /bin/passwd program to give it a context that denied it the ability to change a password. The problem was that I accidentally ran the SE Linux program restorecon (which restores file contexts to their default values), which allowed /bin/passwd to change passwords again, and therefore allowed a user to change the password of my account.

The semanage tool that allows changing the default value of a file context does not permit changing the default for a file specification that matches one from the system policy (so the sys-admin can’t override compiled in values).

I have now fixed the problem. The fix is in my Etch SE Linux repository [2], has been accepted for Debian/Unstable, and something based on it will go into the upstream branch of Shadow. See the Debian bug report #472575 [3] for more information.

The summary of the new code is that in any case where a password is not required to change the user’s password, SE Linux access checks will be performed. The long version is below:

The new algorithm (mostly taken from the Red Hat code base which was written by Dan Walsh) is that you can only change a password if you are running as non-root (which means that the pam_unix.so code will have verified the current password) or if you are running as root and the previous SE Linux security context of the process is permitted to perform the passwd operation in the passwd class (which means it is permitted to change other users’ passwords).

The previous context (the context before one of the exec family of system calls was called) is used for such access checks because we want to determine whether the user’s shell (or other program used to launch /bin/passwd) was permitted to change other users’ passwords – executing a privileged program such as /bin/passwd causes a domain transition, so the context is different from that of the program that was used to execute it. It’s much like a SETUID program calling getuid(2) to get the UID of the process which launched it.

To get the desired functionality for my Play Machine I don’t want a user to change their own password as the account is shared. So I appended password requisite pam_deny.so to the file /etc/pam.d/passwd (as well as the chfn and chsh commands) so that hostile users can’t break things. The new code in /bin/passwd will prevent users from taking over the machine if my PAM configuration ever gets broken, having multiple layers of protection is always a good thing.
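For reference, the resulting /etc/pam.d/passwd ends with a line like the following (the same line goes in the chfn and chsh files):

```
# deny all password changes on this shared-account machine
password requisite pam_deny.so
```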

The end result is that the Debian package and the upstream code base are improved, and my Debian Etch repository has the code in question.

SE Linux Etch Repository for AMD64

My Etch back-port repository of SE Linux related packages (which I documented in a previous post [1]) now has a complete set of packages for AMD64. From now on I aim to make AMD64 and i386 be my main supported platforms for SE Linux development.

There is a guy who may be able to give me a stack of well configured PowerMacs (2gigs of RAM), if he comes through with that then I may add PPC-32 to the list of architectures I support. If that happens then probably the machines will have their hard drives smashed for security reasons, so I’ll want to swap some G3 PowerMacs for hard drives.


Debian SE Linux Status

At the moment I’ve got more time to work on these things than I have had for a while.

I’ve got Etch support going quite well (see my post about my Etch repository [1]), the next step is to back-port some packages for AMD64 to get it working as well as i386.

I’ve got an i386 Xen server for SE Linux development (which is also used for my Play Machine’s [2] DomU – so it’s definitely not for anything secret). I can give accounts and/or DomU’s to people who have a good use for them (the machine has 512M of RAM so could have 4-5 DomU’s).

Currently it seems that the 2.6.24 kernel in Debian doesn’t work for Xen (at least with an i686 CPU). I have filed bug report #472584 about it not working as a DomU [3]. This combined with the fact that according to bug report #466492 it doesn’t work as a Dom0 (which I have verified in my own tests) [4] makes the package linux-image-2.6.24-1-xen-686 unusable.

Due to the inability to use 2.6.24 Xen I can’t do SE Linux development for Lenny in a DomU (Lenny tools build policy version 21 and the Etch kernel I’m using only supports policy version 20). So I have repurposed one of my servers for Lenny (unstable) development. I can give user accounts on that machine to anyone who has a good reason (and there are some people who I would give root access to if they need it).

The current policy packages in Unstable are built without MCS support. This is a problem as converting between a policy which has MCS or MLS and one which doesn’t is rather painful (purge policy, reinstall policy, and reboot are all required steps). I have filed bug report #473048 with a patch for this – my patch may not actually be much good (I don’t understand some aspects of Manoj’s code) but it does achieve the desired result [5]. I won’t be making Apt repositories for such things as I expect that the changes will get into Debian fast enough.

The next thing I am starting to work on is MLS support for Debian (currently it only supports the Strict and Targeted policies). See the Multilevel Security Wikipedia page for some background information on the technology [6].

I don’t expect that many people will use MLS on Debian in production environments, and it wouldn’t surprise me if no-one used it on a production server (although of course it would be impossible to prove this). But I still believe that it’s worth having for educational purposes. I am sure that there are packages in Debian of a similar size that will get less use so it’s not a waste of disk space on mirror servers!

The only real down-side to adding MLS support is that it will increase the build time for the Debian SE Linux policy packages, currently they take 13 minutes to build on a 1.1GHz Celeron system (the Xen server I mentioned previously) and I expect that the machine in question will have build times greater than 20 minutes with MLS included. I will probably need to set up an Unstable DomU on a dual-core 64bit machine for the sole purpose of building policy packages. I will also have to investigate use of the “-j” option to make when building the policy to take advantage of the dual cores. I often do small tweaks to policy and it’s annoying to have to wait for any length of time for a result.
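The effect of the “-j” option is easy to demonstrate with a toy Makefile (this is just an illustration, not the policy build itself): two independent targets that each sleep for a second complete in about one second with “-j2” instead of two.

```shell
# Create a Makefile with two independent targets (recipe lines need TAB indentation)
printf 'all: a b\na:\n\tsleep 1\nb:\n\tsleep 1\n' > /tmp/parallel-demo.mk
# Run both jobs in parallel
time make -f /tmp/parallel-demo.mk -j2
```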

The version of Coreutils that is currently in Unstable will have ls display a “+” character for every file when running SE Linux (I have filed bug report #472590 about this [7]). It is being actively discussed and at this stage it seems most likely that the functionality from Etch in this regard will be restored (using “+” to represent ACLs only, not SE Linux contexts). It seems likely to me that I will find a few other issues of a similar nature now that I have started seriously working on Unstable.

For the benefit of Debian and upstream developers who get involved in such discussions, please do not be put off if you join a discussion that is CC’d to the NSA SE Linux mailing list and have your message rejected by the list server. The code of conduct is much the same as on most mailing lists, and the SE Linux list is not much different to others. The difference is that before you get your email address white-listed for posting you have to agree to the terms of service for the list. The people who run the list server appear to work more than 40 hours a week so there should not be a great delay. If anyone wants to get a message about Debian SE Linux development sent to the list without delay on a weekend then they can send it to me for forwarding.

I am aware of some discussions about SE Linux and the Debian installer. I have not responded to them yet because I wanted to get some serious coding done first as an approach of “I haven’t done much coding recently but trust me I’ll fix the problems for you” might not be accepted well. I will start investigating these issues as soon as I have my Debian/Unstable server working well in enforcing mode.

Update: I’ve just filed bug report #473067 with a patch to enable MLS policy builds [8].


Chilled Memory Attacks

In 1996 Peter Gutmann wrote a paper titled “Secure Deletion of Data from Magnetic and Solid-State Memory” [1]. In that paper he mentions the fact that the contents of RAM last longer at lower temperatures and suggests that data could be retained for weeks at a temperature of -60C or lower (while 140C causes rapid data loss). The paper also addresses issues of data recovery from hard drives, but given that adequate CPU power for encryption is available, recovering data from a disk shouldn’t be an issue unless the attacker can get the key to decrypt it or crack the algorithm – so disk recovery is not a hot issue at the moment.

Recently some researchers at Princeton University have published a paper describing in detail how to chill RAM to make it keep its data after a power cycle and even after being installed in a different computer [2]. This attracted a lot of attention, while Peter’s paper described the theoretical concept (in great detail) the Princeton group showed how to implement the attack using materials that are commonly available.

Most of the analysis of this misses some fundamental points. Any suggestion that you can wipe the RAM on power failure or on boot misses the point entirely. If an attacker can chill a DIMM and then remove it from the system then there is no chance for it to be wiped. Maybe if you had security on the PC case to detect case opening (some servers have a basic version of this) such things would do a little good, but it shouldn’t be difficult to bypass in most cases.

Another common flawed analysis is to suggest that this is no big deal because sniffing the memory bus has been possible for years. While it has always been possible for government agencies and companies who design motherboards to sniff the bus, for most potential attackers it has been overly difficult.

When considering the effectiveness of a security system you should first consider what your threat model is. Who is going to attack you and what resources will they be willing and able to devote to the attack? An organisation that is prepared to use expensive equipment and highly trained people to break your encryption probably has other methods of gaining access to your secret data that are easier and cheaper.

The research from Princeton suggests that I could perform such attacks in my spare time and with equipment that is very cheap. I’ve been idly considering doing this to an old PC just for fun! Therefore I have to assume that everyone who has the same amount of skill and money as me can potentially compromise my data if they capture one of my machines.

It is still most likely that if anyone steals my laptop they will want to sell it and use the money to buy drugs. I don’t think that I have any data that is anywhere near valuable enough to justify a targeted mugging. But my procedures (in terms of changing passwords etc) in the case of my laptop being stolen now need to be scaled up due to the ease in which data might be compromised.

The best way of dealing with this would be to have the decryption keys locked inside the CPU (stored in registers or memory that’s locked in the CPU cache). The possibility of getting a modern CPU to operate at any temperature approaching -60C is laughable, and the CPU is a well contained package that can operate on its own and is difficult to attack. This would make things significantly more difficult for an attacker while requiring little effort (in fact it might be possible to lock data in the CPU cache already in which case a software change is all that is required).

Update: A comment by Mike made a good point about CPU cooling. Tom’s Hardware performed an overclocking experiment (from 3.2GHz to 5.25GHz) using liquid nitrogen cooling [3]. So it might be possible to cool a CPU core to -60C in a reasonably small amount of time. But I still believe that locking keys inside the CPU would raise the bar enough to make it worth doing.

Update2: Thanks Jaime for the spelling advice.


Oracle Unbreakable Linux

Matt Bottrell writes about the Oracle Linux offerings presented at LCA 2008 [1].

The one thing that Oracle does which I really object to is the “unbreakable” part of their advertising. They have pictures of penguins in armour and the only reasonable assumption is that their system is more secure in some way. As far as I am aware they offer no security features other than those which are available in Red Hat Enterprise Linux, CentOS, and Fedora. The unbreakable claims were also made before Oracle even had their own Linux distribution, which gave them even less reason for the claims.

If someone is going to be given credit for making Linux unbreakable then the contributors list for the SE Linux project [2] is one possible starting point. Another possibility is that credit could be given to Red Hat for introducing so many security features to the mainstream Linux users before any other distribution.

In terms of improving the security of databases it’s probably best to give credit to Kaigai Kohei and the PostgreSQL team for Security Enhanced PostgreSQL [3]. I believe that NEC also deserves some credit for sponsoring Kaigai’s work, I am not sure whether NEC directly sponsored his recent work on SE-PostgreSQL but they certainly sponsored his past work (and are credited on the NSA web site for this).

Oracle’s Linux distribution is based on CentOS and/or Red Hat Enterprise Linux (RHEL). The situation with RHEL is that the source is freely available to everyone but binaries are only available to people who pay for support. CentOS is a free recompile of RHEL and a good choice of a distribution if you want a server with long-term support and don’t want to pay Red Hat (I run many servers on CentOS).

While Matt gets most things right in his post there is one statement that I believe to be wrong, he writes “One of the craziest statements I heard during the talk was that Oracle will only support their products running under a VM if it’s within Oracle VM“. My knowledge of Xen causes me to have great concerns about reliability. My conversations with MySQL people about how intensive database servers are and how they can reveal bugs in the OS and hardware are backed up by my own experience in benchmarking systems. Therefore I think it’s quite reasonable to decline to support software running under someone else’s Xen build in the same way as you might refuse to support software running under a different kernel version (for exactly the same reasons).

Matt however goes on to make some very reasonable requests of Oracle. The demand for native packages of Oracle is significant, I can’t imagine official Debian package support appearing in the near future, but RPM support for RHEL etc would make things easier for everyone (including Oracle).

A better installation process for Oracle would also be a good thing. My observation is that most Oracle installations are not used for intensive work and use database features that are a sub-set of what MySQL offers. I’ve seen a few Oracle installations which have no more than three tables! The installation and management of Oracle is a significant cost factor. For example I used to work for a company that employed a full-time Oracle DBA for a database with only a few tables and very small amounts of access (he spent most of his time watching videos of fights and car crashes that he downloaded from the net). Adding one extra salary for a database is a significant expense (although the huge Oracle license fees may make it seem insignificant).

4

Hot Plug and How to Defeat It

Finally I found the URL of a device I’ve been hearing rumours about. The HotPlug is a device to allow you to move a computer without turning it off [1]. It is described as being created for “Government/Forensic customers” but is also being advertised for moving servers without powering them down.

The primary way that it works is by slightly unplugging the power plug and connecting wires to the active and neutral terminals. When mains power is no longer connected it supplies power from a UPS, and when mains power is re-connected the UPS is cut off.

Australian 240 volt 10 amp mains power plug

Modern electrical safety standards in most countries require that exposed pins of a power plug (other than the earth) be shielded to prevent metal objects or the fingers of young children from touching live conductors. The image above shows a recent Australian power plug which has the active and neutral pins protected with plastic such that if the plug is slightly removed there will be no access to live conductors. I have photographed it resting on a keyboard so that people who aren’t familiar with Australian plugs can see the approximate scale.

I’m not sure exactly when the new safer plugs were introduced; a mobile phone I bought just over three years ago has the old-style plug (no shielding) while most things that I have bought since then have the shielded type. In any case I expect that a good number of PCs being used by Australian companies have the old style, as some machines with the older plugs won’t have reached their three year tax write-down period yet.

For a device which has a plug with such shielding they sell kits for disassembling the power lead or removing the power point from the wall. I spoke to an electrician who assured me that he could attach to wires within a power cord with a 100% success rate and without any special tools (saving the $149 of equipment that the HotPlug people offer). Any of these things would need to be implemented by a qualified electrician to be legal, and any electrician who has been doing the job for a while probably has a lot of experience working before the recent safety concerns about “working live”.

The part of the web site which concerns moving servers seems a little weak. It seems to be based on the idea that someone might have servers which don’t have redundant PSUs (IE really cheap machines – maybe re-purposed desktop machines) which have to be moved without any down-time and for which spending $500US on a device to cut the power (plus extra money to pay an electrician to use it) is considered a good investment. The only customers I can imagine for such a device are criminals and cops.

I also wonder whether you could get the same result with a simple switch that cuts from one power source to another. I find that it’s not uncommon for brief power fluctuations to cause the lights to flicker but for most desktop machines to not reboot. So obviously the capacitors in the PSU and on the motherboard can keep things running for a small amount of time without mains power. That should be enough for the power to be switched across to another source. It probably wouldn’t be as reliable but a “non-government” organisation which desires the use of such devices probably doesn’t want any evidence that they ever purchased one…

Now given that such devices are out there, the question is how to work around them. One thing that they advertise is “mouse jigglers” to prevent screen-lock programs from activating. So an obvious first step is to not allow jiggling to prevent the screen-saver. Forcing people to re-authenticate periodically during their work is not going to impact productivity much (of course the down-side is that it offers more opportunities for shoulder-surfing authentication methods).

Once a machine is taken the next step is to delay or prevent an attacker from reading the data. If an attacker has the resources of a major government behind them then they could read the bus of the machine to extract data, and maybe isolate the CPU and send read commands to system memory to extract all data (including the keys for decrypting the hard drive). The only possible defence against that would be to have multiple machines exchanging encrypted heart-beat packets, each configured to immediately shut itself down if all other machines stop sending packets to it. But if defending against an attacker with more modest resources the shutdown period could be a lot longer (maybe a week without a successful login).
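The heart-beat idea can be sketched in a few lines. This is only an illustration: the peer names, the 60 second grace period, and the function names are all hypothetical, and a real deployment would receive authenticated packets on a socket and wipe the disk encryption keys before powering off.

```python
# Minimal sketch of the heart-beat watchdog idea. All names and the
# 60 second grace period are hypothetical, not from any real deployment.
GRACE_SECONDS = 60

def peers_all_silent(last_seen, now, grace=GRACE_SECONDS):
    """Return True if every peer's last heart-beat is older than the
    grace period. last_seen maps a peer name to the timestamp of the
    last valid packet received from it."""
    return all(now - t > grace for t in last_seen.values())

def watchdog_step(last_seen, now):
    """Decide whether to trigger an emergency shutdown. In a real
    system "shutdown" would mean discarding the disk encryption keys
    from memory and powering off, so a stolen running machine only
    stays readable until its peers fall silent."""
    if peers_all_silent(last_seen, now):
        return "shutdown"
    return "ok"
```

Against a less capable attacker the same logic works with a much longer grace period, as described above.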

Obviously an attacker who gets physical ownership of a running machine will try and crack it. This is where all the OS security features we know can be used to delay them long enough to allow an automated shut-down that will remove the encryption keys from memory.

7

Linux Resource Controls

Using the “ulimit” controls over process resource use it is possible to limit the RAM used by processes and to limit the number of processes per UID. The problem is that such limits are often only good for containing accidental problems, not for dealing with malicious acts.

For a multi-user machine each user needs to be allowed at least two processes to be able to do anything (IE the shell and a command that they execute). A more practical limit is five processes for a single shell session (one or two background jobs, a foreground job where one process pipes data to another, and the shell). But even five processes is rather small (a single Unix pipeline can have more than that). A shell server probably needs a limit of 20 processes per user if each user may run multiple logins. For running the occasional memory intensive process such as GCC the per-process memory limit needs to be at least 20M, and if the user is to compile big C++ programs then 100M may be needed (I’ve seen a G++ process use more than 90M of memory when compiling a KDE source file). This means that a single user who can launch 20 processes which can each use 20M of memory could use 400M of memory; if they have each process write to its pages in a random order then 400M of RAM would be essentially occupied by that user.
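The worst-case arithmetic above, and the limits themselves, can be sketched with Python’s resource module (the programmatic equivalent of “ulimit -v” and “ulimit -u” in the shell). The numbers are the illustrative ones from the text, not a recommendation.

```python
import resource

# Illustrative values from the text: up to 20 processes of up to 20M each.
MAX_PROCS = 20
MAX_MEM_BYTES = 20 * 1024 * 1024

def worst_case_bytes(nproc, mem_per_proc):
    """Worst-case RAM one user can pin: every allowed process at its cap."""
    return nproc * mem_per_proc

def apply_limits():
    """Apply the limits to the current process (and its children).
    RLIMIT_AS caps the address space of each process; RLIMIT_NPROC
    counts all processes belonging to the UID, not just this one."""
    resource.setrlimit(resource.RLIMIT_AS, (MAX_MEM_BYTES, MAX_MEM_BYTES))
    resource.setrlimit(resource.RLIMIT_NPROC, (MAX_PROCS, MAX_PROCS))
```

Note that because RLIMIT_NPROC is per UID, 20 processes at 20M each gives the 400M worst case regardless of how many login sessions the user spreads them across.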

If a shell server had 512M of RAM (which until quite recently was considered a lot of memory – the first multi-user Linux machine I ran on the net had 4M of RAM) then 400M of that could be consumed by a single hostile user. Leaving only 100M for the real users might make the machine unusable. Note that the “hostile user” category also encompasses someone who gets fooled by the “here’s a cool program you should run” trick (which is common in universities).

I put my first SE Linux Play Machine [1] on the net in the middle of 2002 and immediately faced problems with DOS attacks. I think that the machine had 128M of RAM and because the concept was new (and SE Linux itself was new and mysterious) many people wanted to login. Having 20 shell users logged in at one time was not uncommon, so a limit of 50 processes for users was minimal. Given that GCC was a necessary part of the service (users wanted to compile their own programs to test various aspects of SE Linux) the memory limit per process had to be high. The point of the Play Machine was to demonstrate that “root” was restricted by SE Linux such that even if all Unix access control methods failed then SE Linux would still control access (with the caveat that a kernel bug still makes you lose). So as all users logged into the same account (root), the process limit had to be adequate to handle all their needs; 50 processes was effectively the bare minimum. 50 processes with 5M of memory each is more than enough to cause a machine with 128M of RAM to swap to death.

One thing to note is that root owned system processes count towards the ulimit for user processes, as the per-UID process limit covers every process with the same UID and SE Linux does not add any resource usage controls. The aim of the SE Linux project is access control, not protection against covert channels [2]. This makes it a little harder to restrict things: the number of processes run by daemons such as Postfix varies a little over time, so the limits have to be a little higher to compensate. While Postfix itself runs with no limits, the processes that it creates count towards the per-UID limit when determining whether user processes can call fork().

So it was essentially impossible to implement any resource limits on my Play Machine that would prevent a DOS. I changed the MOTD (message of the day – displayed at login time) to inform people that a DOS attack is the wrong thing to do. I implemented some resource limits but didn’t seriously expect them to help much (the machine was DOSed daily).

Recently I had a user of my Play Machine accidentally DOS it and ask whether I should install any resource limits. After considering the issue I realised that I can actually do so in a useful manner nowadays. My latest Play Machine is a Xen DomU to which I have now assigned 300M of RAM. I have configured the limit for root processes to be 45; as the system and my login comprise about 30 processes, that leaves 15 for unprivileged (user_r) logins. Of recent times my Play Machine hasn’t been getting a lot of interest and having two people logged in at the same time is unusual, so 15 processes should be plenty. Each process is limited to 20M of memory so overflowing the 300M of RAM should take a moderate amount of effort.
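On a system using pam_limits, limits of this kind could be expressed in /etc/security/limits.conf. The fragment below is a hedged sketch matching the numbers described, not the actual file from my machine; note that the “as” (address space) item takes a value in kilobytes.

```
# Hypothetical /etc/security/limits.conf entries matching the numbers above.
# "as" is the address-space cap per process, in kilobytes (20480K = 20M).
root    hard    nproc    45
root    hard    as       20480
```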

Until now I intentionally did not use swap space on that machine, to save on noise when there’s a DOS attack (on the assumption that the DOS attack would succeed regardless of the amount of swap). Now that I have put resource limits in place I have installed 400M of swap space. A hostile user can easily prevent other unprivileged users from logging in by keeping enough long-running processes active – but they could achieve the same goal with a program that kills users’ shells as soon as they login (which a few people did in the early days). However it should not be trivial for them to prevent me from logging in via a simple memory or process DOS attack.

Update: It was email discussion with Cam McKenzie that prompted this blog post.