
selinux-activate

I have written a script for Debian named selinux-activate which is included in selinux-basics version 0.3.3+nmu1 (which I have uploaded to Debian/Unstable). The script, when run with no parameters, will change the GRUB configuration to include selinux=1 on the kernel command-line and enable SE Linux support in the PAM modules for login, gdm, and kdm. One issue with this is that if you run the command before installing kdm or gdm then you won’t have the PAM configuration changed – but as it’s OK to run the script multiple times this shouldn’t be a problem.
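Since the script has to be safe to run repeatedly, the GRUB edit needs to be idempotent. A minimal sketch of that part (reconstructed from the description above, not the actual selinux-activate code):

```shell
# Append selinux=1 to the "# kopt=" line of a menu.lst style file,
# but only if it is not already there, so repeated runs are harmless.
# This is a sketch of the idea, not the code shipped in selinux-basics.
add_selinux_kopt() {
    menu=$1
    if ! grep -q '^# kopt=.*selinux=1' "$menu"; then
        sed -i 's/^# kopt=.*/& selinux=1/' "$menu"
    fi
}
```

After the kopt line is changed you would run update-grub to regenerate the real boot entries.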

The new selinux-basics package will also force a reboot after relabelling all filesystems. I had tested “umount -a ; reboot -f” but discovered that “reboot -f” causes filesystem corruption in some situations (my EeePC running an encrypted LVM volume on an SD card had this problem). So I now use a regular “reboot“.

If no-one points out any serious flaws I plan to ask the release team to include this version of selinux-basics in Lenny. I believe that it will make it significantly easier to install SE Linux while also reducing the incidence of systems being damaged due to mistakes. If you edit the GRUB configuration file by hand then there is a risk of a typo making a system unbootable.

The package in question is already in my Lenny repository, see my previous post about Lenny SE Linux for details [1].


Installing SE Linux on Lenny

Currently Debian/Lenny contains all packages needed to run SE Linux apart from the policy. The policy package is missing because it needs to sit in unstable for a while before migrating to testing (Lenny), and I keep fixing bugs and uploading new versions.

I have set up my own APT repository for SE Linux packages (as I did for Etch [1]). The difference is that it’s working now (for i386 and AMD64) while I released my Etch repository some time after the release of Etch.

gpg --keyserver hkp://subkeys.pgp.net --recv-key F5C75256
gpg -a --export F5C75256 | apt-key add -

To enable the use of my repository you must first run the above two commands to retrieve and install my GPG key (take appropriate measures to verify that you have the correct key).

deb http://www.coker.com.au lenny selinux

Then add the above line to /etc/apt/sources.list and run “apt-get update” to download the list of packages.

Next run the command “apt-get install selinux-policy-default selinux-basics” to install all the necessary packages and then “touch /.autorelabel” to cause the filesystems to be labeled on the next boot. Edit the file /boot/grub/menu.lst and add “selinux=1” to the end of the line which starts with “# kopt=” and then run the command update-grub to apply this change.
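As a typo in menu.lst is the main risk in this step, it is worth verifying the kopt line before running update-grub. A trivial check of the sort I mean (my own sketch, not part of any package):

```shell
# Succeed only if the commented kopt line in the given menu.lst style
# file carries selinux=1 as a distinct word (so a typo such as
# selinux=11 is caught before update-grub propagates it).
check_selinux_kopt() {
    grep -q '^# kopt=.*\bselinux=1\b' "$1"
}
```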

Then reboot and the filesystems will be relabeled. Init will be running in the wrong context so you have to reboot again before everything is running correctly (I am thinking of having the autorelabel process automatically do the second reboot).

For future reference please use the page on my documents blog – I will update it regularly as needed [2]. This post will not be changed when it becomes outdated in a few days.


SE Linux in Lenny Status

SE Linux is almost ready to use in Lenny. Currently I am waiting on the packages libsepol1 version 2.0.30-2, policycoreutils 2.0.49-3, and selinux-policy-default version 0.0.20080702-4 to make their way to testing. The first two should get there soon, the policy will take a little longer as I just made a new upload today (to make it correctly depend on libsepol1 and also some policy fixes).

Update: libsepol1 version 2.0.30-2 and policycoreutils 2.0.49-3 are now in Lenny (testing). Now I’m just waiting for the policy.

Ideally we would be able to pin the apt repositories to take just the packages we want from Unstable (here is a document on how it’s supposed to work [1]). That doesn’t work, so I also tried setting “APT::Default-Release “stable”;” in /etc/apt/apt.conf (as suggested on IRC). This gave better results than pinning (which seems to not work at all) but it still wanted to take unreasonably large numbers of packages from unstable.
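For reference, this is the sort of pinning configuration that is supposed to achieve it (and which in my tests did not work as advertised – the priorities here are just an example):

```text
# /etc/apt/preferences
Package: *
Pin: release a=testing
Pin-Priority: 900

Package: libsepol1
Pin: release a=unstable
Pin-Priority: 901

Package: policycoreutils
Pin: release a=unstable
Pin-Priority: 901
```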

Currently to get SE Linux in Lenny (Testing) working you must first upgrade everything to the testing versions, then install libsepol1 from Unstable (this is really important as until a few hours ago the policy packages in Unstable didn’t depend on it). Then you install policycoreutils and finally the policy package, which will be selinux-policy-default for almost everyone – I have not tested the MLS package (selinux-policy-mls) and it’s quite likely that it won’t work well.

The policycoreutils package has a bug related to Python libraries [2] which I don’t know how to fix. Any advice would be appreciated. It’s obvious that the package name needs to not contain a hyphen, but it’s not clear what the name should be or where the files should be stored. The release team have been pretty cooperative with my requests so far to get broken things fixed, so hopefully I’ll find a solution to this (and the other similar issues) soon enough to avoid any great inconvenience to them. I’m sure that they will agree that significantly broken packages (which have syntax errors in scripts) need to be fixed before release.

There are also some last minute policy issues that need to be fixed. To properly test this I’m now running the server for my blog and mail server on Lenny with SE Linux. I think that I’m only one policy bug away from running in enforcing mode.

While the situation is pretty difficult at the moment (I’ve had a report forwarded to me from an OLS delegate who tried Lenny SE Linux with the older policy packages and got a bad result), I believe that once Lenny is released we will have the best ever support for SE Linux.

The Debian security team recently released an update to the SE Linux policy packages to match the recent updates to BIND [3]. I was grateful that they did this – and without any significant involvement from me. I was asked to advise on the patch that they had written, I confirmed that it looked good (which took hardly any effort), and they did the rest (which appears to be a moderate amount of work). Given the situation it would have been understandable if they had decided that it was something that could be worked around.

I expect that SE Linux on Lenny will get more users than on Etch, and therefore more issues of this nature will be discovered, so I expect to have more interaction with the Debian security group in future.


Biba and BLP for Network Services

Michael Janke has written an interesting article about data flows in networks [1]. He describes how data from the Internet should be considered to have low integrity (he refers to it as “untrusted”) and how, as you get closer to the more important parts of the system, data needs to be of higher integrity.

It seems to me that his ideas are very similar in concept to the Biba Integrity Model [2]. The Biba model is based around the idea that a process can only write data to a resource that is of equal or lower integrity and only read data from a resource that is of equal or higher integrity; this is often summarised as “no read-down and no write-up“. In a full implementation of Biba the OS would label all data (including network data) according to its integrity level and prevent any communication that violates the model (except of course for certain privileged programs – for example the file or database that stores user passwords must have high integrity, but any user can run the program to change their password). A full Biba implementation would not work for a typical Internet service, but considering some of the concepts of Biba while designing an Internet service should lead to a much better design (as demonstrated in Michael’s post).

While considering the application of Biba to network design it makes sense to also consider the Bell LaPadula model (BLP) [3]. In computer systems designed for military use a combination of Biba and BLP is not uncommon; while a strict combination of those technologies would be an almost insurmountable obstacle to the development of Internet services, I think it’s worth considering the concepts.

BLP is a system that is primarily designed around the goal of protecting data confidentiality. Every process (subject) has a sensitivity label (often called a “clearance”) which is comprised of a sensitivity level and a set of categories, and every resource that a process might access (object) also has a sensitivity label (often called a “classification”). If the clearance of the subject dominates the classification of the object (IE the level is equal or greater and the set of categories is a super-set) then read access is permitted; if the clearance of the subject is dominated by the classification of the object then write access is permitted; and the clearance and classification have to be equal for read/write access to be permitted. This is often summarised as “no write-down and no read-up“.
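The dominance rule is simple enough to express in a few lines. A toy sketch (levels as integers, categories as space-separated words – the label names are mine, not from any real system):

```shell
# Does a subject with level $1 and categories $2 dominate an object
# with level $3 and categories $4?  Dominance means the level is equal
# or greater AND the category set is a superset.
dominates() {
    s_level=$1 s_cats=$2 o_level=$3 o_cats=$4
    [ "$s_level" -ge "$o_level" ] || return 1
    for c in $o_cats; do
        case " $s_cats " in
            *" $c "*) ;;          # category present in the clearance
            *) return 1 ;;        # category missing: no dominance
        esac
    done
    return 0
}
```

With that, BLP’s rules read: read access if the subject dominates the object, write access if the object dominates the subject, and read/write only when the two labels are equal.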

SGI has published a lot of documentation for their Trusted Irix (TRIX) product on the net; the section about mandatory access control covers Biba and BLP [4]. I recommend that most people who read my blog not read the description of how Biba and BLP work, it will just give you nightmares.

The complexity of either Biba or BLP (including categories) is probably too great for consideration when designing network services which have much lower confidentiality requirements (even the loss of a few million credit card numbers is trivial compared to some of the potential results of leaks of confidential military data). But a simpler case of BLP with only levels is worth considering. You might have credit card numbers stored in a database classified as “Top Secret” and not allow less privileged processes to read from it. The data about customers addresses and phone numbers might be classified as “Secret” and all the other data might merely be “Classified”.

One way of using the concepts of Biba and BLP in the design of a complex system would be to label every process and data store in the system according to its integrity and classification/clearance. Then for the situations where data flows to processes with lower clearance, the code could be well designed and audited to ensure that it does not leak data. For situations where data of low integrity (EG data from a web browser) is received by a process of high integrity (EG the login screen), the code would have to be designed and audited to ensure that it correctly parses the data and doesn’t allow SQL injection or other potential attacks.

I expect that many people who have experience with Biba and BLP will be rolling their eyes while reading this. The situation that we are dealing with in regard to PHP and SQL attacks over the Internet is quite different to the environments where proper implementations of Biba and BLP are deployed. We need to do what we can to try and improve things, and I think that the best way of improving things in terms of web application security would involve thinking about clearance and integrity as separate issues in the design phase.

SE Linux Policy Loading

One of the most significant tasks performed by a SE Linux system is loading the “policy“. The policy is the set of rules which determine what actions are permitted by each domain.

When I first started using SE Linux (in 2001) the kernel knew where to find the policy file and would just read the data from disk as soon as it had mounted the root filesystem. Doing such things is generally considered to be a bad idea, but it was an acceptable mechanism for an early release.

One issue is that the policy needs to be loaded very early in the system boot process, before anything significant happens. In the early days the design of SE Linux had no support for a process to change its security context other than by executing another process (similar to the way a non-root process in the Unix access control system can not change its UID, GID, or groups). Although support for this was added later, it was only available at the request of the application (an external process could not change the context of an application without using ptrace – a concept that is too horrible to contemplate) and I am not aware of anyone actually using it. So it’s almost a requirement that there be no more than one active process in the system at the time the policy is loaded; therefore it must be init, or something before init, that loads the policy.

When it was decided that a user-space program had to instruct the kernel to load the policy we had to determine which program should do it and when it should be done, with the constraint that it had to be done early. The most obvious solution to this problem was to load the policy in the initramfs (or initrd as it was known at the time). One problem with this is that the initramfs is limited in size by kernel compilation options and may need to be recompiled to fit a bigger policy. As an experiment to work around this limitation I had a small policy (which covered the domains for init and the scripts needed for the early stages of system boot) loaded in the initramfs and then later in the boot process (some time after the root filesystem was mounted read/write) the full policy was loaded.

A more serious problem with including policy in the initramfs was that it required rebuilding the initramfs every time the policy changed in a significant way. Of course scripts could not determine when a change was significant (neither could most users), so that required needless rebuilds (which wastes time). Even with a small policy for early booting loaded it was still sometimes necessary to change it and update the initramfs. I believe that as a general rule an initramfs should only be rebuilt when a new kernel is installed or when a radical change is made to the boot process (EG moving from single disk to software RAID, changing between AMD and Intel CPU architecture, changing SCSI controller, or anything else that would make the old initramfs not boot the machine). The initramfs that was used to boot my machine is known to actually work; the same can not be said for any new initramfs that I might generate.

But the deciding factor for me was support of machines that did not use an initramfs or initrd (such as the Cobalt machines [1] I own).

To solve these problems I first experimented with a wrapper for init. The idea was to divert the real init to another file name (or use the init= kernel command-line option) and then have the wrapper load the policy before running the real init. I never intended that to be a real solution, just to demonstrate a point. Once I had proven that it was possible to load the policy from user-space before running the real init program it was a small step to patch init to do this.

One slightly tricky aspect of this was in getting the correct security context for init. The policy has always been written to allow a domain transition from kernel_t to init_t when a file of type init_exec_t is executed. The domain kernel_t is applied to all active processes (including kernel threads) at the time the policy is loaded. So init only has to re-exec itself to get the correct context. Fortunately init is designed to do this in the case of an upgrade so this was easy to manage.
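The relevant rules, written out as raw policy rather than the usual macros (a simplified sketch – the real policy grants more permissions than this):

```text
# let processes in kernel_t execute init's binary
allow kernel_t init_exec_t:file { getattr read execute };
allow kernel_t init_t:process transition;

# make the domain transition happen by default on that execution
type_transition kernel_t init_exec_t:process init_t;
```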

Since that time every implementation of SE Linux apart from some embedded systems has used init to load the policy.

The latest trend in Linux distributions seems to be using upstart [2] as a replacement for the old SysV Init. The Fedora developers decided to make nash (a program that comes from the mkinitrd source tree in Fedora and is a utility program for a Red Hat based initramfs) load the SE Linux policy as it would apparently be painful to patch every init to load the policy.

As far as I am aware there are only three different init programs in common use in Linux: the old SysV Init (which used to be used by everyone), Busybox (for rescuing broken systems and for embedded systems), and now Upstart (used by Ubuntu and by Red Hat since Fedora 9). Embedded systems need to work differently to other systems in many ways (having the one Busybox program supply the equivalent of most coreutils in one binary is actually a small difference compared to the other things), and modifying the policy load process for embedded systems is trivial compared to all the other SE Linux work needed to get an embedded system working well. There are at least two commonly used initramfs systems (the Debian and Red Hat ones) and probably others. As one init system (SysV Init) already has SE Linux support it seems that only one needs to be patched to have complete coverage. I’ve just written a patch for Upstart (based on the version in Debian/Experimental) and sent it to an Ubuntu developer who’s interested in such things. I also volunteer to patch any other init system that is included in Debian (I am aware of minit and will patch it as soon as its description does not include “this package is experimental and not easy to install and use“).

It seems to me that repeating the work which was done for SysV Init and Upstart for any other init system will take little effort, at worst no greater than patching an initramfs system (and I’ll do it). As the number of initramfs systems that would need to be patched would exceed the number of init systems, it seems that less work is involved in patching the init systems.

The amount of RAM required by the initramfs is in some situations a limitation on the use of a system, when I recently did some tests on swap performance by reducing the amount of RAM available to a Xen DomU [3] it was the initramfs that limited how small I could go. So adding extra code to the initramfs is not desired. While this will be a small amount of code in some situations (when I patched /sbin/init from Upstart it took an extra 64 bytes of disk space on AMD64), dragging in the libraries can take a moderate amount of space (the fact that an LVM or encrypted root filesystem causes SE Linux libraries to be included in the initramfs is something that I consider to be a bug and plan to fix).

Finally, not all boot loaders support an initrd or initramfs. I believe that any decision which prevents such sweet hardware as the Cobalt Qube and Raq machines from being used with SE Linux is a mistake. I have both Qube and Raq machines running fine with Debian SE Linux and plan to continue making sure that Debian will support SE Linux on such hardware (and anything with similar features and limitations).


New SE Linux Policy for Lenny

I have just uploaded new SE Linux policy packages for Debian/Unstable which will go into Lenny (provided that the FTP masters approve the new packages in time).

The big change is that there are no longer separate packages for strict and targeted policies. There is now a package named selinux-policy-default which has the features of both strict and targeted. When you install it you get the features of targeted. If you want the strict features then you need to run the following commands as root:

semanage login -m -s user_u __default__
semanage login -m -s root root

Then you can logout and login and you get the main benefit of the strict policy (users being constrained). IE you can convert from targeted to strict without a reboot! The above only changes the access for user login sessions (and cron jobs). To fully convert to the strict policy you need to remove the unconfined module with the command “semodule -r unconfined“; currently that results in a system that doesn’t boot – I’m working on this and will have it fixed before Lenny. Also it’s possible to have some users unconfined and some restricted, in the way that the strict policy always did.

When running in the full strict configuration you need to run the command “newrole -r sysadm_r” immediately after logging in as root. When you login you default to staff_r which doesn’t give you the access needed to perform routine sys-admin tasks.

Due to the change in the function of the policy packages (in terms of not having a strict package) it made sense to revise the naming (Fedora 9 has a package named selinux-policy-targeted which also provides the strict configuration – I don’t want to do that and don’t have as much legacy as Fedora). This is why I decided to not have package names that include the word “policy” twice. Of course all policy packages get new names, but the ones that matter needed new names anyway.

Another new feature is the package selinux-policy-mls, as the name suggests this implements Multi Level Security [1]. I don’t expect that the MLS policy will boot in enforcing mode in a regular configuration at this time (you could probably hack it to boot in permissive mode and switch to enforcing mode just before it starts networking). I uploaded it in this state so that people can start testing it (there is a lot of testing that you can do in permissive mode) and so that it can get added to the package list in time for Lenny. I expect that I’ll have it booting shortly (it should not be much more difficult than getting the strict configuration booting).

In terms of the use of MLS, I don’t expect that anyone will want to pay the money needed for LSPP [2] certification. NB The wikipedia page about LSPP really needs some work.

I believe that the main benefit for having MLS in Debian is for the use of students. I periodically get requests from students for advice on how to get a job related to military computer security. Probably the best advice I can offer is to visit the career section of an agency from your government that works on computer security issues, for US readers the NSA careers page is here [3]. The second best advice I can offer is to work on MLS support in your favourite free OS. Not only will you learn about technology that is used in military systems but you will also learn a lot about how your OS works as MLS breaks things. ;)

Finally I’d like to thank Manoj for all his work. For a while I didn’t have time to do much work on SE Linux and he did a lot of good work. Recently he seems to have been busy on other things and I’ve had a little more time so I’m taking over some of it.


Is a GPG pass-phrase Useful?

Does a GPG pass-phrase provide a real benefit to the majority of users?

It seems that there are the following categories of attack which could result in stealing the secret-key data:

  1. User-space compromise of account (EG exploiting a bug in a web browser or IRC client).
  2. System compromise (EG compromising a local account and exploiting a kernel vulnerability to get root access).
  3. Theft of the computer system while powered down when the system was configured to not use swap or to encrypt the swap space with a random key at boot time.
  4. Theft of a computer system that was running, or that did not have encrypted swap.
  5. Theft of unencrypted backup media.

Category 1 will permit an attacker to monitor user processes and intercept one that asks for a GPG pass-phrase as well as to copy the secret key. Category 2 will do the same but for all users on the system.

Category 3 will give the potential for stealing the private key (if it’s not encrypted) but no direct potential for getting the pass-phrase.

Category 4 has the potential for copying a pass-phrase from memory or swap. I am inclined to trust Werner Koch (and anyone else who submitted code to the GPG project) to have written code to correctly lock memory and scrub pass-phrase data and decrypted private key data from memory after use. But I really doubt the ability of most people who write code to interface with GPG to do the same. So every time a GUI program prompts for a GPG pass-phrase I think there is the potential for it to be stored in swap or to remain indefinitely in RAM. Therefore stealing a machine that does not have its swap-space encrypted with a random key (which is the most practical way of encrypting swap) or stealing a running machine (as mentioned in a previous post [1]) can potentially grant a hostile party access to the pass-phrase.
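For reference, encrypting swap with a random key on Debian is just a matter of a crypttab entry (the device name here is an example – make very sure it really is your swap partition, as it gets overwritten on every boot):

```text
# /etc/crypttab -- a fresh key from /dev/urandom each boot, so nothing
# written to swap survives a power-off
cswap  /dev/sda2  /dev/urandom  swap,cipher=aes-cbc-essiv:sha256

# /etc/fstab then refers to the mapped device
/dev/mapper/cswap  none  swap  sw  0  0
```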

So it seems to me that out of all the possible ways of getting access to a GPG private key, the only ones where a pass-phrase is really going to do some good are categories 3 and 5. While it’s good to protect against those situations, it seems to me that the greatest risk to a GPG key is from category 1, with category 2 following close behind.

I previously wrote about the slow progress towards using SE Linux and GPG code changes to make it more difficult to steal the secret key [2] – something that I’ve been occasionally working on over the last 6 years.

Now it seems to me that the same benefits can and should be made available to people who don’t use SE Linux. If a system directory such as /var/spool/gpg was mode 1770 then gpg could be setgid to group “gpg” so that it could create and access secret keys for users under /var/spool/gpg while the users in question could not directly access them. Then the sys-admin would be responsible for backing up GPG keys. Of course it would probably be ideal to have an option as to whether a new secret key would be created in the system spool or in the user home directory, and migrating the key from the user home directory to the system spool would be supported (but not migrating it back).
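A sketch of the directory setup I have in mind (the path, group name, and modes are my proposal – nothing implements this today, and a real deployment would need the chown to root:gpg and the setgid gpg binary, both omitted here because they need root):

```shell
# Hypothetical /var/spool/gpg layout: users can not enter the spool
# themselves, and would reach their keys only via the setgid gpg binary.
setup_gpg_spool() {
    spool=$1 user=$2
    mkdir -p "$spool/$user"
    chmod 0770 "$spool"        # only root and the gpg group may enter
    chmod 0700 "$spool/$user"  # per-user key directory
}
```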

This would mean that an attacker who compromised a local user account (maybe through a vulnerability in a web browser or MUA) would not be able to get the GPG secret key. They could probably get the pass-phrase by ptracing the MUA (or some other GUI process that calls GPG), but without the secret key itself that would not do as much good – of course once they had the pass-phrase and local access they could use the machine to sign and decrypt data, which would still be a bad thing. But it would not have the same scope as stealing the secret key and the pass-phrase.

I look forward to reading comments on this post.


Kernel Security vs Uptime

For best system security you want to apply kernel security patches ASAP. For an attacker, gaining root access to a machine is often a two-step process: the first step is to exploit a weakness in a non-root daemon or take over a user account, and the second step is to compromise the kernel to gain root access. So even if a machine is not used for providing public shell access or any other task which involves giving access to potentially hostile people, having the kernel be secure is an important part of system security.

One thing that gets little consideration is the effect of applying security updates on overall uptime. Over the last year there have been 14 security related updates (I count a silent data loss bug along with security issues) to the main Debian Etch kernel package. Of those 14, it seems that if you don’t use DCCP, NAT for CIFS or SNMP, IA64, or the dialout group, then you will only need to patch for issues 2, 3 (for SMP machines), 4, 5, 7 (sound drivers get loaded on all machines by default), 9, 10, 11, 12, 13, and 14.

This means 11 reboots a year for SMP machines and 10 a year for uni-processor machines. If a reboot takes three minutes (which is an optimistic assumption) then that would be 30 or 33 minutes of downtime a year due to kernel upgrades. In terms of uptime we talk about the number of “nines”, where the ideal is generally regarded as “five nines” or 99.999% uptime. 33 minutes of downtime a year for kernel upgrades means that you get 99.993% uptime (which is “four nines”). If a reboot takes six minutes (which is not uncommon for servers) then it’s 99.987% uptime (“three nines”).
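The arithmetic above, as a small calculator (truncating rather than rounding, since what matters is how many nines you can actually claim):

```shell
# Convert minutes of downtime per year into percentage uptime,
# truncated to three decimal places.
uptime_pct() {
    awk -v m="$1" 'BEGIN {
        pct = 100 * (1 - m / (365 * 24 * 60))
        printf "%.3f\n", int(pct * 1000) / 1000
    }'
}
```

For example uptime_pct 33 prints 99.993 and uptime_pct 66 prints 99.987, matching the figures above.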

While it doesn’t seem likely to affect the number of “nines” you get, not using SMP has the potential to avoid future security issues. So it seems that when using Xen (or another virtualisation technology), assigning only one CPU to the DomUs that don’t need more could improve uptime for them.

For Xen Dom0s which don’t have local users or daemons, and don’t use DCCP, NAT for CIFS or SNMP, wireless, CIFS, JFFS2, PPPoE, bluetooth, H.323 or SCTP connection tracking, only issue 11 applies. However for “five nines” you need to have 5 minutes of downtime a year or less. It seems unlikely that a busy Xen server can be rebooted in 5 minutes, as all the DomUs need to have their memory saved to disk (writing out the data and reading it back in after a reboot will probably take at least a couple of minutes) or they need to be shut down and booted again after the Dom0 is rebooted (which is a good procedure if the security fix affects both Dom0 and DomU use), and such shutdowns and reboots of DomUs will take a lot of time.

Based on the past year, it seems that a system running as a basic server might get “four nines” if configured for a fast boot (it’s surprising that no-one seems to be talking about recent improvements to the speed of booting as high-availability features) and if the boot is slower then you are looking at “three nines”. For a Xen server unless you have some sort of cluster it seems that “five nines” is unattainable due to reboot times if there is one issue a year, but “four nines” should be easy to get.

Now while the 14 issues over the last year for the kernel seems likely to be a pattern that will continue, the one issue which affects Xen may not be representative (small numbers are not statistically significant). I feel confident in predicting a need for between 5 and 20 kernel updates next year due to kernel security issues, but I would not be prepared to bet on whether the number of issues affecting Xen will be 0, 1, or 4 (it seems unlikely that there would be 5 or more).

I will write a future post about some strategies for mitigating these issues.

Here is my summary of the security updates to the Debian kernel package linux-image-2.6.18-6-686 (the Etch kernel) according to its changelog. They are not in chronological order; it’s the order of the changelog file.


SE Linux Support in GPG

In May 2002 I had an idea for securing access to GNUPG [1]. What I did was to write SE Linux policy to only permit the gpg program to access the secret key (and other files in ~/.gnupg). This meant that the most trivial ways of stealing the secret key would be prevented. However an attacker could still use gpg to encrypt its secret key and write the data to some place that is accessible, for example with the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg“. So what we needed was for gpg to either refuse to encrypt such files, or to spawn a child process for accessing them (which could be granted different access to the filesystem). I filed Debian bug report 146345 [2] requesting this feature.

In March upstream added this feature. The Debian package is currently not built with --enable-selinux-support, so this feature isn’t enabled yet, but hopefully it will be soon. Incidentally the feature as currently implemented is not really SE Linux specific; it seems to me that there are many potential situations where it could be useful without SE Linux. For example if you were using one of the path-name based MAC systems (which I dislike – see what my friend Joshua Brindle wrote about them for an explanation [3]) then you could gain some benefits from this. A situation where there is even smaller potential for benefit is in the case of an automated system which runs gpg and could allow an attacker to pass bogus commands to it. When exploiting a shell script it might be easier to specify the wrong file to encrypt than to perform more sophisticated attacks.

When the feature in question is enabled the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg” will abort with the following error:
gpg: can’t open `/root/.gnupg/secring.gpg’: Operation not permitted
gpg: symmetric encryption of `/root/.gnupg/secring.gpg’ failed: file open error

Of course the command “gpg --export-secret-keys” will abort with the following error:
gpg: exporting secret keys not allowed
gpg: WARNING: nothing exported

Now we need to determine the correct way of exporting secret keys and modifying the GPG configuration. It might be best to allow exporting the secret keys when not running SE Linux (or another supported MAC system), or when running in permissive mode (as in those situations merely copying the files will work anyway). Although we could still have an option in gpg.conf for this, for the case where we want to prevent shell-script quoting hacks.

For editing the gpg.conf file and exporting the secret keys we could have a program similar in concept to crontab(1) which has PAM support to determine when it should perform its actions. Also it seems to me that crontab(1) could do with PAM support (I’ve filed Debian bug report 484743 [4] requesting this).
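Such a helper could get a PAM service file of its own to decide who may export keys. A hypothetical /etc/pam.d/gpg-admin (the service name and the stack below are assumptions to illustrate the idea, not an existing interface) might look like:

```
# /etc/pam.d/gpg-admin - hypothetical PAM stack for a crontab-style GPG helper
auth     required   pam_unix.so
account  required   pam_unix.so
```

The point is that the policy decision (who may export, under what circumstances) moves into PAM configuration rather than being hard-coded in gpg.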

Finally one thing that should be noted is that the targeted policy for SE Linux does not restrict GPG (which runs in the unconfined_t domain). Thus most people who use SE Linux at the moment aren’t getting any benefits from such things. This will change eventually.

15

IPSEC is Pain

I’ve been trying to get ipsec to work correctly as a basic VPN between two CentOS 5 systems. I set up the ipsec devices according to the IPSEC section of the RHEL4 security guide [1] (which is the latest documentation available, and it seems that nothing has changed since). The documentation is quite good, but getting IPSEC working is really difficult. One thing I really don’t like about IPSEC is the way that it doesn’t have a device; I prefer my VPN to have its own device so that I can easily run tcpdump on the encrypted or the unencrypted stream (or two separate tcpdump sessions) if that’s necessary to discover the cause of a problem.

I’ve got IPSEC basically working, and I probably could get it working fully, but it doesn’t seem worth the effort.

While fighting with IPSEC at the more complex site (multiple offices which each have multiple networks, and switching paths to route around unreliable networks) I set up an IPSEC installation at a very simple site (two offices within 200 meters, both with static public IP addresses, and no dynamic routing or anything else complex). The simple site STILL doesn’t work as well as desired; one issue that still concerns me is an arbitrary MTU limit in some routing gear which (for some reason I haven’t diagnosed yet) loses packets if I use an MTU over 1470 bytes.
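One way to confirm such an MTU limit is to ping across the link with the don’t-fragment bit set and step the packet size up until packets disappear (Linux ping syntax; the target address below is a placeholder):

```shell
# 1442 bytes of ICMP payload + 28 bytes of ICMP/IP headers = 1470 byte packets.
# With DF set, packets larger than the path MTU are dropped rather than
# fragmented, so the point where replies stop reveals the effective MTU.
ping -M do -s 1442 -c 3 203.0.113.1
ping -M do -s 1472 -c 3 203.0.113.1
```

If the 1470 byte packets get through and the 1500 byte ones don’t, the gear in the middle is the culprit.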

So today I set up a test network with OpenVPN [2]. It was remarkably simple, the server config file (/etc/openvpn/server.conf) is as follows:
dev tun
ifconfig 10.8.0.1 10.8.0.2
secret static.key
comp-lzo

This means that the IP address 10.8.0.1 will be used for the “server” end of the tunnel, and 10.8.0.2 is the “client” end. The secret is stored in /etc/openvpn/static.key (which is the same on both machines and is generated by “openvpn --genkey --secret static.key“).
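Generating and distributing the shared key is simple; a sketch of the steps (the host name vpn-client is a placeholder):

```shell
# Generate the static key on one machine, restrict its permissions, and copy
# it to the peer.  "vpn-client" is a placeholder host name.
openvpn --genkey --secret /etc/openvpn/static.key
chmod 600 /etc/openvpn/static.key
scp /etc/openvpn/static.key vpn-client:/etc/openvpn/static.key
```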

The client config file (/etc/openvpn/client.conf) is as follows:
remote 10.5.0.2
dev tun
ifconfig 10.8.0.2 10.8.0.1
secret static.key
comp-lzo

Then I enable IP forwarding on both VPN machines, open UDP port 1194 (the command “lokkit -q -p 1194:udp” does this), and start the daemon on each end. The script /etc/init.d/openvpn (in Dag Wieers’ package for CentOS 5 – which I believe is the standard script) will take every file matching /etc/openvpn/*.conf as a service to start.
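Spelled out, the per-machine setup is just the standard sysctl plus the firewall and init script commands (a sketch; the sysctl.conf edit is the usual way to make it persistent):

```shell
# Enable IPv4 forwarding immediately and across reboots.
sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf

# Open the OpenVPN port and start every /etc/openvpn/*.conf as a service.
lokkit -q -p 1194:udp
service openvpn start
```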

The end result is a point to point link that I can easily route other packets through, and I can easily get dynamic routing daemons to add routes pointing to it. It’s nothing like the IPSEC configuration, where the config file needs to have the IP address range hard-coded; I can just add routes whenever I want.
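For example, sending an extra office subnet over the tunnel is just an ordinary route pointing at the far end of the point to point link (the 192.168.10.0/24 subnet below is a placeholder):

```shell
# On the server, route a subnet behind the client via the tunnel peer address.
ip route add 192.168.10.0/24 via 10.8.0.2 dev tun0
```

A routing daemon can add and withdraw such routes dynamically with no change to the OpenVPN configuration.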

This isn’t necessarily going to be the way I deploy it. The documentation notes that a disadvantage is “lack of perfect forward secrecy — key compromise results in total disclosure of previous sessions“. But what gives me confidence is the fact that it was so easy to get it going; if I have problems in adding further features to the configuration it should be easy to debug. As opposed to IPSEC, where it’s all complex, and if it doesn’t work then it’s all pain.

Also I tested this out with four Xen DomU’s, two running as VPN routers and two as clients on separate segments of the VPN. They were connected with three bridge devices. I’ll blog about how to set this up if there is interest.