Can you run SE Linux on a Xen Guest?

I was asked “Can you run SELinux on a XEN guest without any problem?”. In a generic sense the answer is of course YES: Xen allows you to run Linux kernels with all the usual range of features, and SE Linux isn’t a particularly difficult feature to enable. I do most of my SE Linux development and testing on virtual machines, and until recently I didn’t have any hardware suitable for running KVM, so in the last few years I’ve done more SE Linux testing on Xen than on non-virtual machines. My SE Linux Play Machine [1] (which will be online again tomorrow) is one SE Linux system running under Xen.

But the question was asked in the context of my blog post comparing the prices of virtual hosting providers [2], which changes things.

Both Linode and Slicehost (the two virtual hosting providers that my clients use) provide kernels without SE Linux support: the command “grep selinux /proc/filesystems” (which is the easiest way to test for SE Linux support) gives no output on them. I am not aware of any virtual hosting company that provides SE Linux support.
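The check above can be wrapped in a small shell function for use in scripts. This is only a minimal sketch: the function name has_selinux_support and its optional file argument are made up for illustration, and it looks for the selinuxfs entry that a kernel with SE Linux support lists in /proc/filesystems.

```shell
# Return success if the kernel lists selinuxfs as a known filesystem.
# The optional argument (an assumption, added to make testing easy) names
# the file to inspect; it defaults to /proc/filesystems.
has_selinux_support() {
  grep -qw selinuxfs "${1:-/proc/filesystems}"
}

# Example use:
#   if has_selinux_support; then echo "kernel has SE Linux support"; fi
```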

If anyone knows of a virtual hosting company that runs Xen or KVM virtual machines with SE Linux support then please let me know, I’ll write a blog post comparing such companies if there are some.

For the people who work at ISPs: If your company supports SE Linux virtual machines then I would be happy to review your service, just give me a free DomU for a couple of weeks so I can test it out. If your company is considering offering such virtual machines then I would be happy to have a confidential discussion about the issues that you will face. While I am available for paid consulting work in this area, I am more than happy to spend an hour or two, without expecting to be paid, helping a company that’s going to help support my favorite free software project. But I have to note that if a dozen hosting companies happen to want advice I won’t be able to provide two hours of free advice to each of them.

I think that there is an unsatisfied market demand for SE Linux virtual machines. I don’t expect all virtual hosting companies to support it in the near future, but this will make it more profitable for those that do. If for the sake of discussion we assume that 5% of sysadmins who are making purchasing decisions regarding virtual servers really want to have SE Linux support and if 5% of virtual hosting companies were to offer such support, then those hosting companies would almost double their market share as a result of supporting SE Linux. It’s the usual economic factors relating to small companies that profit from providing good support for the needs of a minority of customers.
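To make the arithmetic concrete, here is a rough sketch in shell. It rests on simplifying assumptions of mine rather than on any data: customers who want SE Linux all go to a provider that offers it, and everyone else is spread evenly across all providers, so the providers that offer SE Linux still get their proportional share of the remaining customers. The function name is hypothetical.

```shell
# selinux_provider_share WANT OFFER
#   WANT:  fraction of customers who insist on SE Linux (0.05 in the text)
#   OFFER: fraction of providers that offer it (0.05 in the text)
# Prints the combined market share, in percent, of the providers offering
# SE Linux under the simplifying assumptions described above.
selinux_provider_share() {
  awk -v want="$1" -v offer="$2" \
    'BEGIN { printf "%.2f\n", (want + (1 - want) * offer) * 100 }'
}

selinux_provider_share 0.05 0.05
```

With both fractions at 5% this prints 9.75, against a baseline share of 5% for those providers, which is where “almost double” comes from.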

Can SE Linux Implement Traditional Unix Users and Groups?

I was asked by email whether SE Linux could implement traditional Unix users and groups.

The Strictly Literal Answer to that Question

The core of the SE Linux access control is the domain-type model where every process has a domain and every object that a process can access (including other processes) has a type. Domains are not strongly differentiated from types.

It would be possible to create a SE Linux policy that created a domain for every combination of UID and GID that is valid for a user shell given that such combinations are chosen by the sysadmin who could limit them to some degree. There are about 2^32 possible UIDs and about 2^32 possible GIDs, as every domain is listed in kernel memory we obviously can’t have 2^64 domains, but we could have enough to map a typical system that’s in use. Of course the possible combinations of supplemental groups probably makes this impossible for even relatively small systems, but we can use a simpler model that doesn’t emulate supplemental groups.

For files there are more possible combinations because anyone who is a member of a group can create a SETGID directory and let other users create files in it. But in a typical system the number of groups is not much greater than the number of users – the maximum number of groups is typically the number of users plus about 60. So if we had 100 users then the number of combinations of UID and GID would be something like 100*(100+60)=16,000 – it should be possible to have that many domains in a SE Linux policy (but not desirable).

Then all that would be needed is rules specifying that each domain (which is based on a combination of UID and GID) can have certain types of access to certain other types based on them having either the same UID or the same GID.

Such a policy would be large, it would waste a lot of kernel memory, it would need to be regenerated whenever a user is added, and it’s generally something you don’t want to use. No-one has considered implementing such a policy, I merely describe it to illustrate why certain configuration options are not desirable. The rest of this post is about realistic things that you can do with SE Linux policy and how it will be implemented in Debian/Squeeze.

My previous post titled “Is SE Linux Unixish?” addresses this issue at a more conceptual level and also considers MAC vs DAC [1].

The History of mapping Unix Accounts to SE Linux Access Control

In the early releases of SE Linux (long before it was included in Fedora) every user who could login to a system needed to have their user-name compiled into the policy. The policy specified which roles the user could access, the roles specified which domains could be accessed, and therefore what the user could do. The identity of files on disk was used for two purposes. One was logging (you could see who created a file). The other was a permission check for the SE Linux patched version of Vixie cron, which would not execute any command on behalf of a user unless the identity on the crontab file in the cron spool matched the identity used to run it – this is analogous to the checks that Vixie cron makes on the Unix UID of the crontab file (some other cron daemons do fewer checks).

Having to recompile policy source every time you added a user was annoying. So a later development was to allow arbitrary mappings between Unix account names and SE Linux Identities (which included a default identity) and another later development was to have a utility program semanage to map particular Unix account and group names to SE Linux identities. This was all done years ago. Fedora Core 5 which was released in 2006 had the modular policy which included these features (apart from mapping Unix groups to SE Linux identities which was more recent).

Fedora Core 5 also introduced MCS which was comprised of a set of categories that a security context may have. The sysadmin would configure the set of categories that each account would have.

A recent development has been a concept named UBAC (User Based Access Control) which basically means that a process running directly on behalf of a regular user (IE with a SE Linux identity that’s not system_u) can only access files that have an identity of system_u or which have the same identity as the process. This means that you can only access your own files or system files – not files of other users which may have inappropriate Unix permissions. So for example if a user with a SE Linux identity of “john” gives their home directory the Unix permission mode of 0777 then a user with a SE Linux identity of “jane” can’t access their files. Of course this means that if you have a group of people working together on a project then they probably need to all have the same Identity and in practice you would probably end up with everyone having the same identity. I’ve given up on the idea of using UBAC in Debian.
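The UBAC rule described above can be modelled in a few lines of shell. This is only a sketch of the rule as I have described it, not the constraint from the reference policy; the function name ubac_allows is made up, and it ignores the domain-type rules and MCS which are checked as well.

```shell
# ubac_allows PROCESS_IDENTITY FILE_IDENTITY
# Succeeds if the UBAC rule, as described in the text, permits the access:
#   - processes with the identity system_u are not restricted by UBAC
#   - files with the identity system_u can be accessed by anyone
#   - otherwise the two identities must match
ubac_allows() {
  [ "$1" = "system_u" ] && return 0
  [ "$2" = "system_u" ] && return 0
  [ "$1" = "$2" ]
}

# Example use:
#   ubac_allows jane john || echo "jane can't access john's files"
```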

The Current Plan for Users and SE Linux Access Control in Debian

My plan is to have things work in Squeeze in much the same way as in Lenny.

You have a SE Linux identity assigned to a login session and everything related to it (including cron jobs) based on the Unix account name or possibly the Unix group name (if there are login entries for both the user-name and the group-name then the user-name entry has precedence). The mapping between Unix accounts and SE Linux identities is configured by the sysadmin and SE Linux identities don’t matter much for the Targeted configuration (which is what most people use).

The identity determines which roles may be used and also has a limit on the MCS categories. The MCS categories are also specified in the login configuration which has to be a sub-set of the categories used by the identity record.

So for example the following is the output of a couple of commands run on a Debian/Unstable system. They show that the “test” Unix account is assigned a SE Linux identity of “staff_u” and an MCS range of “s0-s0:c1” (this means that it creates files at level “s0” by default and can write to other files at that level, and that it can also have read/write access to files at the level “s0:c1”). The “staff_u” identity (as shown in the output of “semanage user -l”) can be used with all categories in the set “s0:c0.c1023” (where the dot means the set of categories from c0 to c1023 inclusive), but in the case of the “test” user only one category will be used. The “test” group however (as expressed with “%test”) is given the identity “user_u” and is not permitted to use any categories.

# semanage login -l
Login Name    SELinux User    MLS/MCS Range            
%test         user_u          s0                       
__default__   unconfined_u    s0-s0:c0.c1023           
root          unconfined_u    s0-s0:c0.c1023           
system_u      system_u        s0-s0:c0.c1023           
test          staff_u         s0-s0:c1               

# semanage user -l
             Labeling   MLS/       MLS/                          
SELinux User Prefix     MCS Level  MCS Range         SELinux Roles
root         sysadm     s0         s0-s0:c0.c1023    staff_r sysadm_r system_r
staff_u      staff      s0         s0-s0:c0.c1023    staff_r sysadm_r
sysadm_u     sysadm     s0         s0-s0:c0.c1023    sysadm_r
system_u     user       s0         s0-s0:c0.c1023    system_r
unconfined_u unconfined s0         s0-s0:c0.c1023    system_r unconfined_r
user_u       user       s0         s0                user_r

I hope to get the policy written to support multiple user roles in time for the release of Squeeze. If I don’t make it then I will put such a policy on my own web site and try to get it included in an update. The policy currently basically works for a Targeted configuration where the users are not confined (apart from MCS).

How MMCS Basically Works

The vast majority of SE Linux users run with the MCS policy rather than the MLS policy. For Debian I have written a modified version of MCS that I call MMCS. MMCS is mandatory (you can’t relabel files to escape it) and it prevents write-down.

If a process has the range s0-s0:c1,c3 then it has full access to files labelled as s0, s0:c1, s0:c3, and s0:c1,c3 – and any files it creates will be labeled as s0.

If a process has the range s0:c1-s0:c1,c3 then it has read-only access to files labelled as s0 and s0:c3 and read-write access to files labelled as s0:c1 and s0:c1,c3. This means that any secret data it accesses that was labelled with category c1 can’t be leaked down to a file that is not labelled with that category.
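The two examples above follow one rule, which can be modelled in a few lines of shell. This is only my sketch of the rule as described here, not the actual policy constraints. The function names are made up, a range is passed as two comma-separated category lists (the low and high ends), the sensitivity is assumed to always be s0, the dot notation for category ranges such as c0.c1023 is not handled, and I assume that a file with a category outside the high end of the range gets no access at all.

```shell
# subset LIST_A LIST_B
# Succeeds if every category in comma-separated LIST_A is also in LIST_B.
# An empty LIST_A is a subset of anything.
subset() {
  local c
  local IFS=,
  for c in $1; do
    case ",$2," in
      *",$c,"*) ;;
      *) return 1 ;;
    esac
  done
  return 0
}

# mmcs_access LOW HIGH FILE
# LOW and HIGH are the category lists of the two ends of the process range,
# FILE is the category list of the file. Prints rw, ro or none.
mmcs_access() {
  local low=$1 high=$2 file=$3
  if ! subset "$file" "$high"; then
    echo none     # file has a category the process is not cleared for
  elif subset "$low" "$file"; then
    echo rw       # writing can't drop any category of the low end
  else
    echo ro       # writing would be a write-down, so read-only
  fi
}
```

For the range s0:c1-s0:c1,c3 the call “mmcs_access c1 c1,c3 c3” gives ro and “mmcs_access c1 c1,c3 c1” gives rw, while for the range s0-s0:c1,c3 the call “mmcs_access "" c1,c3 c3” gives rw.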

Now MCS currently has no network access controls, so there’s nothing stopping a user from using scp or other network utilities to transfer files. But that’s the way with most usable systems. I don’t think that this is necessarily a problem, the almost total lack of network access controls in a traditional Unix model doesn’t seem to concern most people.

Now to REALLY Answer that Question

SE Linux is a Mandatory Access Control (MAC) system, this makes it inherently different to a Discretionary Access Control (DAC) system such as traditional Unix access controls.

Unix permissions are based on each file having a UID, a GID, and a set of permissions and each process having a UID, a GID, and a set of supplementary GIDs. If a user runs a setuid or setgid program then the process will have extra privileges. Unix also has a lot of features that most people aren’t aware of such as real vs effective UIDs, the sticky bit, setgid directories, and lots more – including some quite arbitrary things like making ports <1024 special.

SE Linux is based on every object (process, file, socket, etc) having a single security label which includes an identity, a role, a type, and a sensitivity label (MCS categories or an MLS range). There is no support for an object to have more than one label. The SE Linux equivalent to setuid/setgid files is a label for a file which triggers a domain transition when it’s executed. This differs from setuid files in that the domain transition is complete (the old privileges can’t be restored) and the transition is generally not to a strict super-set of the access (usually a different sub-set of possible access and sometimes to lesser access).

These differences are quite fundamental. So really SE Linux can’t implement traditional Unix access control. What SE Linux is designed to do is to provide a second layer of defense and to also provide access controls that have different aims than those of Unix permissions – such as being mandatory and implementing features such as MLS.

Logging in as Root

Martin Meredith wrote a blog post about logging in as root and the people who so strongly advocate against it [1]. The question is whether you should ssh directly to the root account on a remote server or whether you should ssh to a non-root account and use sudo or su to gain administrative privileges.

Does sudo/su make your system more secure?

Some years ago the administrator of a SE Linux Play Machine used the same home directory for play users to login as for administrative logins as for his own logins – he used newrole to gain administrative access (like su or sudo but for SE Linux).

His machine was owned by one of his friends who created a shell function named newrole in one of his login scripts that used netcat to send the administrative password out over the net. He didn’t realise that this was a problem until his friend changed the password and locked him out! This is one example of a system being 0wned due to having the double-authentication – of course if he had logged in directly with administrative privs while using the same home directory that the attacker could write to then he would still have lost, but the attacker would have had to do a little more work.

When you login you have lots of shell scripts run on your behalf which have the ability to totally control your environment. If someone has taken over those scripts then they can control everything you see, and when you think you are running sudo or something similar they can get the password. When you ssh in to a server your security relies on the security of the client end-point, the encryption of the ssh protocol (including keeping all keys secure to prevent MITM attacks), and the integrity of all the programs that are executed before you have control of the remote system.

One benefit for using sshd to spawn a session without full privileges is in the case where you fear an exploit against sshd and are running SE Linux or some other security system that goes way beyond Unix permissions. It is possible to configure SE Linux in the “strict” configuration to deny administrative rights to any shell that is launched directly by the sshd. Therefore someone who cracks sshd could only wait until an administrator logs in and runs newrole and they wouldn’t be able to immediately take over the system. If the sysadmin suspected that a sshd compromise is possible then a sysadmin could login through some other method (maybe visit the server and login at the console) to upgrade the sshd. This is however a very unusual scenario and I suspect that most people who advocate using sudo exclusively don’t use a SE Linux strict configuration.

Does su/sudo improve auditing?

If you have multiple people with root access to one system it can be difficult to determine who did what. If you force everyone to use su or sudo then you will have a record of which Unix account was used to start the root session. Of course if multiple people start root shells via su and leave them running then it can be difficult to determine which of the people who had such shells running made the mistake – but at least that reduces the list of suspects.

If you put “PermitUserEnvironment yes” in /etc/ssh/sshd_config then you have the option of setting environment variables by ssh authorized_keys entries, so you could have an entry such as the following:

environment="ORIG_USER=john@example.com" ssh-rsa AAAAB3Nz[…]/w== john@example.com

Then you could have the .bashrc file (or a similar file for your favorite shell) have code such as the following to log the relevant data to syslogd:

# No tty was allocated, so ssh was used to run a single command
if [ -z "$SSH_TTY" ]; then
  logger -p auth.info "user $ORIG_USER ran command \"$BASH_EXECUTION_STRING\" as root"
else
  # Interactive login, record which tty the session is on
  logger -p auth.info "user $ORIG_USER logged in as root on tty $(tty)"
fi

I think that forcing the use of su or sudo might improve the ability to track other sysadmins if the system is not well configured. But it seems obvious that the same level of tracking can be implemented in other ways with a small amount of effort. It took me about 30 minutes to devise the above shell code and configuration options, and it should take people who read this blog post about 5 minutes to implement it (or maybe 10 minutes if they use a different shell or have some other combination of Bash configuration that results in non-obvious use of initialisation scripts, EG if you have a .bash_profile file then .bashrc may not be executed).

Once you have the above infrastructure for logging root login sessions it wouldn’t be difficult to run a little script that asks the sysadmin “what is the purpose for your root login” and logs what they type. If several sysadmins are logged in at the same time and one of them describes the purpose of their login as “to reconfigure LDAP” then you know who to talk to if your LDAP server stops working!
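A minimal sketch of such a script follows. The function names format_purpose_msg and ask_root_purpose and the wording of the log message are made up for illustration, and it relies on the ORIG_USER variable being set from the authorized_keys entry as shown earlier.

```shell
# format_purpose_msg USER PURPOSE
# Builds the text of the log message. Kept separate from the prompt so the
# format can be checked without writing anything to syslog.
format_purpose_msg() {
  printf 'user %s root login purpose: %s' "${1:-unknown}" "$2"
}

# Ask the sysadmin why they logged in and send the answer to syslogd.
# Intended to be called from .bashrc for interactive logins only.
ask_root_purpose() {
  local purpose
  printf 'What is the purpose of your root login? ' >&2
  IFS= read -r purpose || purpose=""
  logger -p auth.info "$(format_purpose_msg "$ORIG_USER" "$purpose")"
}
```

Calling ask_root_purpose in the else branch of the earlier .bashrc code would restrict the question to interactive sessions, so commands run directly via ssh are not blocked waiting for input.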

Should you run commands with minimum privilege?

It’s generally regarded that running each command with the minimum privilege is a good idea. But if the only reason you login to a server is to do root tasks (restarting daemons, writing to files that are owned by root, etc) then there really isn’t a lot of potential for achieving anything by doing so. If you need to use a client for a particular service (EG a web browser to test the functionality of a web server or proxy server) then you can login to a different account for that purpose – the typical sysadmin desktop has a dozen xterms open at once, using one for root access to do the work and another for non-root access to do the testing is probably a good option.

Can root be used for local access?

Linux Journal has an article about the distribution that used to be known as Lindows (later Linspire) which used root as the default login for desktop use [2]. It suggests using a non-root account because “If someone can trick you into running a program or if a virus somehow runs while you are logged in, that program then has the ability to do anything at all” – of course someone could trick you into running a program or virus that attempts to run sudo (to see if you enabled it without password checks) and if that doesn’t work waits until you run sudo and sniffs the password (using pty interception or X event sniffing). The article does correctly note that you can easily accidentally damage your system as root. Given that the skills of typical Linux desktop users are significantly less than those of typical system administrators it seems reasonable to assume that certain risks of mistake which are significant for desktop users aren’t a big deal with skilled sysadmins.

I think that it was a bad decision by the Lindows people to use root for everything due to the risk of errors. If you make a mistake on a desktop system as non-root then if your home directory was backed up recently and you use IMAP or caching IMAP for email access then you probably won’t lose much of note. But if you make a serious mistake as root then the minimum damage is being forced to do a complete reinstall, which is time consuming and annoying even if you have the installation media handy and your Internet connection has enough quota for the month to complete the process.

Finally there are some services that seek out people who use the root account for desktop use. Debian has some support channels on IRC [3] and I decided to use the root account from my SE Linux Play Machine [4] to see how they go. #debian has banned strings matching root. #linpeople didn’t like me because “Deopped you on channel #linpeople because it is registered with channel services”. #linuxhelp and #help let me in, but nothing seemed to be happening in those channels. Last time I tried this experiment I had a minor debate with someone who repeated a mantra about not using root and didn’t show any interest in reading my explanation of why root:user_r:user_t is safe for IRC.

I can’t imagine what good the #debian people expect to gain from denying people the ability to access that channel with an IRC client that reports itself to be running as root. Doing so precludes the possibility of educating them if you think that they are doing something wrong (such as running a distribution like Lindows/Linspire).

Conclusion

I routinely ssh directly to servers as root. I’ve been doing so for as long as ssh has been around and I used to telnet to systems as root before that. Logging in to a server as root without encryption is in most cases a really bad idea, but before ssh was invented it was the only option that was available.

For the vast majority of server deployments I think that there is no good reason to avoid sshing directly as root.

UBAC and SE Linux in Debian

A recent development in SE Linux policy is the concept of UBAC (User Based Access Control) which prevents SE Linux users (identities) from accessing each other’s files.

SE Linux user identities may map 1:1 to Unix users (as was required in the early versions of SE Linux), you might have unique identities for special users and a default identity for all the other users, or you might have an identity per group – or use some other method of assigning identities to groups.

The UBAC constraints in the upstream reference policy prevent a process with a SE Linux identity other than system_u from accessing any files with an identity other than system_u or its own identity. So basically any regular user can access their own files and files from the system but not files of other users, while system processes (daemons) can access files from all users. Of course this is just one layer of protection, so while the UBAC constraint doesn’t prevent a user from accessing any system files the domain-type access controls may do so.

If you used a unique SE Linux identity for each Unix account then UBAC would prevent any user from accessing a file created by another user.

For my current policy that I am considering uploading to Debian/Unstable I have allowed the identity unconfined_u to access files owned by all identities. This means that unconfined_u is an identity for administrators, if I proceed on this path then I will grant the same rights to sysadm_u.

UBAC was not enabled in Fedora last time I checked, so I’m wondering whether there is any point in including it – I don’t feel obliged to copy everything that Fedora does, but there is some benefit in maintaining compatibility across distributions.

For protecting users from each other it seems that MCS (which is Mandatory in the Debian policy) is adequate. MCS allows a much better level of access control. For example I could assign categories c0 to c10 to a set of different projects and allow the person who manages all the projects to be assigned all those categories when they login. That user could then use the command “runcon -l s0:c1 bash” to start a shell for the purpose of managing files from project 1 and any file or process created by that command would have the category c1 and be prevented from writing to a file with a different category.

Of course the down-side to removing UBAC is that since RBAC was removed there is no other way of separating SE Linux users, while MCS is good for what it does it wasn’t designed for the purpose of isolating different types of user. So I’ll really want to get RBAC reinstalled before Squeeze is released if I remove UBAC.

Regardless of this I will need to get RBAC working on Squeeze eventually anyway. I’ve had a SE Linux Play Machine running with every release of SE Linux for the last 8 years and I don’t plan to stop now.

Google Chrome and SE Linux

[Image: Google Chrome saying “aw snap” when it crashes]

[107108.433300] chrome[12262]: segfault at bbadbeef ip 0000000000fbea18 sp 00007fffcf348100 error 6 in chrome[400000+27ad000]

When I first tried running the Google Chrome web browser [1] on SE Linux it recursively displayed the error message in the above picture: it first displayed the error and then displayed another error while trying to display a web page to describe the error. The kernel message log included messages such as the one above. It seems that some pointers are initialised to the value 0xbbadbeef to make debugging easier and more amusing.

V8 error: V8 is no longer usable (v8::V8::AddMessageListener())

When I ran Chrome from the command-line it gave the above error message (which was presumably somewhere in the 8MB ~/.xsession-errors file generated from a few hours of running a KDE4 session).

type=AVC msg=audit(1274070733.648:145): avc: denied { execmem } for pid=12833 comm="chrome" scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=process
type=SYSCALL msg=audit(1274070733.648:145): arch=c000003e syscall=9 success=no exit=-131938567594024 a0=7fd863b41000 a1=40000 a2=7 a3=32 items=0 ppid=1 pid=12833 auid=4294967295 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001 egid=1001 sgid=1001 fsgid=1001 tty=pts4 ses=4294967295 comm="chrome" exe="/opt/google/chrome/chrome" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=ANOM_ABEND msg=audit(1274070733.648:146): auid=4294967295 uid=1001 gid=1001 ses=4294967295 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12833 comm="chrome" sig=11

V8 is the Google JavaScript engine which compiles JavaScript code and thus apparently needs read/write/execute access to memory [2]. In /var/log/audit/audit.log I saw the above messages (which would have been in the kernel message log as displayed by dmesg if I didn’t have auditd running). The most relevant parts are that execmem access was requested and that it was by system call 9. From linux/x86_64/syscallent.h in the strace source I discovered that system call 9 on the AMD64 architecture is sys_mmap. Does anyone know a good way to discover which system call has a given number on a particular architecture without reading strace source code?
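One possible approach, as a sketch only, is to search the kernel headers installed on the system for the __NR_ definition with the number in question. The function name syscall_name is made up, and the default header path below is an assumption that matches a Debian AMD64 system with the libc development headers installed; other distributions and architectures keep the file elsewhere. The ausyscall utility from the audit user-space tools should also be able to do this lookup if it is installed.

```shell
# syscall_name NUMBER [HEADER]
# Prints the name of the system call with the given number by searching a
# header file for a line of the form "#define __NR_name NUMBER".
# The default HEADER path is an assumption, pass the right one for your
# distribution and architecture.
syscall_name() {
  awk -v n="$1" '
    $1 == "#define" && $2 ~ /^__NR_/ && $3 == n {
      sub(/^__NR_/, "", $2)
      print $2
    }' "${2:-/usr/include/x86_64-linux-gnu/asm/unistd_64.h}"
}

# Example use:
#   syscall_name 9
```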

Attempts to strace the Google Chrome process failed: Chrome gave the error “Failed to move to new PID namespace” after clone() failed. Clone was passed the flag 0x20000000 which according to /usr/include/linux/sched.h is CLONE_NEWPID. It seems that programs which create a new PID namespace (as Google Chrome does) can’t be straced as the clone() call fails. It’s a pity that Chrome doesn’t have an option to run without using this feature, as losing the ability to strace it really decreases my ability to find and report bugs in the program. I’m sure that the Google developers want people like me to be able to help them find bugs in their code without undue effort.

Anyway the way to make Chrome run on the SE Linux Targeted configuration is to run the command “chcon -t unconfined_execmem_exec_t /opt/google/chrome/chrome”, which causes the Chrome browser to run in the domain unconfined_execmem_t which is allowed to do such things. Of course we don’t want Chrome processes to run unconfined. I think that the idea I had in 2008 for running Chrome processes in different SE Linux contexts is viable and should be implemented [3].

As a general rule if you are running a program from the command-line on SE Linux with the Targeted configuration (the default and most common configuration) then any time you see an execmem failure logged to the kernel message log or the audit subsystem then you can change the context of the program to unconfined_execmem_exec_t to make the problem go away. Note that this isn’t necessarily a good thing to do, sometimes it’s best to change the program to not require such access. But it seems that in this case the design of V8 requires write/execute memory access to pre-compile JavaScript code.
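To find which programs are hitting this, something like the following sketch could be used to list the commands that have had execmem denied. The function name execmem_denials is made up, the default log path assumes that auditd is running and writing to /var/log/audit/audit.log (reading it generally requires root), and the pattern assumes AVC records in the same form as the one quoted above.

```shell
# execmem_denials [LOGFILE]
# Prints the unique command names (the comm= field) from AVC records in
# which execmem access was denied.
execmem_denials() {
  grep 'avc: *denied *{ execmem }' "${1:-/var/log/audit/audit.log}" |
    sed -n 's/.*comm="\([^"]*\)".*/\1/p' |
    sort -u
}

# Example use:
#   execmem_denials
```

Each name it prints is a candidate for either the chcon command above or, preferably, a fix in the program so that it doesn’t need such access.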

systemd – a Replacement for init etc

The systemd project is an interesting concept for replacing init and related code [1]. There have been a few attempts to replace the old init system: upstart is getting some market share in Linux distributions and Solaris has made some interesting changes too.

But systemd is more radical and offers more benefits. While it’s nice to be able to start multiple daemons at the same time with dependencies, and doing so offers improvements to the boot times on some systems, that really doesn’t lead to optimal boot times or necessarily correct behavior.

Systemd is designed around a similar concept to the wait option in inetd where the service manager (formerly inetd and now the init that comes with systemd) binds to the TCP, UDP, and Unix sockets and then starts daemons when needed. It apparently can start the daemons as needed which means you don’t have a daemon running for months without serving a single request. It also implements some functionality similar to automount which means you can start a daemon before a filesystem that it might need has been fscked.

This means that a large part of the boot process could be performed in reverse. The current process is to run fsck on all filesystems, mount them, run back-end server processes such as database servers and then run servers that need back-end services (EG a web server using a database server). The systemd way would be for process 1 to listen on port 80 and it could then start the web server when a connection is established to port 80, start the database server when a connection is made to the Unix domain socket, and then mount the filesystem when the database server tries to access its files.

Now it wouldn’t be a good idea to start all services on demand. Fsck can take hours on some filesystems and is never quick at the best of times. Starting a major daemon such as a database server can also take some time. So a daemon that is known to be necessary for normal functionality and which takes some time to start could be started before a request comes in. Fsck is not only slow but usually has little scope for parallelisation (EG there’s no point running two instances of fsck when you only have one hard disk), so hints as to which filesystem should be checked first would need to be used.

Systemd will require more SE Linux integration than any current init system. There is ongoing debate about whether init should load the SE Linux policy, Debian has init loading the policy while Fedora and Ubuntu have the initramfs do it. Systemd will have to assign the correct SE Linux context to Unix domain socket files and listening sockets for all the daemons that support it (which means that the policy will have to be changed to allow all domains to talk to init). It will also have to manage dbus communication in an appropriate way which includes SE Linux access controls on messages. These features mean that the amount of SE Linux specific code in systemd will dwarf that in sysvinit or Upstart – which among other things means that it really wouldn’t make sense to have an initramfs load the policy.

They have a qemu image prepared to demonstrate what systemd can do. I was disappointed that they prepared the image with SE Linux disabled. All I had to do to get it working correctly was to run the command “chcon -t init_exec_t /usr/local/sbin/systemd” and then configure GRUB to not use “selinux=0” on the kernel command line.

Another idea is to have systemd start up background processes for GUI systems such as KDE and GNOME. Faster startup for KDE and GNOME is a good thing, but I really hope that no-one wants to have process 1 manage this! Having one copy of systemd run as root with PID 1 to start daemons and another copy of the same executable run as non-root with a PID other than 1 to start user background processes is the current design which makes a lot of sense. But I expect that some misguided person will try to save some memory by combining two significantly different uses for process management.

Upgrading a SE Linux system to Debian/Testing (Squeeze)

Upgrade Requirements

Debian/Squeeze (the next release of Debian) will be released some time later this year. Many people are already upgrading test servers, as well as development systems and workstations that are used to develop code that will be deployed next year. Also there are some significant new features in Squeeze that compel some people to upgrade production systems now (such as a newer version of KVM and Ext4 support).

I’ve started working on an upgrade plan for SE Linux. The first thing you want when upgrading between releases is a way of supporting booting a new kernel independently of the other parts of the upgrade: either running the old user-space with the new kernel or the new user-space with the old kernel. It’s not that uncommon for a new kernel to have a problem when under load so it’s best to be able to back out of a kernel upgrade temporarily while trying to find the cause of the problem. For workstations and laptops it’s not uncommon to have a kernel upgrade not immediately work with some old hardware. This can usually be worked around without much effort, but it’s good to be able to keep systems running while waiting for a response to a support request.

Running a Testing/Unstable kernel with Lenny Policy

deb http://www.coker.com.au lenny selinux

In Lenny the version of selinux-policy-default is 2:0.0.20080702-6. In the above APT repository I have version 2:0.0.20080702-18 which is needed if you want to run a 2.6.32 kernel. The main problem with the older policy is that the devtmpfs filesystem that is used by the kernel for /dev in the early stages of booting [1] is not known and therefore unlabeled – so most access to /dev is denied and booting fails. So before upgrading to testing or unstable it’s a really good idea to install the selinux-policy-default package from my Lenny repository and then run “selinux-policy-upgrade” to apply the new changes (by default upgrading the selinux-policy-default package doesn’t change the policy that is running – we consider the running policy to be configuration files that are not changed unless the user requests it).
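Before booting a new kernel it can be useful to know whether it provides devtmpfs at all. The following is a minimal sketch, assuming the usual /proc/filesystems layout of an optional “nodev” flag followed by the filesystem name; the function name supported_filesystems is hypothetical.

```python
def supported_filesystems(proc_filesystems_text):
    """Return the set of filesystem names listed in /proc/filesystems text.

    Each line is either "nodev<TAB>name" or "<TAB>name", so the name is
    always the last whitespace-separated field.
    """
    names = set()
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[-1])
    return names
```

On a running system the text would come from reading /proc/filesystems, and checking for "devtmpfs" in the result indicates whether the kernel can use devtmpfs for /dev and therefore needs the newer policy.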

There are also some other kernel changes which require policy changes such as a change to the way that access controls are applied to programs that trigger module load requests.

Upgrading to the Testing/Unstable Policy

While some details of the policy are not yet finalised and there are some significant bugs remaining (in terms of usability not security) the policy in Unstable is usable. There is no need to rush an upgrade of the policy, so at this stage the policy in Unstable and Testing is more for testers than for serious production use.

But when you upgrade, one thing you need to keep in mind is that we don’t support upgrading the SE Linux policy between different major versions of Debian while in multi-user mode. The minimum requirement is that after the new policy package is installed you run the following commands and then reboot afterwards:

setenforce 0
selinux-policy-upgrade
touch /.autorelabel

If achieving your security goals requires running SE Linux in enforcing mode all the time then you need to do this in single-user mode.

The changes to names of domains and labeling of files that are entry-points for domains are significant enough that it’s not practical to try and prove that all intermediate states of partial labeling are safe and that there are suitable aliases for all domains. Given that you need to reboot to install a new kernel anyway the reboot for upgrading the SE Linux policy shouldn’t be that much of an inconvenience. The relabel process on the first boot will take some time though.

Running a Lenny kernel with Testing/Unstable Policy

In the original design SE Linux didn’t check open as a separate operation, only read/write etc. The reason for this is that the goal for SE Linux was to control information flows. The open() system call doesn’t transfer any data so there was no need to restrict access to it as a separate operation (but if you couldn’t read or write a file then an attempt to open it would fail). Recent versions of the SE Linux policy have added support for controlling file open, the reason for this is to allow a program in domain A to open a file and then let a program in domain B inherit the file handle and continue using the file even if it is not normally permitted to open the file – this matches the Unix semantics where a privileged process can allow an unprivileged child to inherit file handles or use Unix domain sockets to pass file handles to another process with different privileges.
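The Unix semantics referred to here can be demonstrated without SE Linux. The sketch below shows only the descriptor-passing mechanism, not any SE Linux permission check: one end of a Unix domain socket pair opens a file and sends the descriptor, and the other end reads the file without ever calling open() itself. The function name pass_open_file is hypothetical, both ends live in one process for brevity, and socket.send_fds and socket.recv_fds need Python 3.9 or later.

```python
import os
import socket


def pass_open_file(path):
    """Open a file on one end of a socket pair and read it on the other.

    The receiving end gets a new descriptor for the same open file and
    never performs an open() of its own.
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    fd = os.open(path, os.O_RDONLY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data.
        socket.send_fds(sender, [b"fd"], [fd])
        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
        received_fd = fds[0]
        try:
            return os.read(received_fd, 4096)
        finally:
            os.close(received_fd)
    finally:
        os.close(fd)
        sender.close()
        receiver.close()
```

With a separate open permission in the policy, the receiving domain in a real system could be allowed to read such a descriptor while being denied the right to open the file directly.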

SELinux: WARNING: inside open_file_mask_to_av with unknown mode:c1b6

Unfortunately when support was added for this a bug was discovered in the kernel; this post to the SE Linux mailing list has the conclusion to a discussion about it [2]. The symptom of this problem is messages such as the above appearing in your kernel message log. I am not planning to build a kernel package for Lenny with a fix for this bug.

The command “dmesg -n 1” will prevent such messages from going to the system console – which is something you want to do if you plan to login at the console as they can occur often.

Xen and Debian/Squeeze

Ben Hutchings announced that the Debian kernel team are now building Xen flavoured kernels for Debian/Unstable [1]. Thanks to Max Attems and the rest of the kernel team for this and all their other great work! Thanks Ben for announcing it. The same release included OpenVZ, updated DRM, and the kernel mode part of Nouveau – but Xen is what interests me most.

I’ve upgraded the Xen server that I use for my SE Linux Play Machine [2] to test this out.

To get this working you first need to remove xen-tools as the Testing version of bash-completion has an undeclared conflict, see Debian bug report #550590.

Then you need to upgrade to Unstable, which requires upgrading the kernel first as udev won’t upgrade without it.

If you have an existing system you need to install xen-hypervisor-3.4-i386 and purge xen-hypervisor-3.2-1-i386 as the older Xen hypervisor won’t boot the newer kernel. This also requires installing xen-utils-3.4 and removing xen-utils-3.2-1 as the utilities have to match the kernel. You don’t strictly need to remove the old hypervisor and utils packages as it should be possible to have dual-boot configured with old and new versions of Xen and matching Linux kernels. But this would be painful to manage as update-grub doesn’t know how to match Xen and Linux kernel versions so you will get Grub entries that are not bootable – it’s best to just do a clean break and keep a non-Xen version of the older kernel installed in case it doesn’t initially boot.

An apt-get dist-upgrade operation will result in installing the grub-pc package. The update-grub2 command doesn’t generate Xen entries. I’ve filed Debian bug report #574666 about this.

Because the Linux kernel doesn’t cope well with having its memory reduced to low values after boot I use “xenhopt=dom0_mem=142000” in my GRUB 0.97 configuration so that the kernel doesn’t allocate as much RAM to its internal data structures. In the past I’ve encountered a kernel memory management bug related to significantly reducing the size of the Dom0 memory after boot [3].

Before I upgraded I had the dom0_mem size set to 122880 but when running Testing that seems to get me a kernel Out Of Memory condition from udev in the early stages of boot which prevents LVM volumes from being scanned and therefore prevents swap from being enabled so the system doesn’t work correctly (if at all). I had this problem with 138000K of RAM so I chose 142000 as a safe number. Now I admit that the system would probably boot with less RAM if I disabled SE Linux, but the SE Linux policy size of the configuration I’m using in the Dom0 has dropped from 692K to 619K so it seems likely that the increase in required memory is not caused by SE Linux.
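The figures in this section are easier to compare after a little arithmetic. This is a small sketch with hypothetical helper names, and it assumes that dom0_mem values given without a suffix are in KiB.

```python
def kib_to_mib(kib):
    """Convert a size in KiB to MiB."""
    return kib / 1024


def percent_change(old, new):
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100


old_dom0 = kib_to_mib(122880)            # setting used before the upgrade
new_dom0 = kib_to_mib(142000)            # setting chosen after the upgrade
policy_delta = percent_change(692, 619)  # policy size change, both in K
```

Under that assumption the old setting is exactly 120 MiB and the new one is a little under 139 MiB, while the policy shrank by roughly a tenth over the same upgrade.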

The Xen Dom0 support on i386 in Debian/Unstable seems to work quite well. I wouldn’t recommend it for any serious use, but for something that’s inherently designed for testing (such as a SE Linux Play Machine) then it works well. My Play Machine has been offline for the last few days while I’ve been working on it. It didn’t take much time to get Xen working, it took a bit of time to get the SE Linux policy for Unstable working well enough to run Xen utilities in enforcing mode, and it took three days because I had to take time off to work on other projects.

Play Machine Online Again

I have returned from the US and my SE Linux Play Machine [1] is online again.

It was unfortunate that I forgot to pack one of my Play Machine shirts. I ended up attending a meeting of the SDForum [2] on the topic of Cloud Security (it was a joint meeting of the Cloud Services and Security SIGs) and it would have been good to have been wearing a root password.

Play Machine Offline for 2 Weeks

I’m about to leave for San Francisco, so my SE Linux Play Machine is turned off and will remain off until after I return.