Xen and Linux Memory Assignment Bugs

The Linux kernel has a number of code sections which look at the apparent size of the machine and determine the best size for buffers. For physical hardware this makes sense, as the hardware doesn’t change at runtime. There are many situations where performance can be improved by using more memory for buffers, and automatically enabling large buffers when the machine has a lot of memory is convenient for the sysadmin.

Virtual machines change things as the memory available to the kernel may change at run-time. For Xen the most common case is the Dom0 automatically shrinking when memory is taken by a DomU – but it also supports removing memory from a DomU via the xm mem-set command (the use of xm mem-set seems very rare).
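
For reference, the run-time changes are done with xm mem-set, which takes a domain name and a new size in megabytes – for example (the DomU name “mail” here is just a placeholder):

xm mem-set Domain-0 512
xm mem-set mail 256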

Now a server that is purchased for the purpose of running Xen will have a moderate amount of RAM. In recent times the smallest machine I’ve seen purchased for running Xen had 4G of RAM – and it has spare DIMM slots for another 4G if necessary. While a non-virtual server with 8G of RAM would be an unusually powerful machine dedicated to some demanding application, a Xen server with 8G or 16G of RAM is not excessively big, it’s merely got space for more DomUs. For example one of my Xen servers has 8 CPU cores, 8G of RAM, and 14 DomUs. Each DomU has on average just over half a gig of RAM and half of a CPU core – not particularly big.

In a default configuration the Dom0 will start by using all the RAM in the machine, which in this case meant that the buffer sizes were appropriate for a machine with 8G of RAM. Then as DomUs are started memory is removed from the Dom0 and these buffers become a problem. This ended up forcing a reboot of the machine, as it prevented Xen virtual network access to most of the DomUs. I was seeing many messages in the Dom0 kernel message log such as “xen_net: Memory squeeze in netback driver” and most DomUs were inaccessible from the Internet (I didn’t verify that all DomUs were partially or fully unavailable or test the back-end network as I was in a hurry to shut it down and reboot before too many customers complained).

The solution to this is to have the Dom0 start by using a small amount of RAM. To do this I edited the GRUB configuration file and put “dom0_mem=256000” at the end of the Xen kernel line (that is the line starting with “kernel /xen.gz”). This gives the Dom0 kernel just under 256M of RAM from when it is first loaded and prevents allocation of bad buffer sizes. It’s the only solution to this network problem that a quick Google search (the kind you do when trying to fix a serious outage before your client notices (*)) could find.
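
For example the relevant stanza in /boot/grub/menu.lst ends up looking something like this (the kernel version and root device are placeholders – the only important part is the dom0_mem option on the xen.gz line):

title           Xen / Debian GNU/Linux
root            (hd0,0)
kernel          /xen.gz dom0_mem=256000
module          /vmlinuz-2.6.18-6-xen-686 root=/dev/sda1 ro console=tty0
module          /initrd.img-2.6.18-6-xen-686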

One thing to note is that my belief that kernel buffer sizes are the root cause of this problem is based on my knowledge of how some of the buffers are allocated plus an observation of the symptoms. I don’t have a test machine with anything near 8G of RAM so I really can’t do anything more to track this down.

There is another benefit to limiting the Dom0 memory: I have found that on smaller machines it’s impossible to reduce the Dom0 memory below a certain limit at run-time. In the past I’ve had problems reducing the memory of a Dom0 below about 250M. While such a reduction is hardly desirable on a machine with 8G of RAM, when running an old P3 machine with 512M of RAM there are serious benefits to making the Dom0 smaller than that. As a general rule I recommend setting a limit on the memory of the Dom0 on all Xen servers. If you use the model of having no services running on the Dom0 there is no benefit in having much RAM assigned to it.

(*) Hiding problems from a client is a bad idea and is not something I recommend. But being able to fix a problem and then tell the client that it’s already fixed is much better than having them call you when you don’t know how long the fix will take.

Play Machine Downtime

From the 13th to the 14th of August my Play Machine [1] was offline. There was a power failure for a few seconds and the machine didn’t boot correctly. As I had a lot of work to do I left it offline for a day before fixing it. The reason it didn’t boot was that, due to an issue with the GRUB package, it was trying to boot a non-Xen kernel with Xen; this caused the Xen Dom0 load to abort, and the machine would then reboot after 5 seconds and automatically repeat the process. The problem is that update-grub in Lenny will generate boot entries for Xen kernels to boot without Xen and for non-Xen kernels to boot with Xen.

Two days ago someone launched a DoS attack on my Play Machine and I’ve only just put it back online. I’ve changed the ulimit settings a bit; that won’t make DoS attacks impossible, but it will force the attacker to use a little more effort.
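
Settings of this sort go in /etc/security/limits.conf. The general idea (these values are just an illustration, not the exact settings used) is to cap the number of processes, open files, and CPU minutes per user, with entries such as:

*       hard    nproc           100
*       hard    nofile          1024
*       hard    cpu             30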

AppArmor is Dead

For some time there have been two mainstream Mandatory Access Control (MAC) [1] systems for Linux: SE Linux [2] and AppArmor [3].

In late 2007 Novell laid off almost all the developers of AppArmor [4] with the aim of having the community do all the coding. Crispin Cowan (the founder and leader of the AppArmor project) was later hired by Microsoft, which probably killed the chances for ongoing community development [5]. Crispin has an MSDN blog, but with only one post so far (describing UAC) [6], hopefully he will start blogging more prolifically in future.

Now SUSE is including SE Linux support in OpenSUSE 11.1 [7]. They say that they will not ship policies or SE Linux specific tools such as “checkpolicy”, but that these will instead be available from “repositories”. Maybe this is some strange SUSE thing, but for most Linux users when something is in a “repository” then it’s shipped as part of the distribution. The SUSE announcement also included the line “This is particularly important for organizations that have already standardized on SELinux, but could not even test-drive SUSE Linux Enterprise before without major work and changes”. The next step will be to make SE Linux the default and AppArmor the one that exists in a repository, and the step after that will be to remove AppArmor.

In a way it’s a pity that AppArmor is going away so quickly. The lack of competition is not good for the market, and homogeneity isn’t good for security. But OTOH this means more resources will be available for SE Linux development, which will be a good thing.

Update: I’ve written some more about this topic in a later post [8].

Switches and Cables

I’ve just read an amusing series of blog posts about bad wiring [1]. I’ve seen my share of wiring horror in the past. There are some easy ways of minimising wiring problems which never seem to get implemented.

The first thing to do is to have switches near computers. Having 48 port switches in a server room and wires going across the building causes mess and is difficult to manage. A desktop machine doesn’t need a dedicated Gig-E (or even 100baseT) connection to the network backbone. Cheap desktop switches installed on desks allow one cable to go to each group of desks (or two cables if you have separate networks for VOIP and data). If you have a large office area then a fast switch in the corner of the room connecting to desktop switches on the desks is a good way to reduce the cabling requirements. The only potential down-side is that some switches are noisy; the ones with big fans can be easily eliminated by a casual examination, but the ones that make whistling sounds from the PSU need to be tested first. The staff at your local electronics store should be very happy to open one item for inspection and plug it in if you are about to purchase a moderate number (they will usually do so even if you are buying a single item).

A common objection to this is the perceived lack of reliability of desktop switches. One mitigating factor is that if a spare switch is available the people who work in the area can replace a broken switch themselves. Another is my observation that misconfiguration of big expensive switches causes significantly more down-time than hardware failures on cheap switches ever could. A cheap switch that needs to be power-cycled once a month will cause little interruption to work, while a big expensive switch (which can only be configured by the “network experts” – not regular sysadmins such as me) can easily cause an hour of down-time for most of an office during peak hours. Finally the reliability of the cables themselves is also an issue; having two cables running to the local switch in every office allows an easy replacement to fix a problem – it can be done without involving the IT department (who just make sure that both cables are connected to the switch in the server room). If there is exactly one cable running to each PC from the server room and one of the cables fails then someone’s PC will be offline for a while.

In server rooms the typical size of a rack is 42RU (42 Rack Units). If using 1RU servers that means up to 42 Ethernet cables. A single switch can handle 48 Ethernet ports in a 1RU mount (for the more dense switches), others have 24 ports or fewer. So a single rack can handle 41 small servers and a switch with 48 ports (two ports to go to the upstream switch and five spare ports). If using 2RU servers a single rack could handle 20 servers and a 24 port switch that has two connections to the upstream switch and two spare ports. Also it’s generally desirable to have at least two Ethernet connections to each server (public addresses and private addresses for connecting to databases and management). For 1RU servers you could have two 48 port switches and 40 servers in a rack. For 2RU servers you could have 20 servers and either two 24 port switches or one 48 port switch that supports VLANs (I prefer two switches – it’s more difficult to mess things up when there are two switches, if one switch fails you can login via the other switch to probe it, and it’s also cheaper). If the majority of Ethernet cables are terminated in the same rack it’s much harder for things to get messed up. Also it’s very important to leave some spare switch ports available, as it’s a common occurrence for people to bring laptops into a server room to diagnose problems and you really don’t want them to unplug server A to diagnose a problem with server B…

Switches should go in the middle of the rack. While it may look nicer to have the switch at the top or the bottom, that means that the server which is above or below it will have the cables for all the other servers going past it. Ideally the cables would go in neat cable runs at the side of the rack, but in my experience they usually end up just dangling in front. If the patch cables are reasonably short and they only dangle across half the servers things won’t get too ugly (this is harm minimisation in server room design).

The low end of network requirements is usually the home office. My approach to network design for my home office is quite different: I have no switches! I bought a bunch of dual-port Ethernet cards and now every machine that I own has at least two Ethernet ports (and some have as many as four). My main router and gateway has four ports, which allows connections from all parts of my house. Then every desktop machine has at least two ports so that I can connect a laptop in any part of the house. This avoids the energy use of switches (I previously used a 24 port switch that drew 45W [2]); switches of course also make some noise and are an extra point of failure. While switches are more reliable than PCs, as I have to fix any PC that breaks anyway my overall network reliability is increased by not using switches.

For connecting the machines in my home I mostly use bridging (only the Internet gateway acts as a router). I have STP enabled on all machines that have any risk of having their ports cross connected, but disable it on some desktop machines with two ports so that I can plug my EeePC in and quickly start work for small tasks without waiting for the STP forwarding delay.
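
For reference, the bridge setup on such a machine is only a few commands (a sketch using brctl – the interface names and address are examples):

brctl addbr br0                 # create the bridge
brctl addif br0 eth0            # add both physical ports
brctl addif br0 eth1
brctl stp br0 on                # enable STP where cross-connection is a risk
ifconfig eth0 0.0.0.0 up
ifconfig eth1 0.0.0.0 up
ifconfig br0 10.0.0.2 netmask 255.255.255.0 up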

Purging an RT Database

I had a problem where the email address circsales@washpost.com spammed a Request Tracker (RT) [1] installation (one of the rules for running a vacation program is that you never respond twice to the same address; another rule is that you never respond to automatically generated messages).

Deleting these tickets was not easy, as the RT web interface only supports deleting 50 tickets at a time.

To delete them I first had to find the account ID in RT, the following query does that:
select id from Users where EmailAddress='circsales@washpost.com';

Then to mark the tickets as deleted I ran the following SQL command (where X was the ID):
update Tickets set Status='deleted' where Creator=X;
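
The two steps can also be combined into a single statement if you prefer (the same thing written with a subquery):
update Tickets set Status='deleted' where Creator=(select id from Users where EmailAddress='circsales@washpost.com');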

Finally, to purge the deleted entries from the database (which was growing overly large) I used the RTx-Shredder [2] tool. RTx-Shredder doesn’t seem to support deleting tickets based on submitter, which is why I had to mark them as deleted first.

I am currently using the following command to purge the tickets. The “limit,500” directive tells rtx-shredder to remove 500 tickets at one time (the default is to only remove 10 tickets).
./sbin/rtx-shredder --force --plugin 'Tickets=status,deleted;limit,500'

There are currently over 34,000 deleted tickets to remove, and rtx-shredder is proceeding at a rate of 9 tickets per minute, so it seems that it will take almost three days of database activity to clear the tickets out.

I also need to purge some tickets that have been resolved for a long time; I’m running the following command to remove them:
./sbin/rtx-shredder --force --plugin 'Tickets=status,resolved;updated_before,2008-03-01 01:01:34;limit,500'

With both the rtx-shredder commands running at once I’m getting a rate of 15 tickets per minute, so it seems that the bottleneck is more related to rtx-shredder than MySQL (which is what I expected). Although with two copies running at once I have mysqld listed as taking about 190% of CPU (two CPUs running at capacity). The machine in question has two P4 CPUs with hyper-threading enabled, so maybe running two copies of rtx-shredder causes mysqld to become CPU bottlenecked. I’m not sure how to match up CPU use as reported via top to actual CPU power in a system with hyper-threading (the hyper-threaded virtual CPUs do not double the CPU power). I wonder if this means that the indexes on the RT tables are inadequate to the task.

I tried adding the following indexes (as suggested in the rtx-shredder documentation), but it didn’t seem to do any good – it might have improved performance by 10% but that could be due to sampling error.

CREATE INDEX SHREDDER_CGM1 ON CachedGroupMembers(MemberId, GroupId, Disabled);
CREATE INDEX SHREDDER_CGM2 ON CachedGroupMembers(ImmediateParentId, MemberId);
CREATE UNIQUE INDEX SHREDDER_GM1 ON GroupMembers(MemberId, GroupId);
CREATE INDEX SHREDDER_TXN1 ON Transactions(ReferenceType, OldReference);
CREATE INDEX SHREDDER_TXN2 ON Transactions(ReferenceType, NewReference);
CREATE INDEX SHREDDER_TXN3 ON Transactions(Type, OldValue);
CREATE INDEX SHREDDER_TXN4 ON Transactions(Type, NewValue);

DNS Secondaries and Web Security

At the moment there are ongoing security issues related to web based services and DNS hijacking. The Daily Ack has a good summary of the session hijacking issue [1].

For a long time it has been generally accepted that you should configure a DNS server to not allow random machines on the Internet to copy the entire zone. Not that you should have any secret data there anyway, but it’s regarded as just a precautionary layer of security by obscurity.

Dan Kaminsky (who brought the current DNS security issue to everyone’s attention) has described some potential ways to alleviate the problem [2]. One idea is to use random case in DNS requests (which are case insensitive but case preserving), so if you were to look up wWw.cOkEr.CoM.aU and the result was returned with different case then you would know that it was forged.

Two options which have been widely rejected are using TCP for DNS (which is fully supported for the case where an answer cannot fit in a single UDP packet) and sending requests twice (to square the number of combinations that would need to be guessed). They have been rejected due to the excessive load on the servers (which are apparently already near capacity).

One option that does not seem to get mentioned is the possibility of using multiple source IP addresses, so instead of merely having 2^16 ports to choose from you could multiply that by as many IP addresses as you have available. In the past I’ve worked for ISPs that could have dedicated a /22 (1024 IP addresses) to their DNS proxy if it would have increased the security of their customers – an ISP of the scale that has 1024 spare IP addresses available is going to be a major target of such attacks! Also with some fancy firewall/router devices it would be possible to direct all port 53 traffic through the DNS proxies. That would mean that an ISP with 200,000 broadband customers online could use a random IP address from that pool of 200,000 IP addresses for every DNS request. While attacking a random port choice out of 65500 ports is possible, if it was 65500 ports over a pool of 200,000 IP addresses it would be extremely difficult (I won’t claim it to be impossible).
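
To put a rough number on it: 65,500 ports multiplied by 200,000 addresses gives about 1.3 * 10^10 port/address combinations for an attacker to guess, compared to about 6.5 * 10^4 combinations with a single source address.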

One problem with the consideration that has been given to TCP is that it doesn’t account for the other uses of TCP, such as for running DNS secondaries.

In Australia we have two major ISPs (Telstra and Optus) and four major banks (ANZ, Commonwealth, NAB, and Westpac). It shouldn’t be difficult for arrangements to be made for the major ISPs to have their recursive DNS servers (the caching servers that their customers talk to) act as slaves for the DNS zones related to those four banks (which might be 12 zones or more given the use of different zones for stock-broking etc). If that was combined with a firewall preventing the regular ISP customers (the ones who are denied access to port 25 to reduce the amount of spam) from receiving any data from the Internet with a source port of 53 then the potential for attacks on Australian banks would be dramatically decreased. I note that the Westpac bank has DNS secondaries run by both Optus and Telstra (which makes sense for availability reasons if nothing else), so it seems that the Telstra and Optus ISP services could protect their customers who use Westpac without any great involvement from the bank.
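
For the ISPs this would just be ordinary secondary DNS configuration. A named.conf entry on the recursive servers would look something like the following (the master address is a placeholder for whatever the bank advertises):

zone "westpac.com.au" {
        type slave;
        masters { 192.0.2.1; };
        file "slaves/westpac.com.au";
};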

Banks have lots of phone lines and CTI systems. It would be easy for each bank to have a dedicated phone number (which is advertised in the printed phone books, in the telephone “directory assistance” service, and in brochures available in bank branches – all sources which are more difficult to fake than Internet services) which gave a recorded message of a list of DNS zone names and the IP addresses for the master data. Then every sysadmin of every ISP could mirror the zones that would be of most use to their customers.

Another thing that banks could do would be to create a mailing list for changes to their DNS servers for the benefit of the sysadmins who want to protect their customers. Signing mail to such a list with a GPG key and having the fingerprint available from branches should not be difficult to arrange.

Another possibility would be to use the ATM network to provide security relevant data. Modern ATMs have reasonably powerful computers which are used to display bank adverts when no-one is using them. Having an option to press a button on the ATM to get a screen full of Internet banking security details of use to a sysadmin should be easy to implement.

For full coverage (including all the small building societies and credit unions) it would be impractical for every sysadmin to have a special case for every bank. But again there is a relatively easy solution. A federal agency that deals with fraud could maintain a list of zone names and master IP addresses for every financial institution in the country and make it available on CD. If the CD was available for collection from a police station, court-house, the registry of births, deaths, and marriages, or some other official government office then it should not have any additional security risks. Of course you wouldn’t want to post such CDs, even with public key signing (which many people don’t check properly) there would be too much risk of things going wrong.

In a country such as the US (which has an unreasonably large number of banks) it would not be practical to make direct deals between ISPs and banks. But it should be practical to implement a system based on a federal agency distributing CDs with configuration files for BIND and any other DNS servers that are widely used (is any other DNS server widely used?).

Of course none of this would do anything about the issue of Phishing email and typo domain name registration. But it would be good to solve as much as we can.

Ownership of the Local SE Linux Policy

A large part of the disagreement about the way to manage the policy seems to be based on who will be the primary “owner” of the policy on the machine. This isn’t a problem that only applies to SE Linux; the same issue applies to various types of configuration files and scripts throughout the process of distribution development. Having a range of modules which can be considered configuration data, all coming from a single source, seems to make SE Linux policy unique among packages. The reasons for packaging all Apache modules in the main package seem a lot clearer.

One idea that keeps cropping up is that as the policy is modular it should be included in daemon packages and maintained by the person who maintains the distribution package of the daemon. The reason for this request usually seems to be the idea that the person who packages a daemon for a distribution knows more about how it works than anyone else; I believe that this is false in most cases. When I started working on SE Linux I had a reasonable amount of experience in maintaining Debian packages of daemons and server processes, but I had to learn a lot about how things REALLY work to be able to write good policy. Also if we were to have policy modules included in the daemon packages, then those packages would need to be updated whenever there were serious changes to the SE Linux policy. For example Debian/Unstable flip-flopped on MCS support recently; changing the policy packages to re-enable MCS was enough pain, and getting 50 daemon packages updated would have been unreasonably painful. Then of course there is the case where two daemons need to communicate: if the interface provided with one policy module has to be updated before another module can be updated, and they are in separate packages, then synchronised updates to two separate packages might be required for a single change to the upstream policy. I believe that the idea of having policy modules owned by the maintainers of the various daemon packages is not viable. I also believe that most people who package daemons would violently oppose the idea of having to package SE Linux policy if they realised what would be required of them.

Caleb Case seems to believe that ownership of policy can either be based on the distribution developer or the local sysadmin, with apparently little middle-ground [1]. In the section titled “The Evils of Single Policy Packages” he suggests that if an application is upgraded for a security fix, and that upgrade requires a new policy, then it requires a new policy for the entire system if all the policy is in the same package. However the way things currently work is that upgrading a Debian SE Linux policy package does not automatically replace the active modules. The new versions are stored under /usr/share/selinux/default but the active modules are under /etc/selinux/default/modules/active. An example of just such an upgrade is the Debian Security Advisory DSA-1617-1 for the SE Linux policy for Etch to address the recent BIND issue [2]. In summary the new version of BIND didn’t work well with the SE Linux policy, so an update was released to fix it. When the updated SE Linux policy package is installed it will upgrade the bind.pp module if the previous version of the package was known to have the version of bind.pp that didn’t allow named to bind() to most UDP ports – the other policy modules are not touched. I think that this is great evidence that the way things currently work in Debian works well. For the hypothetical case where a user had made local modifications to the bind.pp policy module, they could simply put the policy package on hold – I think it’s safe to assume that anyone who cares about security will read the changelogs for all updates to released versions of Debian, so they would realise the need to do this.

Part of Caleb’s argument rests on the supposed need for end users to modify policy packages (i.e. to build their own packages from modified source). I run many SE Linux machines, and since the release of the “modular” policy (which first appeared in Fedora Core 5, Debian/Etch, and Red Hat Enterprise Linux 5) I have never needed to make such a modification. I modify policy regularly for the benefit of Debian users and have a number of test machines to try it out. But for the machines where I am a sysadmin I just create a local module that permits the access that is needed. The only reason why someone would need to modify an existing module is to remove privileges or to change automatic domain transition rules. Changing automatic domain transitions is a serious change to the policy which is not something that a typical user would want to do – if they were to do such things then they would probably grab the policy source and rebuild all the policy packages. Removing privileges is not something that a typical sysadmin desires; the reference policy is reasonably strict and users generally don’t look for ways to tighten up the policy. In almost all cases it seems best to consider that the policy modules which are shipped by the distribution are owned by the distribution, not the sysadmin. The sysadmin will decide which policy modules to load, what roles and levels to assign to users with the semanage tool, and what local additions to add to the policy. For the CentOS systems I run I use the Red Hat policy; I don’t believe that there is a benefit for me in changing the policy that Red Hat ships, and I think that for people who have less knowledge of SE Linux policy than me there are more reasons not to change such policy and fewer reasons to do so.
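
Creating such a local module only takes a few minutes. As an illustration (the domain and type here are just an example of granting one extra access, not something you would necessarily want), a file local.te such as:

module local 1.0;

require {
        type postfix_smtpd_t;
        type var_log_t;
        class file { getattr append };
}

allow postfix_smtpd_t var_log_t:file { getattr append };

is compiled and loaded with:

checkmodule -M -m -o local.mod local.te
semodule_package -o local.pp -m local.mod
semodule -i local.pp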

Finally Caleb provides a suggestion for managing policy modules by having sym-links to the modules that you desire. Of course there is nothing preventing the existence of a postfix.pp file on the system provided by a package while there is a local postfix.pp file which is the target of the sym-link (so the sym-link idea does not support the idea of having multiple policy packages). With the way that policy modules can be loaded from any location, the only need for sym-links is if you want to have an automatic upgrade script that can be overridden for some modules. I have no objection to adding such a feature to the Debian policy packages if someone sends me a patch.

Caleb also failed to discuss how policy would be initially loaded if it was packaged on a per-module basis. If for example I had a package selinux-policy-default-postfix which contains the file postfix.pp, how would this package get installed? I am not aware of the Debian package dependencies (or those of any other distribution) being able to represent that the postfix package depends on selinux-policy-default-postfix if and only if the selinux-policy-default package is installed. Please note that I am not suggesting that we add support for such things; a package management system that can solve Sudoku based on package dependency rules is not something that I think would be useful or worth having. As I noted in my previous post about how to package SE Linux policy for distributions [3] the current Debian policy packages have code in the postinst (which I believe originated with Erich Schubert) to load policy modules that match the Debian packages on the system. This means that initially setting up the policy merely requires installing the selinux-policy-default package and rebooting. I am inclined to reject any proposed change which makes the initial install of the policy more difficult than this.

After Debian/Lenny is released I plan to make some changes to the policy. One thing that I want to do is to have a Debconf option to allow users to choose to automatically upgrade their running policy whenever they upgrade the Debian policy package; this would probably only apply to changes within one release (i.e. it wouldn’t cause an automatic upgrade from Lenny+1 policy to Lenny+2). Another thing I would like to do is to have the policy modules which are currently copied to /etc/selinux/default/modules/active instead be hard linked when the source is a system directory. That would save about 12M of disk space on some of my systems.

I’ve taken the unusual step of writing two blog posts in response to Caleb’s post not because I want to criticise him (he has done a lot of good work), but because he is important in the SE Linux community and his post deserves the two hours I have spent writing responses to it. While writing these posts I have noticed a number of issues that can be improved; I invite suggestions from Caleb and others on how to make such improvements.

SE Linux Policy Packaging for a Distribution

Caleb Case (Ubuntu contributor and Tresys employee) has written about the benefits of using separate packages for SE Linux policy modules [1].

Firstly I think it’s useful to consider some other large packages that could be split into multiple packages. The first example that springs to mind is coreutils, which used to be textutils, shellutils, and fileutils. Each of those packages contained many programs and could conceivably have been split. Some of the utilities in that package have been superseded for most uses; for example hardly anyone uses the cksum utility, as md5sum and sha1sum (which are in the same package) are generally used instead. Also the pinky command probably isn’t even known by most users, who use finger instead (apart from newer Unix users who don’t even know what finger is). So in spite of the potential benefit of splitting the package (or maintaining the previous split) it was decided that it would be easier for everyone to have a single package. The merge of the three packages was performed upstream, but there was nothing preventing the Debian package maintainer from splitting the package – apart from the inconvenience to everyone. The coreutils package in Etch takes 10M of disk space when installed; as it’s almost impossible to buy a new hard drive smaller than 80G, that doesn’t seem to be a problem for most users.

The second example is the X server which has separate packages for each video card. One thing to keep in mind about the X server is that the video drivers don’t change often. While it is quite possible to remove a hard drive from one machine and install it in another, or duplicate a hard drive to save the effort of a re-install (I have done both many times) they are not common operations in the life of a system. Of course when you do require such an update you need to first install the correct package (out of about 60 choices), which can be a challenge. I suspect that most Debian systems have all the video driver packages installed (along with drivers for wacom tablets and other hardware devices that might be used) as that appears to be the default. So it seems likely that a significant portion of the users have all the packages installed and therefore get no benefit from the split package.

Now let’s consider the disk space use of the selinux-policy-default package – it’s 24M when installed. Of that, 4.9M is in the base.pp file (the core part of the policy which is required), then there’s 848K for the X server (which is going to be loaded on all Debian systems that have X clients installed – due to an issue with /tmp/.ICE-unix labelling [2]). Then there’s 784K for the Postfix policy (which is larger than it needs to be – I’ve been planning to fix this for the past four years or so) and 696K for the SSH policy (used by almost everyone). The next largest is 592K for the Unconfined policy; the number of people who choose not to use this will be small, and as it’s enabled by default it seems impractical to provide a way of removing it.

One possibility for splitting the policy is to create a separate package of modules used for the less common daemons and services, if modules for INN, Cyrus, distcc, ipsec, kerberos, ktalk, nis, PCMCIA, pcscd, RADIUS, rshd, SASL, and UUCP were in a separate package then that would reduce the installed size of the main package by 1.9M while providing no change in functionality to the majority of users.

One thing to keep in mind is that each package at a minimum will have a changelog and a copyright file (residing in a separate directory under /usr/share/doc) and three files as part of the dpkg data store, each of which takes up at least one allocation unit on disk (usually 4K). So adding one extra package will add at least 24K of disk space to every system that installs it (or 32K if the package has postinst and postrm scripts). This is actually a highly optimal case; the current policy packages (selinux-policy-default and selinux-policy-mls) each take 72K of disk space for their doc directory.

One of my SE Linux server systems (randomly selected) has 23 policy modules installed. If they were in separate packages there would be a minimum of 552K of disk space used by packaging, 736K if there were postinst and postrm scripts, and as much as 2M if the doc directory for each package was similar to the current doc directories. As the system in question needs 5796K of policy modules, the 2M of overhead would make it approach 8M of disk space, so it would only be a saving of 16M over the current situation. While saving that amount of disk space is a good thing, I think that when balanced against the usability issues it’s not worthwhile.

Currently the SE Linux policy packages will determine what applications are installed and automatically load policy modules to match. I don’t believe that it’s possible to have a package postinst script install other packages (and if it is possible I don’t think it’s desirable). Therefore having separate packages would make a significant difference to the ease of use; it seems that the best way to manage it would be to have the core policy package include a script to install the other packages.
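
The current automatic loading is roughly equivalent to the following sketch (simplified – the real postinst maps module names to package names rather than assuming they are identical):

for pp in /usr/share/selinux/default/*.pp; do
    mod=$(basename "$pp" .pp)
    # only load modules whose corresponding Debian package is installed
    if dpkg-query -W -f='${Status}' "$mod" 2>/dev/null | grep -q "ok installed"; then
        semodule -i "$pp"
    fi
done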

Finally there’s the issue of when you recognise the need for a policy module. It’s not uncommon for me to do some work for a client while on a train, bus, or plane journey. I will grab the packages needed to simulate a configuration that the client desires and then work out how to get it going correctly while on the journey. While it would not be a problem for me (I always have the SE Linux policy source and all packages on hand) I expect that many people who have similar needs might find themselves a long way from net access without the policy package that they need to do their work. Sure, such people could do their work in permissive mode, but that would encourage them to deploy in permissive mode too and thus defeat the goals of the SE Linux project (in terms of having widespread adoption).

My next post on this topic will cover the issue of custom policy.

Updated to note that Caleb is a contributor to Ubuntu not a developer.

Australian Business and IT Expo

I’ve just visited the Australian Business and IT Expo (ABITE) [1]. I haven’t been to such an event for a while, but Peter Baker sent a link for a free ticket to the LUV mailing list and I was a bit bored so I attended.

The event was a poor shadow of previous events that I had attended. The exhibition space was shared with an event promoting recreational activities for retirees and an event promoting wine and gourmet food. I’m not sure why the three events were in the same room; maybe they figured that IT people and senior citizens both like gourmet food and wine.

The amount of space used for the computer stands was small, and there was no great crowd of delegates – when they can’t get a good crowd on a Saturday afternoon it’s a bad sign for the show.

I have previously blogged about the idea of putting advertising on people’s butts [2]. One company had two women working on its stand with the company’s name on the back of their shorts.

A representative of a company in the business of Internet advertising asked me how many hits I get on my blog. I told him 2,000 unique visitors a month (according to Webalizer), which seemed to impress him. Actually it’s about 2,000 unique visitors a day. I should rsync my Webalizer stats to my EeePC so I can give detailed answers to such questions.

The IT event seemed mostly aimed at managers. There were some interesting products on display, one of which was a device from TabletPC.com.au which had quite good handwriting recognition (but the vocabulary seemed limited as it couldn’t recognise a swear-word I used as a test).

Generally the event was fun (including the wine and cheese tasting) and I don’t regret going. If I had paid $10 for a ticket I probably would have been less happy with it.

Updated to fix the spelling of “wine”. Not a “wind tasting”.

Starting to Blog

The best way to run a blog is to run your own blog server. This can mean running an instance on someone else’s web server (some ISPs have special hosting deals for bloggers on popular platforms such as WordPress), but usually means having shell access to your own server (I’ve previously written about my search for good cheap Xen hosting [1]).

There are platforms that allow you to host your own blog without any technical effort. Three popular ones are WordPress.com, LiveJournal.com, and Blogger.com. But they give you less control over your own data, particularly if you don’t use your own DNS name (Blogger allows you to use their service with your own DNS name).

Currently it seems to me that WordPress is the best blog software by many metrics. It has a good feature set, a plugin interface with lots of modules available, and the code is free. The down-side is that it’s written in PHP and has the security issues that tend to be associated with large PHP applications.

Here is a good summary of the features of various blog server software [2]. One that interests me is Blojsom – a blog server written in Java [3]. The Java language was designed in a way that leads to less risk of security problems than most programming languages, and as it seems unlikely that anyone will write a blog server in Ada, Java seems to be the best option for such things. I am not planning to switch, but if I was starting from scratch I would seriously consider Blojsom.

But for your first effort at blogging it might be best to start with one of the free hosted options. You can always change later on and import the old posts into your new blog. If you end up not blogging seriously then using one of the free hosted services saves you the effort of ongoing maintenance.