
Software Development is a Team Sport

Albert writes about software development and the amount of teamwork it involves [1]. He makes an interesting clash of analogies by suggesting that it's not a "team sport" because "its not like commercial fishing where many hands are used to pull in the net at the same time".

I think that software development for any non-trivial project is a team sport. You don’t have the same level of direct coordination as required for pulling in a net (or the rugby scrum [2] to use a sporting analogy), but that doesn’t stop it being a team effort.

Some parts of team development projects are like a relay event. In corporate environments the work is done in parallel simply because everyone works the same hours, but in free software projects the work is often serialised. I think that it's often more effective to serialise some work: if someone is actively working on one section of code it may save time to avoid working in that area until they are finished. There is little benefit in writing code to old interfaces.

Some parts of team projects have specialised skill areas (EG debugging, skills in particular programming languages, and graphical design). Soccer is one sport where different rules apply to different players (the goal keeper can use their hands). In ice-hockey the protective clothing used by the goal keeper is considerably different from that used by other players. In most team sports where the aim is to put a ball through a goal at one end (EG basketball and all versions of football) there seems to be some degree of specialisation, some players are dedicated to scoring goals while others are dedicated to defense. The fielding team in cricket has every player assigned to a different part of the field – with slight differences in the skills required.

Then there is the issue of large projects such as Linux distributions. It seems to me that a Linux distribution will always comprise multiple teams as well as some individual projects. Maybe we could consider Linux distributions (and distributions of the other free OSs) to be similar to countries that compete in the Olympics. The culture of the Free Software (or Open Source if you prefer) community can be compared to the Olympic Spirit. Of course the Olympic idea that people should come together in peace for the Olympic Games and that it's about honor not money is pretty much dead.

Maybe the Free Software development processes should be compared to an ideal of what sporting contests would be if there weren’t unreasonable amounts of money (and therefore corruption) involved.

Of course no analogy is perfect and there are many ways in which this one breaks down. One of which is the cooperation between distributions. There is a lot of private discussion between developers of various distributions and upstream developers about how to plan new features. It’s not uncommon for developers to announce certain development decisions as soon as they are made to help other distributions make decisions – for a developer in a distribution project if there is an issue which doesn’t matter much to you or your users then it’s often good to strive for compatibility with other distributions.

When users advocate new features or changes they sometimes try multiple distributions. It's not uncommon for a feature request to be rejected by one distribution and then accepted by another. Once a feature is included in a major distribution the upstream developer is more likely to accept it due to its wide testing. Then when the feature is in upstream it's almost certain to be included in all other distributions. I often recommend that when someone disagrees with one of their bugs being closed as "not a bug" they try reproducing it in another distribution and reporting it there.

As a side note, the criterion for reporting a bug in any free software distribution is that you can describe it in a way that allows other people to reproduce it – whether it's a bug that afflicts you every day or whether you installed the distribution for the sole purpose of reporting the bug in a new forum is not relevant. As a general rule I recommend that you not have the same bug report open in more than one distribution at any time (if you notice a bug reported in multiple distributions then please add a note to each bug report so that the work can be coordinated). The only situation where I will open the same bug in multiple forums is if I have been told that the responsible person or people in one forum are unwilling or unable to fix it.

Finally, the people who consider that they don't need to be team players because they do their coding alone might want to consider Qmail. Dan Bernstein is a great coder and Qmail is by most metrics a fine piece of software; in terms of security Qmail is as good as it gets! If Dan were more of a team player then I believe that his mail server would have been much more successful (in terms of the number of sites using it). However I do understand his desire to have a great deal of control over his software.


Debian Work and Upstream

Steve Kemp writes about security issues with C programs [1]. It seems obvious that if you are going to do something that is overly tricky (such as anything related to setuid programs) then you should have a good knowledge of what you are doing. Steve goes a little further and suggests that anyone who doesn’t know C should not package a C program.

Andrew Pollock is a Debian developer who doesn't program in C and yet packages some C programs; he explains why he thinks that it's correct to forward C coding bugs upstream [2].

In this debate I have more sympathy for Steve's position. Security is one of the most important things in software that we develop; if you can't develop secure software then IMHO you shouldn't distribute any software. Although it should be noted that the two posts in question don't directly conflict. Steve's main point (you should know the code well to do something relevant to security) does not directly conflict with Andrew's main point that a non-C coder can maintain the Debian packaging files and forward bug reports related to C code upstream.

In the more general case it's often impossible to know all languages used in a project. It's not uncommon for a single project to have core code written in C, a configuration file format (which is a programming language), a Makefile (GNU Make is a fairly complex interpreter), Autoconf/Automake (even more complex tools for creating complex Makefiles), and then some Perl or Python to manipulate input or output data. The SE Linux project has code in all those languages plus the M4 macro language. It seems obvious that in a large project the number of people who are capable of understanding all languages which are used is going to be a small sub-set of all Debian Developers. It also seems likely that someone who knows some of the languages really well may be capable of doing a better job than someone who has a passable knowledge of all of them.

For the vast majority of my work in Debian I have been a member of the upstream project in question (for Bonnie++, Postal, Portslave, Maildir-bulletin, and logtools I’m the sole upstream developer). Often I start maintaining a package with the intent of doing the minimum needed to keep it maintained in Debian but then end up doing upstream work, for example I briefly had write access to the CVS repository for pppd when I was maintaining it and I had write access to the KDE CVS repository when I was working on KDE packages (which incidentally was before I became an official DD).

When I first started working on the SE Linux project I aimed to merely package the code in question for Debian. I was not planning on spending seven years doing upstream work!

When I’m working on a project I spend my time working on the things that most need attention. That often means that I get diverted from Debian packaging bugs (which are often minor issues) to work on upstream bugs. In many ways my work practices make me a Debian user and upstream developer who also does Debian packaging rather than a dedicated DD. This of course does have a detrimental effect on the quality of my Debian packaging work – but overall I believe that it’s best for the users!

As an upstream developer who is focussed on Debian there are significant advantages in being a DD. I can immediately upload a new package to Unstable without any delay, and there is the potential to take over a related package (either be given the package or NMU) if it’s the best way to get around a roadblock. Another benefit is that in certain limited situations I can speak on behalf of Debian (in terms of packages I “own”). When it comes to upstream development of SE Linux I can make decisions about how things will be done in Debian and work with other upstream developers to ensure that cross-distribution compatibility works reasonably well.

It seems to me that a fairly ideal process for managing Debian development would have the people who are capable of the hard coding doing so and some people who haven’t got specific coding skills working on the packaging. I would be very interested in receiving patches related to the Debian packaging of the packages I “own” (in Debian it’s widely regarded that a man’s package is his castle) so that I can spend more of my time on the C coding. I’m not bothered about NMUs (a Non-Maintainer Upload is when someone else uploads a new version of one of my packages to fix a bug). If anyone would like to NMU one of my packages they are welcome to do so as long as there is a suitable combination of bug urgency, simplicity of the fix, and lack of response from me. If someone wants to start NMUing my packages to fix bugs reported by the latest version of Lintian in Unstable (Lintian is the Debian tool to automatically check for packaging bugs) then that’s OK.

Of course there are a variety of other situations that DDs sometimes find themselves in. There have been situations of overtly hostile upstream developers. I try and avoid associating with people who have certain difficulties in interacting with others. If someone is going to change their license and try to back-date the change or create a license that makes it difficult to distribute modified versions of the code then I’m inclined to avoid even using the software, let alone being involved in developing it. This means that the software that I use in any serious way has reasonable people doing upstream development who welcome new people to join their work.

SE Linux Etch Repository for AMD64

My Etch back-port repository of SE Linux related packages (which I documented in a previous post [1]) now has a complete set of packages for AMD64. From now on I aim to make AMD64 and i386 be my main supported platforms for SE Linux development.

There is a guy who may be able to give me a stack of well configured PowerMacs (2GB of RAM), if he comes through with that then I may add PPC-32 to the list of architectures I support. If that happens then the machines will probably have their hard drives smashed for security reasons, so I’ll want to swap some G3 PowerMacs for hard drives.


My SE Linux Etch Repository

deb http://www.coker.com.au etch selinux

The above sources.list line has all the i386 packages needed for running SE Linux with strict policy on Etch as well as a couple of packages that are not strictly needed but which are really convenient (to solve the executable stack issue).

gpg --keyserver hkp://subkeys.pgp.net --recv-key F5C75256
gpg -a --export F5C75256 | apt-key add -

To use it without warnings you need to download and install my GPG key, the above two commands do this. You will of course have to verify my key in some way to make sure that it has not been replaced in a MITM attack.

The only thing missing is a change to /etc/init.d/udev to have a new script called /sbin/start_udev used to replace the make_extra_nodes function (so that the make_extra_nodes functionality can run in a different context). Of course a hostile init script could always exploit this to take over the more privileged domain, but I believe that running the init scripts in a confined domain does produce some minor benefits against minor bugs (as opposed to having the init scripts entirely owned).
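A minimal sketch of what such a split might look like is below. This is an assumption about the shape of the change, not the actual patch: the script name /sbin/start_udev and the parsing of /etc/udev/links.conf mirror what the make_extra_nodes function did in the Etch-era init script, and the point of the separate script is to give SE Linux a file to attach a domain transition to.

```shell
#!/bin/sh
# Hypothetical /sbin/start_udev: replicate udev's make_extra_nodes
# outside the init script so SE Linux can run it in its own domain.
# CONF and DEVDIR default to the real locations but can be overridden.
set -e
CONF="${CONF:-/etc/udev/links.conf}"
DEVDIR="${DEVDIR:-/dev}"
[ -f "$CONF" ] || exit 0

# links.conf entries are "L name target", "D name" or "M name maj min"
grep '^[^#]' "$CONF" | while read type name args; do
    case "$type" in
        L) ln -sf $args "$DEVDIR/$name" ;;   # symlink entry
        D) mkdir -p "$DEVDIR/$name" ;;       # directory entry
        M) mknod "$DEVDIR/$name" $args ;;    # device node (needs root)
    esac
done
```

The init script would then just call /sbin/start_udev, and the policy can give that binary an automatic transition to a more privileged domain than the one the init scripts run in.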

I back-ported all the SE Linux libraries from unstable because the version in Etch doesn’t support removing roles from a user definition by the “semanage user -m” command (you can grant a user extra roles but not remove any roles). Trying to determine where in the libraries this bug occurred was too difficult.
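To illustrate the difference, modifying a user definition looks like this with the back-ported libraries (staff_u and the role names are just the usual strict-policy examples):

```shell
# list the SE Linux user definitions and the roles assigned to them
semanage user -l

# grant an extra role to the staff_u user definition
semanage user -m -R "staff_r sysadm_r" staff_u

# remove sysadm_r again by specifying the reduced role set; with the
# libraries shipped in Etch this did not actually remove the role
semanage user -m -R "staff_r" staff_u
```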

Does anyone know of a good document on how to create repositories with apt-ftparchive? My current attempts are gross hacks but I’ve gone live anyway as the package data is good and the apt configuration basically works.
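For what it's worth, the least hackish recipe I know of is along the following lines; the paths match the etch/selinux layout of my repository, and a more complete setup would drive apt-ftparchive generate from a config file instead of running the sub-commands by hand:

```shell
# run from the root of the repository (the directory that the
# sources.list line points at)
cd /var/www

# scan the .deb files and write the package index for i386
apt-ftparchive packages dists/etch/selinux/binary-i386 \
    > dists/etch/selinux/binary-i386/Packages
gzip -9c dists/etch/selinux/binary-i386/Packages \
    > dists/etch/selinux/binary-i386/Packages.gz

# write a Release file with checksums of the index files
apt-ftparchive release dists/etch > dists/etch/Release
```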


Google Custom Search Engine

I’ve just been experimenting with Google Custom Search [1]. Below are two custom search engines I created to generate searches for Planet Debian and Planet Ubuntu (for each Planet it searches all the blogs that are syndicated – not just the category that is syndicated). It’s interesting to compare search terms such as “selinux” to get an idea of how much those topics are discussed in the two communities. I’m going to set up cron jobs to update these CSEs as the Planet subscription lists change. Also it would be quite easy for me to set up a custom search that covers both Debian and Ubuntu, and other planets as well.

If nothing else this will save me from the problem of finding a blog post that has just scrolled off a Planet that I read.

Planet Debian (homepage): [embedded Google Custom Search box]

Planet Ubuntu (homepage): [embedded Google Custom Search box]


Comparing Debian and Fedora

A common question is how to compare Fedora [1] and Debian [2] in terms of recent updates and support. I think that Fedora Rawhide and Debian/Unstable are fairly equivalent in this regard, new upstream releases get packaged quickly, and support is minimal. They are both aimed at developers only, but it seems that a reasonable number of people are running servers on Debian/Unstable.

Fedora releases (previously known as “Fedora Core” and now merely as “Fedora”) can be compared to Debian/Testing. The aim is that Fedora releases every 6 months and each release is supported until a release two versions greater is about to be released (which means that it’s about a year of support). The support however often involves replacing the upstream version of the program used to make a package (EG Fedora Core 5 went from kernel 2.6.15 to kernel 2.6.20). I believe that the delays involved in migrating a package from Debian/Unstable to Debian/Testing as well as the dependency requirements mean that you can get a similar experience running Debian/Testing as you might get from Fedora.

Stable releases of Debian are rare and the updates are few in number and small in scope (generally back-porting fixes not packaging new upstream versions). This can be compared to Red Hat Enterprise Linux (RHEL) [3] or CentOS [4] (a free re-compile of RHEL with minor changes).

Regarding stability and support (in terms of package updates) I think that Debian/Stable, RHEL, and CentOS are at about the same level. RHEL has some significant benefits in terms of phone support (which is of very high quality). But if you don’t want to pay for phone support then CentOS and Debian/Stable are both good choices. Recently I’ve been rolling out a bunch of CentOS 5 machines for clients who don’t want to pay for RHEL and don’t want to pay for extensive customisation of the installation (a quick kickstart install is what they want). The benefit of Fedora and Debian/Testing over RHEL, CentOS, and Debian/Stable is that they get newer packages sooner. This is significant when using programs such as OpenOffice which have a steady development upstream that provides features that users demand.

If you want to try new features then Fedora and Debian/Testing are both options that will work. One reason I had been avoiding serious use of Debian/Testing is that it had no strategy for dealing with security fixes, but it seems that there are now security updates for Testing [5] (I had not realised this until today).

References:

  1. http://fedoraproject.org/
  2. http://www.debian.org/
  3. http://www.redhat.com/rhel/
  4. http://www.centos.org/
  5. http://secure-testing-master.debian.net/

Ethernet Bonding and a Xen Bridge

After getting Ethernet Bonding working (see my previous post) I tried to get it going with a bridge for Xen.

I used the following in /etc/network/interfaces to configure the bond0 device and to make the Xen bridge device xenbr0 use the bond device:

iface bond0 inet manual
pre-up modprobe bond0
pre-up ifconfig bond0 up
hwaddress ether 00:02:55:E1:36:32
slaves eth0 eth1

auto xenbr0
iface xenbr0 inet static
pre-up ifup bond0
address 10.0.0.199
netmask 255.255.255.0
gateway 10.0.0.1
bridge_ports bond0

But things didn’t work well. A plain bond device worked correctly in all my tests, but when I had a bridge running over it I had problems every time I tried pulling cables. My test for a bond is to boot the machine with a cable in eth0, then when it’s running switch the cable to eth1. This means there is a few seconds of no connectivity and then the other port becomes connected. In an ideal situation at least one port would work at all times – but redundancy features such as bonding are not for an ideal situation! When doing the cable switching test I found that the bond device would often get into a state where, every two seconds (the configured ARP ping time for the bond), it would change its mind about the link status and have the link down half the time (according to the logs – according to ping results it was down all the time). This made the network unusable.
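For reference, the ARP monitoring interval is a bonding module option; my setup was along these lines (the target address is an assumption, it should be the default gateway or another always-reachable host on the LAN):

```
# /etc/modprobe.d/bonding (sketch)
alias bond0 bonding
options bonding mode=active-backup arp_interval=2000 arp_ip_target=10.0.0.1
```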

Now I have decided that Xen is more important than bonding so I’ll deploy the machine without bonding.

One thing I am considering for next time I try this is to use bridging instead of bonding. The bridge layer will handle multiple Ethernet devices, and if they are both connected to the same switch then the Spanning Tree Protocol (STP) is designed to work in this way and should handle it. So instead of having a bond of eth0 and eth1 and running a bridge over that I would just bridge eth0, eth1, and the Xen interfaces.
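A sketch of what that /etc/network/interfaces configuration might look like (untested; STP must be enabled on the bridge if both ports connect to the same switch, and the Xen vif interfaces get added to the bridge at DomU start time):

```
auto xenbr0
iface xenbr0 inet static
    address 10.0.0.199
    netmask 255.255.255.0
    gateway 10.0.0.1
    bridge_ports eth0 eth1
    bridge_stp on
```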

Debian Lunch Meeting in Melbourne and BSP

This afternoon we had a Debian meeting in Melbourne (Australia) arranged through the Debian-Melb mailing list.

We met under the clocks at Flinders St station, had lunch at a good Japanese restaurant, and decided not to play LASER games (like paintball but with LASER guns instead of paint guns) due to the queue. The LASER games are at the Crown Casino, and some people object to it on principle, but when you only use free tickets… One noteworthy thing about the casino is that they have a free cloak-room that stores bags (back-packs etc are not allowed on the gaming floor). I expect their cloak-room to be a little more secure than most places that you might stash your stuff (they have a reputation for security to uphold) so I felt safe leaving a back-pack containing a laptop in their care (it didn’t have any secret data and was a really old one).

After leaving the casino we had Gelati/Gelato ice-cream (Gelati is the plural of Gelato and either word may be used to describe Italian style ice-cream).

The general plan for the next meeting is to meet in the city at about 10AM on a weekend, play the LASER games, and then have lunch. Of course that would make it a greater requirement for people to arrive on time. ;)

While at the meeting we discussed in concept the idea of a Bug Squashing Party (BSP). I can get a free venue for up to 12 people outside business hours which is not far from the center of Melbourne and which has good net access and a good supply of keyboards, monitors, and other misc computer bits (even possibly some PCs that can have their hard drives temporarily replaced to test Debian stuff). One guy who is rather keen on this idea asked if it would be possible to bring sleeping-bags and sleep on the floor. I hesitate to ask the guy who owns the office about that; it might make him reject the idea entirely. So probably starting at about 10AM and going to 10PM would be enough. We could do that both Saturday and Sunday on some weekend or maybe even start on Friday night.

I’ve been planning to run a similar meeting to play Linux games which may end up as a games hacking party. I might get around to running that soon.

An Ideal Linux Install Process for Xen

I believe that an ideal installation process for Linux would have the option of performing a Xen install.

The basic functionality of installing the Xen versions of the required packages (the kernel and libc), the Xen hypervisor, and the Xen tools is already done well in Fedora and it’s an option to install them in Debian. But more than that is required.

Xen has two options for networking, bridging and routing. The bridging option can be confusing to set up, and changing a system from routed to bridged networking once it’s running is a risky process. I have documented the basic requirements for running bridging in a previous post, but it would be better if there was an option to have xenbr0 as the primary device from the initial install – and there are non-Xen reasons for doing this so it would be a more generally useful feature.
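In the Xen of this era the choice between the two is made in /etc/xen/xend-config.sxp; for bridging the relevant lines are roughly as follows (the bridge name is an assumption matching the xenbr0 default):

```
# /etc/xen/xend-config.sxp (relevant lines only)
(network-script 'network-bridge bridge=xenbr0')
(vif-script vif-bridge)
```

An installer that set the bridge up as the primary device from the start would avoid the risky live conversion entirely.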

Another common requirement for a Xen server is to have a DVD image on the local hard drive for creating new DomU’s. If we are going to need a copy of the DVD on the local hard drive for Xen installation and we need data from the DVD for the Dom0 installation then it makes sense to have one of the early installation tasks (immediately after running mkfs) be to copy the contents of the DVD to the hard drive. Hard drives are significantly faster than DVDs – especially for random access. It would also avoid the not uncommon annoyance of getting part way through an install only to encounter a DVD or CD read error…

Here are some reasons for running Xen (or an equivalent technology) when not running more than one DomU:

  1. Avoid problems booting. Everyone who has spent any significant amount of time running servers has had problems where machines don’t boot. Even with a capable out of band management option such as the HP ILO it can be unreasonably inconvenient to fix such problems. Separating the base hardware management tasks of the OS from the user process management tasks makes recovery much easier. If a DomU stops booting then it’s easy to mount it on the Dom0 and chroot into it to discover the problem.
  2. Easier upgrades. Often you have users demand that you install software that only works with a newer version of the OS. You can install the new version under a different DomU, test it, and then replace the old DomU when you think it’ll work – this gives a matter of minutes of down-time instead of hours for the upgrade. If the upgrade doesn’t work then you destroy the DomU and create one for the old version. Running two versions of the OS at the same time with NFS shares for the data files is also possible.
  3. Security. If a DomU gets cracked the Dom0 will not necessarily be compromised, this puts you in a good position to track down what the attackers have done. You can get a dump of the DomU’s memory to enable experts to examine what the attackers were doing. Reinstalling a DomU to replace data potentially corrupted by an attacker is much easier than reinstalling an entire machine.
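As an illustration of the first point, recovering an unbootable DomU from the Dom0 can be as simple as the following. The volume and config names are made up for the example, and this assumes an LVM-backed DomU:

```shell
# shut the DomU down cleanly if it is still half-running
xm shutdown demo

# mount the DomU's root filesystem in the Dom0 and chroot in to fix it
mount /dev/vg0/demo-root /mnt
chroot /mnt /bin/bash
# ... fix /etc/fstab, downgrade the broken package, etc ...
exit
umount /mnt

# boot the repaired DomU again
xm create /etc/xen/demo.cfg
```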

Even in situations when reason #2 was the motivation for installing Xen I believe that most systems will want to have a Xen DomU running the same version as the Dom0 for the initial install. Therefore integrating the installation process would make things easier. Among other benefits if you have a server with multiple CPUs (the minimum number seems to be two CPUs on all recent machines) and hardware RAID then doing two installations at the same time is likely to give better performance overall. Also I believe that it will often be the case that the Dom0 will exist purely to support DomU’s, therefore if you only install the Dom0 then you have done less than half the installation!

For a manual installation there are some reasons for not doing this all at the same time. Having the sys-admin enter configuration data for some DomU’s at the same time as the Dom0 can get confusing. However for an automated install this would be desirable. I would like to boot from a CD and have the installation process take all configuration from the network (either via NFS or HTTP) and then perform the complete installation of the Dom0 and the DomU’s automatically.

Let me know what you think of these ideas; it’s all just at the conceptual stage at the moment.


SE Linux in Debian

I have now got a Debian Xen domU running the strict SE Linux policy that can boot in enforcing mode. I expect that tomorrow I will have it working with full functionality and that I will be able to run another SE Linux Play Machine in the near future.

After getting the strict policy working I want to build a Debian kernel with CONFIG_AUDITSYSCALL and an audit package so that I can audit system calls that an application makes and also so that the auditd can collect the SE Linux log messages. Other people have talked about packaging audit for Debian, hopefully one of them will do it first and save me the effort, but it shouldn’t be too difficult to do if they don’t.
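Once a kernel with CONFIG_AUDITSYSCALL and the audit package are in place, auditing what an application does comes down to rules along these lines (the paths are just examples):

```shell
# watch all read/write access to the shadow file
auditctl -w /etc/shadow -p rw

# audit every execve() syscall made by processes running as uid 0
auditctl -a exit,always -S execve -F uid=0

# search the records collected by auditd, including SE Linux AVC messages
ausearch -m avc --start today
```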

Then I need to investigate some options for training people about SE Linux. As I don’t currently have the bandwidth for serving large files I’m thinking of basing some SE Linux training on Xen images from the jailtime.org repository. My rough plan at the moment is to have people download Xen images, run through them while consulting a web page, and ask questions on an IRC channel. I’m not sure what the demand will be for this but some web pages teaching people about SE Linux will be a useful resource even if the IRC based training doesn’t work out.

Another thing I want to do is to get PolyInstantiated Directories working in Debian. The pam_namespace.so module needed for this is written for a more recent version of PAM, so I might just work on merging the Debian patches with the latest upstream PAM instead of back-porting the module to the ancient Debian PAM.
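For reference, polyinstantiated directories are configured via /etc/security/namespace.conf plus a PAM session entry, roughly like this (the level method uses the SE Linux sensitivity level to pick the per-user instance):

```
# /etc/security/namespace.conf: give every user except root and adm a
# private instance of /tmp, selected by SE Linux level
/tmp    /tmp-inst/    level    root,adm

# and in the relevant /etc/pam.d/ service file:
# session    required    pam_namespace.so
```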