
SE Linux Support in GPG

In May 2002 I had an idea for securing access to GNUPG [1]. What I did was to write SE Linux policy to only permit the gpg program to access the secret key (and other files in ~/.gnupg). This meant that the most trivial ways of stealing the secret key would be prevented. However an attacker could still use gpg to encrypt its secret key and write the data to some place that is accessible, for example with the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg”. So what we needed was for gpg to either refuse to encrypt such files, or to spawn a child process for accessing such files (which could be granted different access to the filesystem). I filed Debian bug report 146345 [2] requesting this feature.
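
The idea can be sketched in SE Linux type-enforcement syntax. The type names below are illustrative only (not the actual policy I wrote), but they show the principle of giving only the gpg domain access to the key files:

# sketch only: illustrative type names, not the real policy
type gpg_t;            # the domain the gpg program runs in
type gpg_secret_t;     # the type for files under ~/.gnupg
allow gpg_t gpg_secret_t:file { read write getattr };
# no other domain gets an allow rule for gpg_secret_t, so a shell
# running as user_t can not read the secret keyring directly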

In March upstream added this feature. The Debian package is currently not built with --enable-selinux-support so the feature isn’t enabled yet, but hopefully it will be soon. Incidentally the feature as currently implemented is not really SE Linux specific; it seems to me that there are many potential situations where it could be useful without SE Linux. For example if you were using one of the path-name based MAC systems (which I dislike – see what my friend Joshua Brindle wrote about them for an explanation [3]) then you could gain some benefits from this. A situation with even smaller potential for benefit is an automated system which runs gpg and which could allow an attacker to pass bogus commands to it. When exploiting a shell script it might be easier to specify the wrong file to encrypt than to perform more sophisticated attacks.
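
Enabling it should just be a matter of rebuilding the package with the extra configure option, something like the following (a sketch – the exact place to add the option in debian/rules varies with the package version):

apt-get source gnupg
cd gnupg-*
# add --enable-selinux-support to the configure options in debian/rules
debuild -us -uc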

When the feature in question is enabled the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg” will abort with the following error:
gpg: can't open `/root/.gnupg/secring.gpg': Operation not permitted
gpg: symmetric encryption of `/root/.gnupg/secring.gpg' failed: file open error

Of course the command “gpg --export-secret-keys” will abort with the following error:
gpg: exporting secret keys not allowed
gpg: WARNING: nothing exported

Now we need to determine the correct way of exporting secret keys and modifying the GPG configuration. It might be best to allow exporting the secret keys when not running SE Linux (or other supported MAC systems), or when running in permissive mode (as in those situations merely copying the files will work). We could also have an option in gpg.conf to control this, for the case where we want to prevent shell-script quoting hacks.

For editing the gpg.conf file and exporting the secret keys we could have a program similar in concept to crontab(1), with PAM support to determine when it should perform its actions. Also it seems to me that crontab(1) could do with PAM support (I’ve filed Debian bug report 484743 [4] requesting this).
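
As a purely hypothetical sketch, such a program (I’ll call it gpg-admin, an invented name – no such program exists) would have a PAM service file something like:

# hypothetical /etc/pam.d/gpg-admin
auth      required   pam_unix.so
account   required   pam_unix.so
session   required   pam_permit.so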

Finally one thing that should be noted is that the targeted policy for SE Linux does not restrict GPG (which runs in the unconfined_t domain). Thus most people who use SE Linux at the moment aren’t getting any benefits from such things. This will change eventually.


Installing a Red Hat based DomU on a Debian Dom0

The first step is to copy /images/xen/vmlinuz and /images/xen/initrd.img from the Fedora (or RHEL or CentOS) DVD to somewhere convenient. I use /boot/OS/ (where OS is the name of the image) but other locations will do.

Now choose a suitable Ethernet MAC address for the interface (see my previous post on how I choose them [1]).

Create a temporary block device for the install. I use /dev/VG0/OS-install (where OS is replaced by the name of the distribution, “f8” or “cent5”), a logical volume in an LVM volume group named VG0. The device should be at least 2G in size for a basic Fedora install (512M for swap, 1G for files, and 512M free after the install). It is of course possible to use DOS partitions for the Xen block devices, but this would be unreasonably difficult to manage. An option for people who don’t like LVM would be to use files on an XFS filesystem (Ext3 performs poorly when creating and removing large files).
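
For example (a sketch using the “f8” naming above, with a separate swap volume for the final configuration as used in the sample Xen config at the end of this post):

# a 2G LV for the install, with room for root, swap, and free space
lvcreate -n f8-install -L 2G VG0
# a separate swap LV for use after the install is converted
lvcreate -n f8-swap -L 512M VG0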

When configuring Xen on Debian systems I generally use /dev/hda type device names. The device name seems quite arbitrary and /dev/hda is a familiar name for hard drives that many people have used for 15+ years. But the Fedora install process doesn’t like it and I’m forced to use /dev/xvda etc.

I often install Fedora on a machine that only has 256M of RAM spare for the DomU. For recent versions of Fedora 256M of RAM is about the minimum for an install at the best of times, and a HTTP install takes even more because the root filesystem used for the install is copied via HTTP and stored in a RAM disk. It might be possible to use less RAM with a CD or DVD install or even an NFS install, but I couldn’t get CD/DVD installation working and I generally don’t give Xen DomUs NFS access if I can avoid it. So I had to create a swap space (an attempt to do an install with 256M of RAM and no swap aborted when installing the kernel package). I expect that most serious use of Xen will have 256M of RAM or less for the DomU. Part of the problem here is that Xen allocates RAM, not virtual memory. VMware allocates virtual memory so the total memory for virtual machines can be greater than physical RAM, and thus this problem will be less common with VMware.

I believe that the best way of configuring virtual machine images is to have the virtual machine manager (Xen in this case) provide block devices to the virtual machine and have the virtual machine implement no partitioning (no LVM or anything equivalent). The main reason is that DOS partition tables and LVM configuration on a block device used by Xen cannot be used easily in the host environment (the Dom0 for Xen). I am not aware of how to access DOS partition tables (although I’m sure it’s possible somehow), and while LVM can be used it’s a bad idea because there is no way to deactivate an LVM volume group that is active and no support for having multiple volume groups of the same name. The lack of support for multiple volume groups of the same name is a reasonable limitation, but an insurmountable problem when using a virtual machine environment. It’s quite reasonable to create several cloned instances of a virtual machine, and renaming an LVM volume group would require more changes inside the virtual machine than you would want. Also using snap-shots of old versions of the virtual machine data is difficult if the same volume group name is used.

So for ease of management I want to have filesystems on block devices (such as /dev/xvda) instead of partitions (such as /dev/xvda1). Unfortunately Anaconda (the Fedora installer) doesn’t support this, so I had to do the initial install with DOS partitions and then fix it afterwards. Use the manual partitioning option and create a primary partition for the root filesystem, then create a non-primary partition for swap (when using small amounts of RAM such as 256M) so that swap can be used during the install. The root filesystem needs to be at the start of the disk to make it easier to sort this out later.

After installing Fedora and shutting the virtual machine down, the next step is to copy the block device to the desired configuration (a filesystem on an unpartitioned device). If the root filesystem is the first partition then the first 63 sectors will be the partition table and reserved space, so dd can be used to copy the data with the following commands:

# copy the filesystem, skipping the 63 sectors of partition table and
# reserved space at the start of the source device
dd if=/dev/VG0/OS-install of=/dev/VG0/OS bs=512 skip=63
# check the filesystem and then grow it to fill the new device
e2fsck -f /dev/VG0/OS
resize2fs /dev/VG0/OS
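
The 63 sector offset assumes the traditional DOS partition layout; it can be checked before doing the copy with something like:

# list the partitions with sector units to confirm the start offset
fdisk -lu /dev/VG0/OS-install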

The next step is to mount the device /dev/VG0/OS in the Dom0 to change /etc/fstab. I use /dev/xvda for the root device and /dev/xvdb for swap.
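
As a sketch, the DomU’s /etc/fstab would end up looking something like this (assuming an ext3 root filesystem):

/dev/xvda  /     ext3  defaults  1 1
/dev/xvdb  swap  swap  defaults  0 0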

Now to remove the cruft:
Avahi is a network service discovery system, mainly used for laptops, and isn’t needed on a server. It is installed by default on all recent Fedora, RHEL and CentOS releases but it is not useful in a DomU (any unused network service is a security risk if you don’t disable or remove it). Smartmontools is for detecting impending failure of a hard drive and does not do any good when using a virtual block device (you run it on the Dom0); it might be considered a bug that smartd doesn’t exit on startup when it sees a device such as /dev/xvda. The pcsc-lite package for managing smart cards is of no use to me and all the other people who don’t own smart-card readers, so it can be removed. Bluetooth networking support (in the package bluez-utils) is also only usable in a Dom0 (AFAIK), and the only bluetooth device I own is my mobile phone so I can’t use it on my computer. The command “yum remove avahi smartmontools pcsc-lite bluez-utils” removes them all.

For almost all of my DomUs I don’t use NFS or do any printing, so I remove the packages related to them. Also autofs is in most cases only useful on servers for mounting NFS filesystems. I remove them with the command “yum remove nfs-utils portmap cups autofs”.

The GPM daemon (which supports cut/paste operations with a mouse on virtual consoles) is of no use on a Xen DomU; unfortunately the vim-enhanced package depends on it. I could just disable the daemon, but as I like to run small images I remove it with “yum remove gpm”. I may have to reinstall it on some images as some of my clients like the extra VIM functionality.

It’s unfortunate that debootstrap doesn’t work on CentOS (and presumably Fedora), so installing a Debian DomU on a CentOS/Fedora Dom0 requires creating an image on a Debian machine or downloading an image from www.jailtime.org.

Sample Xen Config for the install:

kernel = "/boot/OS/vmlinuz"
ramdisk = "/boot/OS/initrd.img"
memory = 256
name = "OS"
vif = [ 'mac=00:16:3e:66:66:68, bridge=xenbr0' ]
disk = [ 'phy:/dev/VG0/OS-install,xvda,w' ]
extra = "askmethod text"

Sample Xen Config for operation:

kernel = "/boot/cent5/vmlinuz-2.6.18-53.el5xen"
ramdisk = "/boot/cent5/initrd-2.6.18-53.el5xen.img"
memory = 256
name = "cent5"
vif = [ 'mac=00:16:3e:66:66:68, bridge=xenbr0' ]
disk = [ 'phy:/dev/VG0/cent5,xvda,w', 'phy:/dev/VG0/cent5-swap,xvdb,w' ]
root = "/dev/xvda ro"


Security Flaws in Free Software

I just wrote about the system administration issues related to the recent Debian SSL/SSH security flaw [1]. The next thing we need to consider is how we can change things to reduce the incidence of such problems.

The problem we just had occurred because the most important part of the entropy supply for the random number generator was not used, due to a mistake in commenting out some code. The only entropy that was used was the PID of the process which uses the SSL library code, which gives us 15 bits of entropy. It seems to me that if we had zero bits of entropy the problem would have been discovered a lot sooner (almost certainly before the code was released in a stable version). Therefore it seems that using a second-rate source of entropy (which was never required) masked the problem that the primary source of entropy was not working. Would it make sense to have a practice of not using such second-rate sources of entropy, to reduce the risk of such problems going undetected for any length of time? Is this a general issue or just a corner case?

Joss makes some suggestions for process improvements [2]. He suggests that having a single .diff.gz file (the traditional method for maintaining Debian packages) that directly contains all patches can obscure some patches. The other extreme is when you have a patch management system with several dozen small patches and the reader has to try and discover what each of them does. For an example of this see the 43 patches which are included in the Debian PAM package for Etch. Note however that the PAM system is comprised of many separate shared objects (modules), so the patching system lends itself to having one patch per module; thus 43 patches for PAM isn’t as difficult to manage as 43 patches for a complex package which is not comprised of multiple separate components would be. That said, I think that there is some potential for separating out patches. Having a distinction between different types of patches might help. For example we could have a patch for Makefiles etc (including autoconf etc), a patch for adding features, and a patch for fixing bugs. Then people reviewing the source for potential bugs could pay a lot of attention to bug fixes, a moderate amount of attention to new features, and casually skim the Makefile stuff.

The problem began with this mailing list discussion [3]. Kurt’s first message starts with “When debbuging applications” and ends with “What do you people think about removing those 2 lines of code?”. The reply he received from Ulf (a member of the OpenSSL development team) is “If it helps with debugging, I’m in favor of removing them”. It seems to me that there might have been a miscommunication there; Ulf may have believed that the discussion only concerned a debugging build and not a build that would eventually end up on millions of machines.

It seems possible that the reaction would have been different if Kurt had mentioned that he wanted to have a single source tree for both debugging and for regular use. It also seems likely that his proposed change would have received more inspection if he had clearly stated that he was going to include it in Debian where it would be used by millions of people. When I am doing Debian development I generally don’t mention all the time “this code will be used by millions of people so it’s important that we get it right”, although I do sometimes make such statements if I feel that my questions are not getting the amount of consideration from upstream that a binary package destined for use by millions of people deserves. Maybe it would be a good practice to clarify such things in the case of important packages. For a package that is relevant to the security of the entire distribution (and possibly to other machines around the net – as demonstrated in this case) it doesn’t seem unreasonable to include a post-script mentioning the scope of the code use (it could be done with an alternate SIG if using a MUA that allows selecting from multiple SIGs in a convenient manner).

In the response from the OpenSSL upstream [4] it is claimed that the mailing list used was not the correct one. Branden points out that the openssl-team mailing list address seems not to be documented anywhere [5]. One thing to be learned from this is that distribution developers need to be proactive in making contact with upstream developers. You might think that building packages for a major distribution and asking questions about it on the mailing list would result in someone from the team noticing and mentioning any other things that you might need to do. But maybe it would make sense to send private mail to one of the core developers, introduce yourself, and ask for advice on the best way to manage communication to avoid this type of confusion.

I think that it is ideal for distribution developers to have the phone numbers of some of the upstream developers. If the upstream work is sponsored by the employer of one of the upstream developers then it seems reasonable to ask for their office phone number. Sometimes it’s easier to sort things out by phone than by email.

Gunnar Wolf describes how the way this bug was discovered and handled shows that the Debian processes work [6]. A similar bug in proprietary software would probably not be discovered nearly as quickly and would almost certainly not be fixed in such a responsible manner.

Update: According to the OpenSSL project about page [7], Ulf is actually not a “core” member, just a team member. I had used the term “core” in a slang manner based on the fact that Ulf has an official @openssl.org email address.


Debian SSH Problems

It has recently been announced that Debian had a serious bug in the OpenSSL code [1]. The most visible effect of this is compromised SSH keys, but it can also affect VPN and HTTPS keys. Erich Schubert was one of the first people to point out the true horror of the problem: only 2^15 different keys can be created [2]. It should not be difficult for an attacker to generate 2^15 host keys and try all combinations for decrypting a login session. It should also be possible to make up to 2^15 attempts to log in to a server remotely if an attacker believes that an affected authorized key was being used – at a rate of 10 attempts per second (which is possible with modern net connections) the 32768 attempts would take less than an hour, and it could be done in a day if the server was connected to the net by a modem.

John Goerzen has some insightful thoughts about the issue [3]. I recommend reading his post. One point he makes is that the person who made the mistake in question should not be lynched. One thing I think we should keep in mind is the fact that people tend to be more careful after they have made mistakes, I expect that anyone who makes a mistake in such a public way which impacts so many people will be very careful for a long time…

Steinar H. Gunderson analyses the maths in relation to DSA keys; it seems that if a DSA key is ever used with a bad RNG then it can be cracked by someone who sniffs the network [4]. It seems that it is safest to just not use DSA to avoid this risk. Another issue is that if a client supports multiple host key types (an ssh server can have three different host keys: one for the ssh1 protocol, one for ssh2 with RSA, and one for ssh2 with DSA) then a man in the middle attack can be implemented by forcing the client to use a different key type – see Stealth’s article in Phrack for the details [5]. So it seems that we should remove support for anything other than SSHv2 with RSA keys.

To remove such support from the ssh server edit /etc/ssh/sshd_config and make sure it has a line with “Protocol 2”, and that the only HostKey line references an RSA key. To remove it from the ssh client (the important thing) edit /etc/ssh/ssh_config and make sure that it has something like the following:

Host *
Protocol 2
HostKeyAlgorithms ssh-rsa
ForwardX11 no
ForwardX11Trusted no

You can override this for different machines. So if you have a machine that uses DSA only then it would be easy to add a section:

Host strange-machine
Protocol 2
HostKeyAlgorithms ssh-dss

Making this the default configuration of the ssh client on all machines you manage has the potential to dramatically reduce the incidence of MITM attacks against the less knowledgeable users.

When skilled users who do not have root access need to change things they can always edit the file ~/.ssh/config (which has the same syntax as /etc/ssh/ssh_config) or they can use command-line options to override it. The command ssh -o “HostKeyAlgorithms ssh-dss” user@server will force the use of a DSA host key even if the configuration file requests RSA.
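
For completeness, the server-side settings mentioned above would look like the following in /etc/ssh/sshd_config (the key path being the Debian default):

Protocol 2
HostKey /etc/ssh/ssh_host_rsa_key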

Enrico Zini describes how to use ssh-keygen to get the fingerprint of the host key [6]. One thing I have learned from comments on this post is how to get a fingerprint from a known hosts file. A common situation is that machine A has a known hosts file with an entry for machine B. I want to get the right key in machine C and there is no way of directly communicating between machine A and machine C (EG they are in different locations with no network access). In that situation the command “ssh-keygen -l -f ~/.ssh/known_hosts” can be used to display all the fingerprints of hosts that you have connected to in the past, then it’s a simple matter of grepping the output.
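
For example (machine-b being a hypothetical host name):

# print fingerprints for every host in the file and pick out machine B
ssh-keygen -l -f ~/.ssh/known_hosts | grep machine-b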

Docunext has an interesting post about ways of mitigating such problems [7]. One thing that they suggest is using fail2ban to block IP addresses that appear to be trying brute-force attacks. It’s unfortunate that the version of fail2ban in Debian uses /tmp/fail2ban.sock for its Unix domain socket for talking to the server (the version in Unstable uses /var/run/fail2ban/fail2ban.sock). They also mention patching network drivers to add entropy to the kernel random number generator. One thing that seems interesting is the package randomsound (currently in Debian/Unstable) which takes ALSA sound input as a source of entropy; note that you don’t need to have any sound input device connected.

When considering fail2ban and similar things, it’s probably best to start by restricting the number of machines which can connect to your SSH server. Firstly, if you put it on a non-default port then it’ll take some brute-force scanning to find it. This will waste some of the attacker’s time and also make the less persistent attackers go elsewhere. One thing that I am considering is having a few unused ports configured such that any IP address which connects to them gets added to my NetFilter configuration – if you connect to such ports then you can’t connect to any other ports for a week (or until the list becomes too full). So if for example I had port N configured in such a manner and port N+100 used for ssh listening then it’s likely that someone who port-scans my server would be blocked before they even discovered the SSH server. Does anyone know of free software to do this?
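
As a sketch of the idea, the iptables “recent” module can do something like this without any extra software (port 2022 is an arbitrary example, and these rules would go in front of the usual ACCEPT rules):

# drop everything from any address seen on the trap port in the last week
iptables -A INPUT -m recent --name trap --rcheck --seconds 604800 -j DROP
# any connection attempt to the trap port adds the source address to the list
iptables -A INPUT -p tcp --dport 2022 -m recent --name trap --set -j DROP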

The next thing to consider is which IP addresses may connect. If you were to allow all the IP addresses from all the major ISPs in your country to connect to your server then it would still be a small fraction of the IP address space. Sure attackers could use machines that they already cracked in your country to launch their attacks, but they would have to guess that you had such a defense in place, and even so it would be an inconvenience for them. You don’t necessarily need to have a perfect defense, you only need to make the effort to reward ratio be worse for attacking you than for attacking someone else. Note that I am not advocating taking a minimalist approach to security, merely noting that even a small increment in the strength of your defenses can make a significant difference to the risk you face.

Update: based on comments I’m now considering knockd to open ports on demand. The upstream site for knockd is here [8], and some documentation on setting it up in Debian is here [9]. The concept of knockd is that you make connections to a series of ports which act as a password for changing the firewall rules. An attacker who doesn’t know those port numbers won’t be able to connect. Of course anyone who can sniff your network will discover the ports soon enough, but I guess you can always login and change the port numbers once knockd has let you in.
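
A knockd configuration along the lines of the example in its documentation would look something like this (the port sequence is of course something you would change):

[options]
logfile = /var/log/knockd.log

[openSSH]
sequence = 7000,8000,9000
seq_timeout = 5
command = /sbin/iptables -I INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
tcpflags = syn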

Also thanks to Helmut for advice on ssh-keygen.


Ideas to Copy from Red Hat

I believe that the Red Hat process which has Fedora for home users (with a rapid release cycle and new versions of software but support for only about one year) and Enterprise Linux (with a ~18 month release cycle, seven years of support, and not always having the latest versions) gives significant benefits for the users.

The longer freeze times of Enterprise Linux (AKA RHEL) mean that it often has older versions of software than a Fedora release occurring at about the same time. In practice the only time I ever notice users complaining about this is in terms of OpenOffice (which is always being updated for compatibility with the latest MS changes). As an aside, a version of RHEL or CentOS with a back-port of the latest OpenOffice would probably get a lot of interest.

RHEL also has a significantly smaller package set than Fedora. There is a lot of software out there that you wouldn’t want to support for seven years, a lot of software that you might want to support if you had more resources, and plenty of software that is not really of interest to enterprise customers (EG games).

Now there are some down-sides to the Red Hat plan. The way that they run Fedora is to have new releases of software instead of back-porting fixes. This means that bugs can be fixed with less effort (simply compiling a new version is a lot less effort than back-porting a fix), and that newer versions of the upstream code get tested. With some things this isn’t a problem, but in the past I have had problems with the Fedora kernel. One example was when I upgraded the kernel on a bunch of remote Fedora machines only to find that the new kernel didn’t support the network card, so I had to talk the users through selecting the older kernel at the GRUB menu (this caused pain and down-time). A problem with RHEL (which I see regularly on the CentOS machines I run) is that it doesn’t have the community support that Fedora does, and therefore finding binary packages for RHEL can be difficult – and often the packages are outdated.

I believe that in Debian we could provide benefits for some of our users by copying some ideas from Red Hat. There is currently some work in progress on releasing packages that are half-way between Etch and Lenny (Etch is the current release, Lenny will be the next one). The term Etch and a half refers to the work to make Etch run on newer hardware [1]. It’s a good project, but I don’t think that it goes far enough. It certainly won’t fulfill the requirements of people who want something like Fedora.

I think that if we had half-way releases of Debian (essentially taking a snap-shot of Testing and then fixing the worst of the bugs) then we could accommodate user demand for newer versions (making available a release which is on average half as old). Users who want really solid systems would run the full releases (which have more testing pre-release and more attention paid to bug fixes), but users who need the new features could run a half-way release. Currently there are people working on providing security support for Testing so that people who need the more recent versions of software can use Testing; I believe that making a half-way release would provide better benefits to most users while also possibly taking fewer resources from the developers. This would not preclude the current “Etch and a half” work of back-porting drivers; in the Red Hat model such driver back-ports are done in the first few years of RHEL support. If we were to really follow Red Hat in this regard the “Etch and a half” work would operate in tandem with similar work for Sarge (version 3.1 of Debian which was released in 2005)!

In summary, the Red Hat approach is to have Fedora releases aimed at every 6 months, but in practice coming out every 9 months or so and to have Enterprise Linux releases aimed at every year, but in practice coming out every 18 months. This means among other things that there can be some uncertainty as to the release order of future Fedora and RHEL releases.

I believe that a good option for Debian would be to have alternate “Enterprise” (for want of a better word) and half-way releases (comparable to RHEL and Fedora). The Enterprise releases could be frozen in coordination with Red Hat, Ubuntu, and other distributions (Mark Shuttleworth now refers to this as being a “pulse” in the free software community []), while the half-way releases would come out either when it’s about half-way between releases, or when there is a significant set of updates that would encourage users to switch.

One of the many benefits of having synchronised releases is that if the work of back-porting support for new hardware lagged in Debian then users would have a reasonable chance of taking the code from CentOS. If nothing else I think that making kernels from other distributions available for easy install is a good thing. There is a wide range of kernel patches that may be selected by distribution maintainers, and sometimes choices have to be made between mutually exclusive options. If the Debian kernel doesn’t work best for a user then it would be good to provide them with a kernel compiled from the RHEL kernel source package, and possibly other kernels.

Mark also makes the interesting suggestion of having different waves of code freeze: the first for the kernel, GCC, and glibc (and possibly server programs such as Apache), the second for major applications and desktop environments, and the third for distributions. One implication of this is that not all distributions will follow the second wave. If a distribution follows the kernel, GCC, and glibc wave but not the applications wave it will still save some significant amount of effort for the users. It will mean that the distributions in question will all have the same hardware support and kernel features, and that they will be able to run each others’ applications (except when the applications in question use system libraries from later waves). Also let’s not forget the possibility of running a kernel from distribution A on distribution B; it’s something I’ve done on many occasions, but it does rely on the kernels in question being reasonably similar in terms of features.


Release Dates for Debian

Mark Shuttleworth has written an interesting post about Ubuntu release dates [1]. He claims that free software distributions are better able to meet release dates than proprietary OSs because they are not doing upstream development. The evidence that free software distributions generally do a reasonable job of meeting release dates (and Ubuntu does an excellent job) is clear.

But the really interesting part of his post is where he offers to have Ubuntu collaborate with other distributions on release dates. He states that if two out of Red Hat (presumably Enterprise Linux), Novell (presumably SLES), and Debian will commit to the same release date (within one month) and (possibly more importantly) to having the same versions of major components then he will make Ubuntu do the same.

This is a very significant statement. From my experience working in the Debian project and when employed by Red Hat I know that decisions about which versions of major components to include are not taken lightly, and therefore if the plan is to include a new release of a major software project and that project misses a release date then it forces a difficult decision about whether to use an older version or delay the release. For Ubuntu to not merely collaborate with other distributions but to instead follow the consensus of two different distributions would be a massive compromise. But I agree with Mark that the benefits to the users are clear.

I believe that the Debian project should align its release cycles with Red Hat Enterprise Linux. I believe that RHEL is being released in a very sensible manner and that the differences of opinion between Debian and Red Hat people about how to manage such things are small. Note that it would not be impossible to have some variations in version numbers of components but still stick mostly to the same versions.

If Debian, Ubuntu, and RHEL released at about the same time with the same versions of the kernel, GCC, and major applications and libraries then it would make it much easier for users who want to port software between distributions and run multiple distributions on the same network or the same hardware.

The Debian Social Contract [2] states that “Our priorities are our users and free software”. I believe that by using common versions across distributions we would help end-users in configuring software and maintaining networks of Linux systems running different distributions, and also help free software developers by reducing the difficulty in debugging problems.

It seems to me that the best way of achieving the goal that Mark advocates (in the short term at least) is for Debian to follow Red Hat’s release cycle. I think that after getting one release with common versions out there we could then discuss how to organise cooperation between distributions.

I also believe that a longer support cycle would be a good thing for Debian. I’m prepared to do the necessary work for the packages that I maintain and would also be prepared to do some of the work in other areas that is needed (EG back-porting security fixes).


The Purpose of Planet Debian

An issue that causes ongoing discussion is what the purpose of a Planet installation such as Planet Debian [1] is. The discussion usually seems to take the less effective form of what is “appropriate” content for the Planet or what is considered to be “abuse” of the Planet. Of course it’s impossible to get anything other than a rough idea of what is appropriate if the purpose is not defined, and abuse can only be measured on the most basic technical criteria.

My personal use of Planet Debian and Planet Linux Australia [2] is to learn technical things related to Linux (how to use new programs, tricks and techniques, etc), to learn news related to Linux, and to read personal news about friends and colleagues. I think that most people have some desire to read posts of a similar nature (I have received a complaint that my blog has too many technical posts and not enough personal posts), but some people want to have a Planet with only technical articles.

In a quick search of some planets the nearest I found to a stated purpose of a Planet installation was the Wiki page documenting Planet Ubuntu [3], which says ‘Subscribed feeds ought to be at least occasionally relevant to Ubuntu, although the only hard and fast rule is “don’t annoy people”’. Planet Perl [4] has an interesting approach: they claim to filter on Perl related keywords. I initially interpreted this to mean that if you are on their list of blogs and you write a post which seems to refer to Perl then it will appear, but a quick browse of the Planet shows some posts which don’t appear to match any Perl keywords. Gentoo has implemented a reasonable system: they have a Universe [5] configuration which has all blog posts by all Gentoo bloggers as well as a Planet installation which only has Gentoo related posts.

It seems to me that a reasonable purpose for Planet Debian would be to have blog feeds which are occasionally specific to Debian and often relevant to Debian. Personal blog posts would be encouraged (but not required). Posts which are incomprehensible or have nothing to say (EG posts which link to another post for the sole purpose of agreeing or disagreeing) would be strongly discouraged, and bloggers would be encouraged to keep links-posts rare.

Having two installations of the Planet software, one for posts which are specific to Debian (or maybe to Debian or Linux) and one for all posts by people who are involved with Debian, would be the best option. Then people who only want to read the technical posts could do so, but other people could read the full list. Most blog servers support feeds based on tag or category (my blog already provides a feed of Debian-specific posts). If we were going to have a separate Planet installation for only technical posts then I expect that many bloggers would have to create a new tag for such posts; for example my posts related to Debian are in the categories Benchmark, Linux, MTA, Security, Unix-tips, and Xen, and the tag Debian is applied to only a small portion of such posts. But it would be easy to create a new tag for technical posts.
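
For example on a WordPress blog the feed for a single category is available at a predictable URL, so a technical-only Planet could subscribe to something like the following (example.com being a placeholder):

http://example.com/category/debian/feed/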

Ubuntu is also the only organisation I’ve found to specify conditions upon which blogs might be removed from the feed; they say: “We reserve the right to remove any feed that is inaccessible, flooding the page, or otherwise interfering with the operation of the Planet. We also have the right to remove clearly offensive content or content that could trigger legal action.”

That is reasonable, although it would be good to have a definition for “flooding the page” (I suggest “having an average of more than two posts per day appear over the period of a week or having posts reappear due to changing timestamps”). Also the “could trigger legal action” part is a minor concern – product reviews are often really useful content on a Planet…

Some time ago my blog was removed from Planet Fedora for some reason. I was disappointed that the person who made that change didn’t have the courtesy to inform me of the reason for their action and by the fact that there is no apparent way of contacting the person who runs the Planet to ask them about it. Needless to say this did not encourage me to write further posts about Fedora.

If a blog has to be removed from a feed due to technical reasons then the correct thing to do is to inform the blogger of why it’s removed and what needs to be fixed before it can be added again.

If a blog is not meeting the content criteria then I expect that in most cases the blogger could be convinced to write more content that matches the criteria and tag it appropriately. Having criteria for some aspects of blog quality and encouraging the bloggers to meet the criteria can only improve the overall quality.

Currently there is a Planet installation on debian.net being recommended which is based on Planet Debian but with some blogs removed (with no information available publicly or on debian-private as to what the criteria are for removing the blogs in question). It seems to me that if it’s worth using Debian resources to duplicate Planet Debian then it should be done in a way that benefits readers (EG by going to the Planet vs Universe model that Gentoo follows), and if blogs are going to be removed from the feed then there should be criteria for the removal so that anyone who wants their blog to be syndicated can make whatever changes might be necessary.


Planets and Resignations

Recently a Debian Developer resigned from a position of responsibility in the project by writing a blog post. I won’t name the DD or the position he resigned from, as I think that there are general issues which need discussion and specific examples will get in the way (everyone who is seriously involved will know who it is anyway – for those who don’t know, it’s not really exciting).

Also I think that the issue of the scope of a Planet installation is of wider importance than the Debian project, so it would be of benefit for outsiders who stumble upon this to see a discussion of general issues rather than some disagreements within the Debian project.

There has been some mild criticism of the DD in question for announcing his resignation via a blog post. I don’t think that such criticism is appropriate. In the absence of evidence to the contrary I’ll assume that the DD in question announced his resignation to the relevant people (probably the team he worked with and the Debian Project Leader) via private email which was GPG signed (if he indeed intended to formally resign).

The resignation of one DD from one of the many positions of authority and responsibility in the project is not going to have a great effect on the work of most DDs. Therefore I don’t think that it was necessarily a requirement to post to the debian-private mailing list (the main list for communication between all developers regarding issues within the project) about this. It was however an issue that was bound to get discussed on debian-private (given that the circumstances of the resignation might be considered to be controversial), so it seems to me that sending an email of the form “here is a blog post I’ve written about my resignation” would have saved some pointless discussion (allowing us to skip the “why didn’t you send email” and get right on to the main discussion).

A resignation letter from a public position of responsibility is a significant document. Having such documents stored on publicly accessible places is good for the community. Having a record of all such documents that you have written stored on your own server for reference (by yourself and by other people you work with) is a good thing. Therefore it seems to me that a blog is an ideal place for a resignation letter. It used to be regarded that there was a certain formality in such things, and that a letter of resignation was required to be delivered in the most direct way possible (by hand if convenient) to the person who receives it. If such conventions were followed then a blog post would occur after the receipt of the letter of resignation had been confirmed (possibly in this case a confirmation email from the DPL). But in recent times things have become less formal and the free software community is particularly informal. So it seems quite appropriate to me to have the blog post come first and the email notification merely contain the URL.

Now a letter of resignation is expected to contain certain specific details. It should say specifically what duties are being resigned (particularly important when a person performs many tasks), it should have a date from which it will take effect, and it might be appropriate to mention issues related to the hand-over of tasks (whether the person resigning is willing to work with their replacements).

The “resignation” (if we should call it that) in question did not contain any of the specific details that I would expect to see in a formal resignation. This indicates to me that it could be interpreted as not being a formal and official resignation, but instead being a post (possibly written in haste while angry) about a situation which may not end up being an official resignation. Until we get some more information we won’t know for sure either way.

This demonstrates one problem with blogs, people usually have a mixture of serious documents and trivial things on the one blog. It can be difficult to determine how seriously to take blog posts. I’m not sure that there can be a good solution to this.

At the moment some people are suggesting that every DD should read Planet Debian [1]. I disagree with that. If there is an issue which is significant and affects the entire project then it should be announced on one of the mailing lists such as debian-private, debian-announce, or debian-devel-announce (and will be announced on one of them eventually even if not by the person closest to the events). Forcing every DD to read a lot of blog posts is not in the best interests of the project. Now we could create a separate Planet installation for such things; there are already Debian Times [2] and Debian Administration [3] which serve as a proof of concept. If there was a Planet installation for important stuff related to Debian which had its content syndicated in various ways (including an email gateway – Feedburner.com provides a quite useful one) then requesting that everyone read its content in some way (either by web browsing, an RSS feed reader, syndication in another Planet, email, or something else) would not be unreasonable. The volume of posts on such a Planet would be quite small (similar to the current announcement mailing lists), so if received by email it wouldn’t fill anyone’s mailbox, and if people visited the web site they would only need to do so every second month if that suited them.

The issue of what types of posts are suitable for Planet Debian is probably going to get raised again soon as a result of this.


Making Linux DVDs

Anthony Towns writes about using an improved version of jigdo to download CD/DVD images [1]. His improvement is basically to pipeline operations for better performance.

Jigdo (the Jigsaw download) is a tool to download a set of files and then use them to create a CD or DVD image [2]. The idea is that most web sites that have CD or DVD images also have a collection of files which comprise the DVD image available. This removes the need to store the data twice (wasting disk space on mirrors and in some situations breaking web caching).

I have never used jigdo, and for all Debian installations in recent times (the last few years at least) I download a small net-inst CD image and then have it download the other files from the net. I have Squid set up to cache large objects so this doesn’t waste too much of my precious network bandwidth (which is limited and expensive in Australia).
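
For reference, the Squid side of this amounts to raising the maximum size of a cacheable object; a sketch of the relevant squid.conf settings (the sizes are examples, not my exact configuration):

# allow objects up to the size of a CD image to be cached
maximum_object_size 716800 KB
# with a correspondingly large cache directory (20G here)
cache_dir ufs /var/spool/squid 20000 16 256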

Now I’m thinking about what the optimum method for doing installs might be. One thing that would be good would be support for multiple repositories; the packages have unique file names and checksums, so it should be possible to check one repository and then check another if the first isn’t working. I don’t mean multiple “deb” lines in the APT configuration. What I would like to do is to have an NFS file server or web server with an archive of downloaded packages and have APT check there first before downloading a file. So APT could get a list of packages from the net and then get the actual files locally if they are available.

The next thing that would be good is the ability to create a CD or DVD image dynamically and to store all temporary files. So I could download files from the repository and create a DVD image with just the packages that I need. Every time I create a DVD image my sub-set of the Debian archives would increase and the number of files actually downloaded in the creation process would be reduced. The effect would be to incrementally create a local mirror of the Debian repository.

Then I would like to see a two-stage DVD install process. I would like to boot from a CD or DVD and start the install and then have it give a list of files needed (which could be stored on a USB device or floppy) to create further CDs or DVDs for installation. One situation where this could have been beneficial was when I was doing an emergency install of CentOS. I did the first part of the install (selecting packages etc) to determine which CDs were needed. It turned out that almost all CDs were needed even though some of the CDs had only a few files that I was installing. If the installer could have written a list of packages to a USB device then I could have downloaded just those packages and got the install working a lot sooner. It seems to me that it’s fairly common to do one test install and then do some dozens of other installs with the same settings. So the ability to create a DVD of exactly the needed files for the other dozens of installs would be a great benefit.

Now this is just random commentary, as unfortunately I don’t have time to do any coding on this. But it seems obvious that something has to be done to improve the situation for developers and IT staff who need some degree of mirroring of the Debian package pool but who can’t do a full mirror. Back in 1996 I was able to mirror Debian over a 28K8 modem link and fit it on what was a reasonable hard drive by the standards of the day (incredibly tiny by today’s standards). Now I can’t practically mirror Debian over an Australian cable broadband connection, and even by the standards of about 4 years ago (the age of most of my hard disks) the space requirements are significant.

I hope this post helps inspire some interest in developing these features. As delays in joining Debian [3] are the topic of the day, it should be noted that work on preparing DVD images can easily be done by people who are not DDs. Such software should work from Debian archives without requiring any changes to them, and thus nothing special is needed from the Debian project to start work.


Motivation and Perspective

Patrick Winnertz writes about the demotivating effect of unreasonable delays on joining the Debian project [1].

While I agree that things need to be improved in terms of getting people in the project in a timely manner (the suggestion of providing assistants seems good), I don’t think that anyone has a good reason for being demotivated because of this.

I first applied to join Debian in late 1998 or some time in 1999. At the time part of the process of joining was to receive a phone call, and as I was living in a hotel they refused to call me on such a line. I could have easily camped out in the hallway of a hotel (the cheap London hotels often had a pay-phone in the hall and no phones in the rooms) and pretended that it was my own phone with an unlisted number. Unless they refused to allow people with unlisted numbers to join (which seems unlikely), I could have joined then. So it seems that at the time I could only have joined Debian if I was prepared to lie about my ownership of a phone line.

I wasn’t overly bothered by this – there has never been a shortage of free software projects that need contributions of code. By late 2000 the rules had changed and I joined without needing a phone call. In the mean time I had forked the Bonnie storage benchmark program to form my own project Bonnie++ [2], created Postal (a mail server benchmark suite) [3], and worked on many other things as well.

I have sympathy for the people who apply to become Debian Developers and who have to wait a long time, as I’ve been in the same situation myself. But there are plenty of things that you can do in the mean time. Some of the things that you can do are upstream development work, filing bug reports, submitting patches that fix bugs, and writing documentation (all forms including blog posts). Also when projects aren’t yet in Debian it often happens that someone creates unofficial packages, and the person who does this doesn’t need to be a DD. Producing back-ported packages for new versions of programs that are in a stable release can also be done by people who are not DDs. Unofficial and back-ported packages provide less benefit for the project as a whole but considerable benefit for the people who want to use them.

There is a lot of work that can be done to fulfill clause 4 of the Debian Social Contract [4] (Our priorities are our users and free software) which doesn’t require being a Debian developer. It seems to me that if you have the right approach to this and maintain the perspective that Debian is one part of the free software community (and not necessarily the biggest or most significant) then a delay in your application to become a DD won’t be particularly demotivating.