ISP Redundancy and Virtualisation

If you want a reliable network then you need to determine an appropriate level of redundancy. When servers were small and there was no widely accepted virtual machine technology, there were many points at which redundancy could be employed.

A common example is a large mail server. You might have MX servers to receive mail from the Internet, front-end servers to send mail to the Internet, database or LDAP servers (one master for accepting writes and redundant slaves for client reads), and some back-end storage. The back-end storage is generally going to lack redundancy to some degree (all the common options involve mail being stored in one location). So the redundancy would start with the routers which direct traffic to redundant servers (typically a pair of routers in a failover configuration – I would use OpenBSD boxes running CARP if I were given a choice in how to implement this [1]; in the past I’ve used Cisco devices).

The next obvious place for redundancy is the MX servers (most ISPs seem to have machines with names such as mx01.example.net to receive mail from the Internet). The way that MX records are used in the DNS means that there is no need for a router to direct traffic to a pair of servers, and even a pair of redundant routers is another point of failure, so it’s best to avoid them where possible. A smaller ISP might have two MX machines that handle both outbound mail from their users (which needs to go through a load-balancing router) and inbound mail. A larger ISP will have two or more machines dedicated to receiving mail and two or more machines dedicated to sending mail (scanning both sent and received mail for viruses can take a lot of compute power).

The database or LDAP servers used for storing user account data are another possible place for redundancy. While some database and LDAP servers support multi-master operation, a more common configuration is to have a single master and multiple read-only slaves. This means that you want to have more slaves than are strictly required so that you can lose one without impacting the service.

There are several ways of losing a server. The most obvious is a hardware failure. While server class machines will have redundant PSUs, RAID, ECC RAM, and a generally high quality of hardware design and manufacture, they still have hardware problems from time to time. Then there are a variety of software related ways of losing a server, most of which stem from operator error and bugs in software. The problem with operator errors and software bugs is that they can easily take out all redundant machines. If an operator mistakenly decides that a certain command needs to be run on all machines they will often run it on all machines before realising that it causes things to go horribly wrong. A software bug will usually be triggered by the same thing on all machines (e.g. I’ve had bad data written to a master LDAP server crash all the slaves, and I’ve had a mail loop between two big ISPs take out all front-end mail servers).

Now if you have a mail server running on a virtual platform such that the MX servers, the mail store, and the database servers all run on the same hardware then redundancy is very unlikely to alleviate hardware problems. It’s difficult to imagine a situation where a hardware failure takes out one DomU while leaving others running.

It seems to me that if you are running on a single virtual server there is no benefit in having redundancy. However there is benefit in having an infrastructure which supports redundancy. For example, if you are going to install new software on one of the servers there is a possibility that the software will fail. Doing upgrades and then having to roll them back is one of the least pleasant parts of sys-admin work; not only is it difficult but it’s also unreliable (the new software writes different data to shared files and you have to hope that the old version can cope with it).

To implement this you need to have a Dom0 that can direct traffic to multiple redundant servers, even for services which normally only have a single server. Then when you need to upgrade (be it the application or the OS) you can configure a server on the designated secondary address, get it running, and then disable traffic to the primary server. If there are any problems you can direct traffic back to the primary server (which can be done much more quickly than downgrading software). Also, if configured correctly, the secondary server could be accessible from certain IP addresses only, so you could test the new version of the software using employees as test users while customers use the old version.
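As a minimal sketch of the idea, the Dom0 could use a DNAT rule to decide which DomU receives connections for a given service; switching over and reverting is then a one-line change. The addresses and port below are hypothetical:

    # hypothetical addresses: 192.0.2.10 is the public service address,
    # 10.0.0.2 is the primary DomU and 10.0.0.3 is the upgraded secondary
    PRIMARY=10.0.0.2
    SECONDARY=10.0.0.3

    # initially all SMTP traffic for the service address goes to the primary
    iptables -t nat -A PREROUTING -d 192.0.2.10 -p tcp --dport 25 \
        -j DNAT --to-destination $PRIMARY

    # to cut over, replace the rule so that new connections go to the secondary;
    # reverting is just another rule replacement, far quicker than a downgrade
    iptables -t nat -R PREROUTING 1 -d 192.0.2.10 -p tcp --dport 25 \
        -j DNAT --to-destination $SECONDARY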

One advantage of a virtual machine environment for load balancing is that you can have as many virtual Ethernet devices as you desire and you can configure them in software (without changing cables in the server room). A limitation on the use of load-balancing routers is that traffic needs to go through the router in both directions. This is easy for the path from the Internet to the server room and the path from the server room to the customer network, but when going between servers in the server room it’s a problem (not insurmountable, merely painful and expensive). Of course there will be a cost in CPU time for all the extra routing. If instead of having a single virtual Ethernet device for all redundant nodes you have a virtual Ethernet device for every type of server and use the Dom0 as a router, you will end up doubling the CPU requirements for networking without even considering the potential overhead of the load balancing router functionality.

Finally there is a significant benefit in virtual machines for reliability of services: the ability to perform snapshot backups. If you have sufficient disk space and IO capacity you could have a snapshot of your server taken every day and store several old snapshots. Doing this effectively would require some minor changes to the configuration of machines to avoid unnecessary writes, including not compressing old log files and using a RAM disk for /tmp and any other filesystem with transient data. When you have snapshots you can then run filesystem analysis tools on them to detect any silent corruption that may be occurring, with the potential benefit of discovering corruption before it gets severe (but I have yet to see a confirmed report of this saving anyone). Similar snapshot facilities are available on almost every SAN and on many NAS devices, but there are many sites that don’t have the budget for such equipment.
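If the DomU filesystems live on LVM logical volumes in the Dom0 then a nightly snapshot plus a read-only fsck is one way to do this. A rough sketch, assuming a volume group named vg0 and a mail store volume named mailstore:

    # take a snapshot of the DomU's mail store volume (names are hypothetical)
    lvcreate --snapshot --size 10G --name mailstore-snap /dev/vg0/mailstore

    # check the snapshot for silent corruption without touching the live volume
    fsck.ext3 -n /dev/vg0/mailstore-snap

    # remove the snapshot when finished (or keep several days' worth if space allows)
    lvremove -f /dev/vg0/mailstore-snap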

Letter Frequency in Account Names

It’s a common practice when hosting email or web space for large numbers of users to group the accounts by the first letter. This is due to performance problems on some filesystems with large directories, and due to the fact that a 16-bit signed integer is often used for the hard link count, which makes it impossible to have more than 32767 subdirectories.

I’ve just looked at a system I run (the Bluebottle anti-spam email service [1]) which has about half a million accounts and counted the incidence of each first letter. It seems that S is the most common at almost 10%, with M and A not far behind. Most of the clients have English as their first language; naturally the distribution of letters would be different for other languages.

Now if you were to have a server with fewer than 300,000 accounts then you could probably split them based on the first letter. With more than 300,000 accounts you would risk having too many account names starting with S. See the table below for the incidence of each first letter.

The two-letter prefix MA comprised 3.01% of the accounts. So if faced with a limit of 32767 sub-directories and splitting by two letters, you might expect to have no problems until you approached 1,000,000 accounts. There were a number of other common two-letter prefixes which also accounted for more than 1.5% of the total number of accounts.

Next I looked at the three-character prefixes and found that MAR comprised 1.06% of all accounts. This indicates that splitting on the first three characters will only save you from the 32767 limit if you have about 3,000,000 users or fewer.

Finally I observed that the four-character prefix JOHN (which incidentally is my middle name) comprised 0.44% of the user base. That indicates that if you have more than 6,400,000 users then splitting them up among four-character prefixes is not necessarily going to avoid the 32767 limit.
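As a rough check of these thresholds, dividing the 32767 limit by the fraction of accounts that share the busiest prefix gives a naive upper bound on how many accounts fit before that one directory overflows (these bounds are optimistic, since a real system needs headroom for growth and for the distribution shifting):

    awk 'BEGIN { printf "1 letter  (s,    9.85%%): %d\n", 32767 / 0.0985 }'  # ~330,000
    awk 'BEGIN { printf "2 letters (ma,   3.01%%): %d\n", 32767 / 0.0301 }'  # ~1,090,000
    awk 'BEGIN { printf "3 letters (mar,  1.06%%): %d\n", 32767 / 0.0106 }'  # ~3,090,000
    awk 'BEGIN { printf "4 letters (john, 0.44%%): %d\n", 32767 / 0.0044 }'  # ~7,450,000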

It seems to me that the benefit of splitting accounts by the first characters is not nearly as great as you might expect. Having directories for each combination of the first two letters is practical; I’ve seen directory names such as J/O/JOHN or JO/JOHN (or J/O/HN or JO/HN if you want to save directory space). But it becomes inconvenient to have J/O/H/N, and the form JOH/N can have as many as 17,576 subdirectories at the top level for the three-letter prefixes, which may be bad for performance.

This issue is only academic for most sys-admins, who won’t ever touch a system with more than a million users. But in terms of how you would provision so many users: in the past the limits of server hardware were reached long before these issues. For example, in 2003 I was running some mail servers on 2RU rack-mounted systems with four disks in a RAID-5 array (plus one hot-spare) – each server had approximately 200,000 mailboxes. The accounts were split based on the first two letters, but even if they had been split on only one letter it would probably have worked. Since then performance has improved in all aspects of hardware. Instead of a 2RU server having five 3.5″ disks it will have eight 2.5″ disks – and as a rule of thumb increasing the number of disks tends to increase performance. Also the CPU performance of servers has dramatically increased; instead of two single-core 32bit CPUs in a 2RU server you will often have two quad-core 64bit CPUs – more than four times the CPU performance. 4RU machines can have 16 internal disks as well as four CPUs and therefore could probably serve mail for close to 1,000,000 users.

While for reliability it’s not the best idea to have all the data for 1,000,000 users on internal disks in a single server (which could be the topic of an entire series of blog posts), I am noting that it’s conceivable to do so and provide adequate performance. Also, if you use one of the storage devices that supports redundant operation (exporting data over NFS, iSCSI, or Fibre Channel) and things are configured correctly, then you can achieve considerably more performance and therefore have a greater incentive to keep the data for a larger number of users in one filesystem.

Hashing directory names is one possible way of alleviating these problems, although it would be a little inconvenient for sys-admin tasks as you would have to hash the account name to discover where it was stored. But I guess you could have a shell script or alias to do this.
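For example, a small shell helper along these lines could map an account name to its hashed location. The two-level layout under /var/mail and the use of MD5 are assumptions for illustration:

    # hypothetical helper: map an account name to a two-level hashed directory,
    # using the first two hex digits of the MD5 of the name
    mailbox_path() {
        local account="$1"
        local hash
        hash=$(printf '%s' "$account" | md5sum | cut -c1-2)
        printf '/var/mail/%s/%s/%s\n' "${hash:0:1}" "${hash:1:1}" "$account"
    }

    # usage: mailbox_path john
    # prints something like /var/mail/5/2/john (the digits depend on the hash)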

Here is the list of frequency of first letters in account names:

First Letter Percentage
a 7.65
b 5.86
c 5.97
d 5.93
e 2.97
f 2.85
g 3.57
h 3.19
i 2.21
j 6.09
k 3.92
l 3.91
m 8.27
n 3.15
o 1.44
p 4.82
q 0.44
r 5.04
s 9.85
t 5.2
u 0.85
v 1.9
w 2.4
x 0.63
y 0.97
z 0.95

The Cost of Owning a Car

There has been a lot of talk recently about the cost of petrol; Colin Charles is one of the few people to consider the issue of wages in this discussion [1]. Unfortunately almost no-one seems to consider the overall cost of running a vehicle.

While I can’t get the figures for Malaysia (I expect Colin will do that) I can get them for Australia. First I chose a car that’s cheap to buy, reasonably fuel efficient (small) and common (cheap parts from the wreckers) – the Toyota Corolla seemed like a good option.

What is Appropriate Advertising?

Colin Charles writes about a woman who is selling advertising space on herself [1]. Like Colin I haven’t bought a t-shirt in about 9 years (apart from some Cafepress ones I designed myself). So it seems that the price of getting some significant advertising at a computer conference is to buy a few hundred t-shirts (they cost $7 each when buying one at a time from Cafepress; I assume that the price gets below $3 each when buying truck-loads). I have been given boxer-shorts and socks with company logos on them (which I never wore). I think that very few people will show their underwear to enough people to make boxer-shorts a useful advertising mechanism; socks would probably work well in Japan though.

It seems to me that many people regard accepting free t-shirts as an exception to all the usual conventions regarding advertising. Accepting gifts from companies that you do business with is generally regarded as a bad idea, except of course for t-shirts and other apparel, which are apparently fine. Being paid to wear a placard advertising a product is regarded as degrading by many people, but accepting a free t-shirt (effectively being paid $7 to wear advertising) is regarded as OK by almost everyone.

I don’t mind being a walking advert for a company such as Google. I use many Google products a lot and I can be described as a satisfied customer. There are some companies that have given me shirts which I only wear in winter under a jumper. The Oracle Unbreakable Linux [2] shirt is one that I wear in winter.

Now I would not consider accepting an offer to have advertising on my butt (although I’m pretty sure that it doesn’t get enough attention that anyone would make such an offer). I would however be happy to talk with someone who wants to pay me to wear a t-shirt with advertising when giving a lecture at a conference. I am not aware of any conference which has any real dress requirement for speakers (apart from the basic idea of not offending the audience). The standard practice is that if your employer pays you to give a lecture as part of their marketing operation then they give you a shirt to wear (polo more often than t-shirt). I am currently working on some things which could end up as papers for presentation at Linux conferences. If someone wanted to sponsor my work on one of those free software related projects and then get the recognition of having me wear their shirt while giving a lecture and have me listed as being sponsored by that company in the conference proceedings then that seems like a reasonable deal for everyone.

One thing that you need to keep in mind when accepting or soliciting for advertising is the effect it has on your reputation. Being known as someone who wants advertising on their butt probably wouldn’t be fun for very long.

On the Internet advertising seems to be almost everywhere. It seems that more than half the content on the net (by the number of pages or by the number of hits) either has an option to donate (as Wikipedia does and some blogs are starting to do), has Google advertising (or a similar type of adverts from another company), is a sales site (i.e. you can buy online), or is a marketing site (i.e. it provides background information and PR to make you want to buy at some other time). Note that my definition of advertising is quite broad; for example the NSA web site [3] has a lot of content that I regard as advertising/marketing – with the apparent aim of encouraging skilled people to apply for jobs. Not that I’m complaining; I’ve visited the National Cryptologic Museum [4] several times and learned many interesting things!

I think that Internet advertising that doesn’t intrude on the content (i.e. no pop-ups, page diversions, or overly large adverts) is fine. If the advertising money either entirely pays people to produce useful content or simply encourages them to do so (as in the case of all the blogs which earn $10 a month) then I’m happy with that. I have previously written about some of my experience with advertising on my blog [5] and how I encourage others to do the same.

I don’t think that space on a t-shirt is any more or less appropriate for advertising than space on a web site hosting someone’s blog.

Finally there is one thing I disagree with in Colin’s post, which is the use of the word “whore”. It’s not uncommon to hear the term “whoring” used as slang for doing unreasonable or unworthy things to make money (where “unreasonable” and “unworthy” often merely mean doing something that the speaker wouldn’t be prepared to do). But using the term when talking about a woman is quite likely to cause offense and is quite unlikely to do any good. The Wikipedia page about prostitution [6] has some interesting background information.

I’m Skeptical about Robotic Nanotech

There has been a lot of fear-mongering about nanotech. The idea is that little robots will eat people (or maybe eat things that we depend on such as essential food crops). It’s unfortunate that fear-mongering has replaced thought and there seems to have been little serious discussion about the issues.

If (as some people believe) nanotech has the potential to be more destructive than nuclear weapons then it’s an issue that needs to be discussed in debates before elections and government actions to alleviate the threat need to be reported on the news – as suggested in the Accelerating Future blog [0].

I predict that there will be three things which could be called nanotech in the future:

  1. Artificial life forms as described by Craig Venter in his talk for ted.com [1]. I believe that these should be considered along with nanotech because the boundary between creatures and machines can get fuzzy when you talk about self-replicating things devised by humans which are based on biological processes.
    I believe that artificial life forms and tweaked versions of current life forms have significant potential for harm. The BBC has an interesting article on the health risks of GM food which suggests that such foods should be given the same level of testing as pharmaceuticals [2]. But that’s only the tip of the iceberg; the potential use of Terminator Gene technology [3] in biological warfare seems obvious.
    But generally this form of nanotech has the same potential as bio-warfare (which currently has significantly under-performed when compared to other WMDs) and needs to be handled in the same way.
  2. The more commonly discussed robotic nanotech: self-replicating robots which can run around to do things (e.g. work inside a human body). I doubt that tiny robots can ever be as effective at adapting to their environment as animals, and I also doubt that they can self-replicate in the wild. Currently we create CPU cores (the most intricate devices created by humans) from very pure materials in “clean rooms”. Making tiny machines in clean-rooms is not easy; making them in dirty environments is going to be almost impossible. Robots as we know them are built for environments that are artificially clean, not natural environments. Robots that can self-replicate in a clean-room when provided with pure supplies of the necessary raw materials are a solvable problem. I predict that self-replication in the wild will remain science-fiction.
  3. Tiny robots manufactured in factories to work as parts of larger machines. This is something that we are getting to today. It’s not going to cause any harm as long as the nano-bots can’t be manufactured on their own and can’t survive in the wild.

In summary, I think that the main area that we should be concerned about in regard to nano-bot technology is as a new development on the biological warfare theme. This seems to be a serious threat which deserves the attention of major governments.

Wyndham Resorts is a Persistent Spammer

Over the last week I have received five phone calls from Wyndham Resorts asking if I would like to be surveyed. Every time I told them that I was not going to do their survey; on all but one call I had to state repeatedly, for more than two minutes, that I would not do the survey before they would go away.

The advantage of phone spam over email spam is that the caller pays; I guess that they have a time limit of three minutes when calling a mobile phone to save on calling costs.

There have been a number of proposals for making people pay for email to discourage spam. Even a cost of a few cents a message would make spam cease being economically viable for a mass audience (a smaller number of targeted spams would be easier to block or delete). But such plans to entirely change the way email works have of course failed totally.

But for phones it could work. I’d like to have a model where anyone who calls me has to pay an extra dollar per minute which gets added on to their phone bill. When people who I want to talk to call me I could reimburse them (or maybe be able to give my phone company a list of numbers not to bill).

This could also seamlessly transition to commercial use. I would be happy to accept calls from people asking for advice about Linux and networking issues for $1 per minute. With all the people who call me about such things for free already it would be good to answer some questions for money.

ECC RAM is more useful than RAID

A common myth in the computer industry seems to be that ECC (Error Correcting Code – a Hamming Code [0]) RAM is only a server feature.

The difference between a server and a desktop machine (in terms of utility) is that a server performs tasks for many people while a desktop machine only performs tasks for one person. Therefore when purchasing a desktop machine you can decide how much you are willing to spend for the safety and continuity of your work. For a server it’s more difficult as everyone has a different idea of how reliable a server should be in terms of uptime and in terms of data security. When running a server for a business there is the additional issue of customer confidence. If a server goes down occasionally customers start wondering what else might be wrong and considering whether they should trust their credit card details to the online ordering system.

So it is apparent that servers need a different degree of reliability – and it’s easy to justify spending the money.

Desktop machines also need reliability, more so than most people expect. In a business when a desktop machine crashes it wastes employee time. If a crash wastes an hour (which is not unlikely given that previously saved work may need to be re-checked) then it can easily cost the business $100 (the value of the other work that the employee might have done). Two such crashes per week could cost the business as much as $8000 per year. The price difference between a typical desktop machine and a low-end workstation (or deskside server) is considerably less than that (when I investigated the prices almost a year ago desktop machines with server features ranged in price from $800 to $2400 [1]).

Some machines in a home environment need significant reliability. For example, when students are completing high-school their assignments have a lot of time invested in them. Losing an assignment due to a computer problem shortly before it’s due could impact their ability to get a place in the university course that they most desire! Then there is also data which is irreplaceable; one example I heard of was a woman whose computer had a factory pre-load of Windows, and during a storm the machine rebooted and reinstalled itself to the factory defaults – wiping several years of baby photos. In both cases better backups would mostly solve the problem.

For business use the common scenario is to have file servers storing all data and have very little data stored on the PC (ideally have no data on the PC). In this case a disk error would not lose any data (unless the swap space was corrupted and something important was paged out when the disk failed). For home use the backup requirements are quite small. If a student is working on an important assignment then they can back it up to removable media whenever they reach a milestone. Probably the best protection against disk errors destroying assignments would be a bulk purchase of USB flash storage sticks.

Disk errors are usually easy to detect. Most errors are in the form of data which can not be read back; when that happens the OS will give an error message to the user explaining what happened. Then if you have good backups you revert to them and hope that you didn’t lose too much work in the meantime (you also hope that your backups are actually readable – but that’s another issue). The less common errors are lost writes – where the OS writes data to disk but the disk doesn’t store it. This is a little more difficult to discover as the drive will return bad data (maybe an old version of the file data or maybe data from a different file) and claim it to be good.

The general idea nowadays is that a filesystem should check the consistency of the data it returns. Two new filesystems, ZFS from Sun [2] and BTRFS from Oracle [3], implement checksums of the data stored on disk. ZFS is apparently production ready while BTRFS is apparently not nearly ready. I expect that from now on whenever anyone designs a filesystem for anything but the smallest machines (e.g. PDAs and phones) they will include data integrity mechanisms in the design.
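On ZFS, for example, those checksums can be verified across the whole pool on demand; a brief sketch assuming a pool named tank:

    # walk every block in the pool and verify its checksum in the background
    zpool scrub tank

    # report any checksum errors found (and repaired, where redundancy exists)
    zpool status tank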

I believe that once such features become commonly used the need for RAID on low-end systems will dramatically decrease. A combination of good backups and knowing when your live data is corrupted will often be a good substitute for preserving the integrity of the live data. Not that RAID will necessarily protect your data – with most RAID configurations if a hard disk returns bad data and claims it to be good (the case of lost writes) then the system will not read data from other disks for checksum validation and the bad data will be accepted.

It’s easy to compute checksums of important files and verify them later. One simple way of doing so is to compress the files; every file compression program that I’ve seen has some degree of error detection.
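A minimal sketch of both approaches, assuming the important files live under /data:

    # record checksums once, then verify them at a later date
    find /data -type f -exec sha256sum {} + > /root/data.sha256
    sha256sum --check --quiet /root/data.sha256

    # or rely on the CRC stored in a compressed archive
    gzip -9 assignment.tar
    gzip --test assignment.tar.gz   # reports an error if the archive is corrupt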

Now the real problem with RAM which lacks ECC is that it can lose data without the user knowing. There is no possibility of software checks because any software which checks for data integrity could itself be misled by memory errors. I once had a machine which experienced occasional filesystem corruption; eventually I discovered that it had a memory error (memtest86+ reported a problem). I will never know whether some data was corrupted on disk because of this. Sifting through a large amount of stored data for some files which may have been corrupted due to memory errors is almost impossible, especially when there was a period of weeks of unreliable operation of the machine in question.

Checking the integrity of file data by using the verify option of a file compression utility, fsck on a filesystem that stores checksums on data, or any of the other methods is not difficult.

I have a lot of important data on machines that don’t have ECC. One reason is that machines which have ECC cost more and have other trade-offs (more expensive parts, more noise, more electricity use, and the small supply makes it difficult to get good deals). Another is that there appear to be no laptops which support ECC (I use a laptop for most of my work). On the other hand RAID is very cheap and simple to implement: just buy a second hard disk and install software RAID – I think that all modern OSs support RAID as a standard installation option. So in spite of the fact that RAID does less good than a combination of ECC RAM and good backups (which are necessary even if you have RAID), it’s going to remain more popular in high-end desktop systems for a long time.
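On Linux, for example, a software RAID-1 can be created with a couple of commands; a sketch assuming the mirror is built from the second partitions of two disks (device names are hypothetical):

    # create a two-disk RAID-1 array
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

    # put a filesystem on it and check the state of the array
    mkfs.ext3 /dev/md0
    cat /proc/mdstat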

The next development that seems interesting is the large portion of the PC market which is designed not to have space for more than one hard disk. Such compact machines (known as Small Form Factor or SFF) could easily be designed to support ECC RAM. Hopefully the PC companies will add a reliability feature in one area (ECC RAM) as they remove one in another (the space for a second disk and RAID).

Perpetual Motion

It seems that many blog posts related to fuel use (such as my post from yesterday about record oil prices [1]) are getting adverts about perpetual motion [2]. Note that the common usage of the term “Perpetual Motion” does not actually require something to move. A battery that gives out electricity forever would be regarded as fitting the description, as would any power source which doesn’t have an external source of energy.

The most common examples of this are claims about Oxyhydrogen [3], which is a mixture of hydrogen and oxygen in a 2:1 ratio. The Wikipedia page is interesting; apparently oxyhydrogen is used for welding metals, glass, and plastics, and it was also used to heat lime to provide theatrical lighting (“lime light”). So a mixture of hydrogen and oxygen does have real-world uses.

The fraud comes in with the claims about magnecules [4]. Magnecules are supposedly the explanation for the “atomic” power of HHO gas (AKA Oxyhydrogen), a claim which is repeated on many web sites. In brief, one mad so-called scientist (of course if he were a real scientist he would have experimental evidence to support his claims and such experiments would be repeatable) has invented entirely new areas of science, one of which involves magnetic bonds between atoms. He claims that such chemical species can be used to obtain free energy. The idea is that you start with water, end with water plus energy, and then reuse the water in a closed system. Strangely the web sites promoting water fuelled cars don’t seem to mention magnecules and just leave the “atomic energy” claim with no support – maybe magnecules are simply too crazy for them.

The water fuelled car Wikipedia page is interesting – it lists five different ways that water can actually be used in a car engine (which are based on sound scientific principles and which have been tested) and compares them to the various water fuelled car frauds [5].

I’m not accepting any more comments on my blog about perpetual motion solutions to the petrol crisis (they just take up valuable space and distract people who want to discuss science). I’ll allow some comments about such things on this post though.

Record Oil Prices

MarketWatch reports that oil prices had the biggest daily gain on record, going up $11 in one day.

They claim that this is due to an impending Israeli attack on Iran and a weak US economy. $150 per barrel is the price that they predict for the 4th of July. That’s an interesting choice of date, I wonder whether they will be talking about “independence from Arabian oil”…

The New York Times has an interesting article on fuel prices [1]. Apparently sales of SUVs are dropping significantly.

The US Senate is now debating a cap on carbon-dioxide production. The NY Times article suggests that the new “carbon taxes” could be combined with tax cuts in other areas. If implemented correctly it would allow people who want to save money to reduce their overall tax payments by reducing fuel use. Also, as increasing prices will decrease demand (thus decreasing the price at import time), it would to some degree mean transferring some revenue from the governments of the Middle East to the US government.

The article also states that the Ford F series of “pickup trucks” was the most popular line of vehicles in the US for more than 20 years! But last month they were beaten by the Toyota Corolla and Camry and the Honda Civic and Accord. With the hybrid Camry apparently already on sale in the US (their web site refuses to provide any information to me because I don’t have Flash installed, so I can’t check) and rumored to be released soon in other countries, Ford needs to put some significant effort into developing fuel efficient medium to large cars.

According to a story in the Herald Sun (published on the 23rd of April), survey results show that 1/3 of Victorians would cease using their cars to get to work if the petrol price reached $1.75/L [2]. Now the Herald Sun has run a prediction (by the assistant treasurer and the NRMA) that $1.75/L will be reached next week (an increase of just over 10 cents a litre) [3].

The good news is that there will be less pollution in Australia in the near future (even if $1.75 is not reached I am certain that the price will increase enough to encourage some people to use public transport). The bad news is that our public transport is inadequate at the moment and there will be significant levels of overcrowding.

SE Linux Support in GPG

In May 2002 I had an idea for securing access to GNUPG [1]. What I did was to write SE Linux policy to only permit the gpg program to access the secret key (and other files in ~/.gnupg). This meant that the most trivial ways of stealing the secret key would be prevented. However an attacker could still use gpg to encrypt its secret key and write the data to some place that is accessible, for example with the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg”. So what we needed was for gpg to either refuse to encrypt such files, or to spawn a child process for accessing such files (which could be granted different access to the filesystem). I filed Debian bug report 146345 [2] requesting this feature.

In March upstream added this feature. The Debian package is currently not built with --enable-selinux-support so the feature isn’t enabled yet, but hopefully it will be soon. Incidentally, the feature as currently implemented is not really SE Linux specific; it seems to me that there are many potential situations where it could be useful without SE Linux. For example, if you were using one of the path-name based MAC systems (which I dislike – see what my friend Joshua Brindle wrote about them for an explanation [3]) then you could gain some benefit from this. A situation with even smaller potential for benefit is an automated system which runs gpg and which could allow an attacker to pass bogus commands to it. When exploiting a shell script it might be easier to specify the wrong file to encrypt than to perform more sophisticated attacks.
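Until the packaged build enables it, one way to try the feature is a local rebuild of the Debian package. A rough sketch (where exactly the configure flag goes in debian/rules differs between package versions):

    # fetch and rebuild the Debian gnupg package locally
    apt-get source gnupg
    cd gnupg-*

    # add --enable-selinux-support to the configure options in debian/rules
    # (hypothetical step: the layout of the rules file varies between versions)

    dpkg-buildpackage -us -uc -rfakeroot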

When the feature in question is enabled the command “gpg -c --output /tmp/foo.gpg ~/.gnupg/secring.gpg” will abort with the following error:
gpg: can’t open `/root/.gnupg/secring.gpg’: Operation not permitted
gpg: symmetric encryption of `/root/.gnupg/secring.gpg’ failed: file open error

Of course the command “gpg --export-secret-keys” will abort with the following error:
gpg: exporting secret keys not allowed
gpg: WARNING: nothing exported

Now we need to determine the correct way of exporting secret keys and modifying the GPG configuration. It might be best to allow exporting the secret keys when not running SE Linux (or other supported MAC systems), or when running in permissive mode (as in those situations merely copying the files will work), although we could have an option in gpg.conf for this for the case where we want to prevent shell-script quoting hacks.

For editing the gpg.conf file and exporting the secret keys we could have a program similar in concept to crontab(1) which has PAM support to determine when it should perform its actions. Also it seems to me that crontab(1) could do with PAM support (I’ve filed Debian bug report 484743 [4] requesting this).

Finally one thing that should be noted is that the targeted policy for SE Linux does not restrict GPG (which runs in the unconfined_t domain). Thus most people who use SE Linux at the moment aren’t getting any benefits from such things. This will change eventually.