Could we have an Open Computing Cloud?

One of the most interesting new technologies to appear recently is Cloud Computing, of which the most popular instance seems to be Amazon EC2 (Elastic Compute Cloud). I think it would be good if there were some open alternatives to EC2.

Amazon charges $0.10 per compute hour for a virtual machine that has one Compute Unit (equivalent to a 1.0 to 1.2GHz 2007 Opteron core) and 1.7G of RAM. Competing with this will be hard, as it’s difficult to undercut 10 cents an hour ($876.60 per annum) by enough to compensate for the great bandwidth that Amazon has on offer.

The first alternative that seems obvious is a cooperative model. In the past I’ve run servers for the use of friends in the free software community. It would be easy for me to do such things in future, and Xen makes this a lot easier than it used to be. If anyone wants a DomU for testing something related to Debian SE Linux then I can set one up in a small amount of time. If there was free software to manage such things then it would be practical to have some sort of share system for community members.

The next possibility is a commercial model. If I could get Xen to provide exactly one Amazon Compute Unit to a DomU (no less and no more) then I wouldn’t notice it on some of my Xen servers. 1.7G of RAM is a moderate amount, but as 3G seems to be typical for new desktop systems (Intel is still making chipsets that support a maximum of 4G of address space [2], and when you subtract the address space for video and PCI you effectively only get 3G) it would not be inconceivable to run 1.7G DomUs on idle desktop machines. But it’s probably more practical to have a model with less RAM. For my own use I run a number of DomUs with 256M of RAM for testing and development, and the largest server DomU I run is 400M (that one runs ClamAV, SpamAssassin, and WordPress). While providing 1.7G of RAM and 1CU for less than 10 cents an hour may be difficult, providing an option of 256M of RAM and 0.2CU (burstable to 0.5CU) for 2 cents an hour would give the same aggregate revenue for the hardware while also offering a cheaper service for people who want that. 2 cents an hour is more than the cost of some of the Xen server plans that ISPs offer [3], but if you only need a server for part of the time then it has the potential to save some money.
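
As a rough sketch of how such a fractional CPU allocation could be enforced, Xen’s credit scheduler can cap a DomU at a percentage of one core (the domain name and numbers below are illustrative, not a tested configuration):

# hedged sketch: give the hypothetical DomU "cheap-vm" a low weight so it
# normally gets roughly a fifth of a core under contention, with a hard
# cap of 50% of one core to allow bursting
xm sched-credit -d cheap-vm -w 64 -c 50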

For storage, Amazon has some serious bandwidth inside its own network for transferring the image to the machine for booting. To do things on the cheap, the way to go would be to create a binary diff of a common image. If everyone who ran virtual servers had images of the common configurations of the popular distributions then creating an image to boot would only require sending a diff (maybe something based on XDelta [4]). Transferring 1GB of filesystem image over most network links is going to be unreasonably time consuming; transferring a binary diff between an up to date CentOS or Debian install and a usable system image based on CentOS or Debian with all updates applied is going to be much faster.
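
As a sketch of how that could work with xdelta3 (the file names here are hypothetical):

# sender: encode the difference between the shared base image and the
# customised image
xdelta3 -e -s debian-base.img custom.img custom.vcdiff
# receiver: reconstruct the customised image from the locally held base
# image plus the small diff
xdelta3 -d -s debian-base.img custom.vcdiff custom.img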

Of course something like this would not be suitable for anything that requires security. But there are many uses for servers that don’t require much security.

Links September 2008

RAM is the new Disk [1] – interesting post about using a distributed network of servers with RAM for main storage. The concept is that RAM on another machine can be accessed faster than local disk and that disk performance for contiguous IO has been increasing steadily at a greater rate than random seek performance. So disks can be used in a similar manner to tape drives (backing up data) and RAM can be used as main storage (2RU machines with 1TB of RAM are predicted soon). It is interesting to contrast this with the Rubik’s Cube solver which uses disk as RAM [2].

Interesting Sun Blog post about how ZFS uses flash based SSD (Solid State Disk) for read caching [3].

Yahoo has released a Firefox plugin named YSlow to tell you why your web site is slow [4]. This complements the Yahoo Best Practices for Speeding Up Your Web Site document [5].

Amsterdam now has cargo trams [6]. This reduces pollution and traffic congestion while allowing efficient transport of cargo. The cargo trams tend to follow passenger trams so they don’t provide any additional obstacle to traffic.

Interesting article by Cory Doctorow about how the standard formats limit creative content [7]. He makes the unfortunate mistake of claiming that “no one tries to make sitcoms about stories that take five minutes to tell” and “no one tries to make feature films about subjects that take 30 seconds to elucidate” – I think that the majority of sitcom episodes can be described in considerably less than 5 minutes and a significant number of movies (e.g. every one that stars Jean-Claude Van Damme [8]) can have their plot described in less than 30 seconds.

Never Trust a DRM Vendor

I was reading an interesting post about predicting the results of the invasion of Iraq [1]. One of the points made was that the author rejected every statement by a known liar (which includes all the world leaders who wanted the invasion). So basically regarding every statement by a known liar as potentially a lie led to a better than average prediction.

It seems to me that a similar principle can be applied to other areas. A vendor or advocate of DRM (Digital Restrictions Management) [2] has already proven that they are not concerned about the best interests of the people who own the computers in question (who paid for the hardware and software) – they are in fact interested in preventing the owner of the computer from using it in ways that the owner desires. This focus on security in opposition to the best interests of the user cannot result in a piece of software that protects the user’s best interests.

I believe that any software with DRM features (other than those which are requested by people who purchase the software) should be considered to be malware. The best example of this is the Sony BMG copy protection scandal [3].

Note that I refer to the interests of the people who own the computers – not those who use them. It is quite acceptable to write software that serves the buyer not the end-user. Two examples are companies that want to prevent their employees doing “inappropriate” things and parents who want to restrict what their children can do. We can debate about what restrictions on the actions of employees and children are appropriate, but it seems clear that when software is purchased for their use the purchaser has the right to restrict their access.

Also note that when I refer to someone owning a computer I refer to an honest relationship (such as parent/child or employer/worker) not the abusive relationships that mobile phone companies typically have with their customers (give the customer a “free” phone but lock it down and claim that they don’t own it). Mobile phones ARE computers, just rather limited ones at this time.

My Prediction for the iPhone

I have previously written about how I refused an offer of a free iPhone [1] (largely due to its closed architecture). The first Google Android phone has just been announced, and the TechCrunch review is interesting – while the built-in keyboard is a nice feature the main thing that stands out is the open platform [2]. TechCrunch says “From now on, phones need to be nearly as capable as computers. All others need not apply”.

What I want is a phone that I control, and although most people don’t understand the issues enough to say the same, I think that they will agree in practice.

In the 80’s the Macintosh offered significant benefits over PCs, but utterly lost in the marketplace because it was closed (less available software and less freedom). The Motorola 68000 CPU family, which depended heavily on Macs and similar machines, also died out; and while the PPC CPU family (the next CPU used by Apple) is still used in games consoles and some other niche markets, it too has an uncertain future. The IBM PC architecture evolved along with its CPU from a 16bit system to a 64bit system and took over the market because it does what users want it to do.

I predict that the iPhone will be just as successful as the Macintosh OS and for the same reasons. The Macintosh OS still has a good share of some markets (it has traditionally been well accepted for graphic design and has always provided good hardware and software support for such use), and is by far the most successful closed computer system, but it has a small part of the market.

I predict that the iPhone will maintain only a small share of the market. There will be some very low-end phones that have the extremely closed design that currently dominates the market, and the bulk of the market will end up going with Android or some other open phone platform that allows users to choose how their phone works. One issue that I think will drive user demand for control over their own phones is the safety issues related to child use of phones (I’ve written about this previously [3]). Currently phone companies don’t care about such things – the safety of customers does not affect their profits. But programmable phones allow the potential for improvements to be made without involving the phone company – while with the iPhone you have Apple as the roadblock.

Now having a small share of the mobile phone market could be very profitable, just as the small share of the personal computer market is quite profitable for Apple. But it does mean that I can generally ignore them as they aren’t very relevant in the industry.

OpenID Delegation

I’ve just installed Eran Sandler’s OpenID Delegation Plugin [1]. This means that I can now use my blog URL for OpenID authentication. I’ve also included the plugin in my WordPress repository (which among other things has the latest version of WordPress). One thing that I consider to be a bug in Eran’s plugin is the fact that it only adds the OpenID links to the main URL. This means, for example, that if I write a blog comment and want to refer to one of my own blog posts on the same topic (which is reasonably common – after more than two years of blogging and almost 700 posts I’ve probably written a post that is related to every topic I might want to comment on) then I can’t put that post in the URL field. The problem here is that URLs in the body of a blog comment generally increase the spam-score (I use this term loosely to refer to a variety of anti-spam measures – I am not aware of anything like SpamAssassin being used on blog comments), and not having OpenID registration does the same. So it seems that with the current functionality of Eran’s plugin I will potentially suffer in some way any time I want to enter a blog comment that refers to a particular post I wrote.
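
For reference, OpenID delegation works by adding links like the following to a page’s HTML head (the URLs here are placeholders, not my actual configuration):

<!-- the plugin only emits these links on the main URL, hence the bug -->
<link rel="openid.server" href="https://openid.example.com/server" />
<link rel="openid.delegate" href="https://blog.example.com/" />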

deb http://www.coker.com.au etch wordpress

My WordPress Debian repository is currently available via the above APT line. While it specifies etch it works with Lenny too (my blog currently runs on Lenny). I will eventually change it to use lenny in the name.

For the OpenID server I am currently using the OpenID service provided by Yubico as part of the support for their Yubikey authentication token [2] (of which I will write more at a later date). I think that running their own OpenID server was a great idea; it doesn’t cost much to run such a service and it gives customers an immediate way of using their key. I expect that there are more than a few people who would be prepared to buy a Yubikey for the sole purpose of OpenID authentication and signing in to a blog server (which can also be done via OpenID if you want). I plan to use my Yubikey for logging in to my blog, but I still have to figure out the best way of doing it.

One thing that has been discussed periodically over the years is the topic of using smart-cards (or similar devices) for accessing Debian servers and for securing access to GPG keys used for Debian work by developers who are travelling. Based on recent events I would hazard a guess that such discussions are happening within the Fedora project and within Red Hat right now (if I worked for Red Hat I would be advocating such things). It seems that when such an idea is adopted, a logical extension is to support services that users want (such as OpenID) at the same time; if nothing else it will make people more inclined to use such devices.

Disclaimer: Yubico gave me a free Yubikey for the purpose of review.

Update: The OpenIDEnabled.com tool to test OpenID is useful when implementing such things [3].

Now Using OpenID

When this post goes live I will have had OpenID running on my blog for 24 hours.

My first attempt to do so was not successful. The theme I use does not support the option of displaying that the website URL is checked for OpenID (a feature of the WordPress OpenID Plugin [1]). Someone whose comments I desire complained that the unexpected prompt for an OpenID password when they entered a comment caused them to abort the submission process and I therefore lost their comment – so I immediately disabled the plugin. I think that person was being a little unreasonable; it seems to me that when you add OpenID to your web site you should expect it to be checked for such things! But in spite of that I felt obliged to do what was necessary to avoid confusion.

Yesterday I re-enabled OpenID. My first step was to hack my theme to have a separate entry field for the OpenID URL, which appears to work (this is documented in the OpenID plugin). The next step was to enable the URL to be used for OpenID and hack the theme to make it note how it’s being used. This appears to work well and should avoid the objections.

One factor that gave me an incentive to work on this is this post about Taking a Stand to Promote OpenID [2]. That’s the type of person who I would like to have commenting on my blog.

I’m also working on my own OpenID authentication solution and I may consider taking the same stand.

Things you can do for your LUG

A Linux Users Group, like most volunteer organisations, will often have a small portion of the membership making most of the contributions. I believe that every LUG has many people who would like to contribute but don’t know how, so here are some suggestions for what you can do.

Firstly, offer talks. Many people seem to believe that giving a talk for a LUG requires expert knowledge. While it is desirable to have experts in an area share their knowledge, it is definitely not a requirement that you be an expert to give a talk. The only requirement is that you know more than the audience – and a small amount of research can achieve that goal.

One popular talk that is often given is “what’s new in Linux”. This is not a talk that requires deep knowledge, but it does require spending some time reading the news (which lots of people do for fun anyway). So if you spend an average of 30 minutes every week day reading about new developments in Linux and other new technology, you could spend another minute a day (20 minutes a month) making notes, and the result would be a 10 to 15 minute talk that would be well received. A talk about what’s new is one way that a novice can give a presentation that will get the attention of all the experts (who know their own area well but often don’t have time to see the big picture).

There are many aspects of Linux that are subtle, tricky, and widely misunderstood. Often mastering them is more a matter of spending time testing than anything else. An example of this is the chmod command (and all the Unix permissions that are associated with it). I believe that the majority of Linux users don’t understand all the subtleties of Unix permissions (I have even seen an employee of a Linux vendor make an error in this regard while running a formal training session). A newbie who spent a few hours trying the various combinations of chmod etc. and spoke about the results could give a talk that would teach something to almost everyone in the audience. I believe that there are many other potential talk topics of this nature.
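
As a sketch of the kind of subtleties such a talk could demonstrate (these commands are safe to run in a scratch directory):

# capital X adds execute permission only to directories and to files that
# already have an execute bit set – a commonly misunderstood distinction
mkdir -p demo/dir; touch demo/file
chmod -R a+X demo
# setgid on a directory: new files created inside inherit its group
chmod g+s demo/dir
ls -ld demo/file demo/dir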

One thing that is often overlooked when considering how to contribute to LUGs is the possibility of sharing hardware. We all have all the software we need for free, but hardware still costs money. If you have some hardware that hasn’t been used for a year then consider whether you will ever use it again; if it’s not likely to be used then offer it to your LUG (either via a mailing list or by just bringing it to a meeting). Also if you see some hardware that is about to be discarded and you think that someone in your LUG will like it then grab it! In a typical year I give away a couple of car-loads of second-hand hardware, most of which was about to be thrown out by a client before I grabbed it for my local LUG. Taking such hardware reduces disposal costs for my clients, prevents computer gear from poisoning landfill (you’re not supposed to put it in the garbage but most people do), and helps random people who need hardware.

One common use for the hardware I give away is for children. Most people are hesitant to buy hardware specifically for children as it only takes one incident of playing with the switch labeled 240V/110V (or something of a similar nature) to destroy it. Free hardware allows children to get more access to computers at an early age.

Finally, one way to contribute is by joining the committee. Many people find it hard to attend both a regular meeting and a committee meeting every month. So if you have no problems in attending meetings then please consider contributing in this way.

An Update on DKIM Signing and SE Linux Policy

In my previous post about DKIM [1] I forgot to mention one critical item: how to get Postfix to actually talk to the DKIM milter. That omission turned out not to matter much, because I hadn’t got it right anyway.

I had configured the DKIM milter on the same line as the milters for ClamAV and SpamAssassin – in the smtpd_milters section. This was fine for relaying outbound mail via my server but didn’t work for locally generated mail. For locally generated mail Postfix has a directive named non_smtpd_milters which you need to use. So it seems that a fully functional Postfix DKIM milter configuration requires adding the following two lines to /etc/postfix/main.cf:

smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock
non_smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock

This also required an update to the SE Linux policy. When I was working on setting up DKIM I also wrote SE Linux policy to allow it and also wrote policy for the ClamAV milter. That policy is now in Debian/Unstable and has been approved for Lenny. So I now need to build a new policy package that allows the non_smtpd_milter access to the DKIM milter and apply for it to be included in Lenny.

SE Linux in Lenny is going to be really good. I think that the SE Linux support in the pre-release (*) of Lenny is already significantly better than Etch with all my extra updates applied. More testers would be appreciated, and more people joining the coding would be appreciated even more.

(*) I use the term pre-release to refer to the fact that the Lenny repository is available for anyone to download packages.

Play Machine Update

My Play Machine [1] was offline for most of the past 48 hours (it’s up again now). I have upgraded the hardware for the Dom0 used to run it so that it now has the ability to run more DomUs. I can now run at least 5 DomUs while previously I could only run 3. I have several plans that involve running multiple Play Machines with different configurations and running SE Linux training.

The upgrade didn’t need to take two days, but I had some other things that diverted me during the middle of the job (running the Play Machine isn’t my highest priority). I’ve been doing some significant updates to the SE Linux policy for Lenny including some important changes to the policy related to mail servers. Among other things I created a new domain for DKIM (which I previously wrote about) [2]. The chain of dependencies was that a client wanted me to do some urgent DKIM work and I needed my own mail server to be a test-bed. I installed DKIM and then of course I had to write the SE Linux policy. Now that my client’s network is running the way it should be I’ve got a little more time for other SE Linux work.

Installing DKIM and Postfix in Debian

I have just installed DomainKeys Identified Mail (DKIM) [1] on my mail server. In summary, the purpose is to allow public-key signing of all mail that goes out from your domain so that the recipient can verify its authenticity (and optionally reject forgeries). It also means that you can verify inbound mail. A weakness of DKIM is that it is based on the DNS system (which has many issues and will continue to have them until DNSSEC becomes widely adopted). But it’s better than nothing and it’s not overly difficult to install.

The first thing to do before installing DKIM is to get a Gmail account. Gmail gives free accounts and does DKIM checks. If you use Iceweasel or another well supported browser then you can click on “Show Details” from the message view which then gives fields “mailed-by” and “signed-by” which indicate the DKIM status. If you use a less supported browser such as Konqueror then you have to click on the “Show Original” link to see the headers and inspect the DKIM status there (you want to see dkim=pass in the Authentication-Results header). Also Gmail signs outbound mail so it can be used to test verification of received mail.
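
The header to look for is roughly like the following (a sketch – the exact format varies):

Authentication-Results: mx.google.com; dkim=pass header.i=@coker.com.au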

The next thing to do is to install the DKIM Milter. To make things exciting this is packaged for Debian under the name dkim-filter so that reasonable searches for such functionality (such as a search for milter or dkim-milter – the upstream project name) will fail.
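
So the command to install it is:

apt-get install dkim-filter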

After installing the package you must generate a key. I used the command “dkim-genkey -d coker.com.au -s 2008” to generate a key for my domain. It seems that the domain is currently only used as a comment, but I prefer to use all reasonable parameters for such things. The -s option specifies a selector, which is a way of having multiple valid signing keys. It’s apparently fairly common to use a different key every year, but other options include having multiple mail servers for a domain and giving each one a selector. The dkim-genkey command produces two files: 2008.txt, which can be copied into a BIND zone file, and 2008.private, which is used by the DKIM signing server.
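
The contents of 2008.txt look roughly like the following (the key data here is a placeholder, not a real key):

; record for the zone file, as produced by dkim-genkey
2008._domainkey IN TXT "v=DKIM1; g=*; k=rsa; p=MIGfMA0GCSqGSIb3...AB"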

Here is a sample of the most relevant parts of the config file /etc/dkim-filter.conf for signing mail for a single domain:

Domain coker.com.au
KeyFile /etc/dkim/2008.private
Selector 2008

The file /etc/default/dkim-filter needs to be modified to specify how the milter will listen for connections from the MTA. I uncommented the line SOCKET="local:/var/run/dkim-filter/dkim-filter.sock".

One issue is that the Unix domain socket file will by default not be accessible to Postfix. I devised a work-around for this and documented it in Debian bug report #499364 [2] (I’ve hacked a chgrp command into the init script; ideally the GID would be an option in a config file).
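
The work-around amounts to something like the following being run after the milter starts (a sketch of the chgrp hack; the group name assumes the Debian Postfix default):

# let the postfix group reach the milter socket
chgrp postfix /var/run/dkim-filter/dkim-filter.sock
chmod g+rw /var/run/dkim-filter/dkim-filter.sock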

A basic configuration of dkim-milter will sign mail for one domain. If you want to sign mail for more than one domain you have to comment out the configuration for a single domain in /etc/dkim-filter.conf and instead use the KeyList option to specify a file with a list of domains (the dkim-filter.conf(5) man page documents this). The one confusing issue is that the selector is taken to be the basename of the file which contains the secret key (they really should have added an extra field). This means that if you have an obvious naming scheme for selectors (such as the current year) then you need a different directory for each domain to contain the key.

As an example here is the line from the KeyList file for my domain:
*@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008
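
With the selector-as-basename rule, adding a second domain (a hypothetical one here) means giving its key its own directory so that both key files can be named after the same selector:

*@example.org:example.org:/etc/dkim/example.org/2008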

Now one problem that we have is that list servers will usually append text to the body of a message and thus break the signature. The correct way of solving this is to have the list server sign the mail it sends out and have a header indicating the signature status of the original message. But there are a lot of list servers that won’t be updated for a long time.

The work-around is to put the following line in /etc/dkim-filter.conf:
BodyLengths yes

This means that the signature will cover a specified number of bytes of body data, and anything extra which is appended will be ignored when it comes time to verify the message. This means of course that a hostile third party could append some bogus data without breaking the signature. In the case of plain text this isn’t so bad, but when the recipient defaults to displaying HTML email it could have some interesting possibilities. I wonder whether it would be prudent to configure my MUA to always send both HTML and plain-text versions of my mail so that an attacker can’t append hostile HTML.
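
With that option enabled the DKIM-Signature header gains an l= tag recording how many bytes of the body are covered, roughly like this sketch (the values are illustrative):

DKIM-Signature: v=1; a=rsa-sha256; d=coker.com.au; s=2008; l=1402;
        h=from:to:subject; bh=...; b=...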

It’s a pity that Gmail (which appears to have the most popular implementation of DKIM) doesn’t allow setting that option. So far the only message I have received that failed DKIM checks was sent from a Gmail account to a Debian mailing list.

Ideally it would be possible to have the messages sent to mailing lists either not be signed or have the length field used. That would require a signing practice based on the recipient, which is functionality not available in dkim-milter (but which could possibly be implemented via Postfix configuration, although I don’t know how). Implementing this would not necessarily require knowing all the lists that mail might be sent to; it seems that a large portion of the world’s list traffic is sent to addresses that match the string “@lists.” which can be easily recognised. For a service such as Gmail it would be easy to recognise list traffic from the headers of received messages and then treat messages sent to those domains differently.

As the signing decision is based on the sender address it would be possible for me to use different addresses for sending to mailing lists to avoid the signatures (the number of email addresses in use in my domain is small enough that having a line for each address to sign will not be a great inconvenience). Most MUAs have some sort of functionality to automatically choose a sender address based on the recipient address. I’ll probably do this eventually, but for the moment I’ll just use the BodyLengths option; while it reduces the security a bit it’s still a lot better than having no DKIM checks.
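
In the KeyList file that approach would look something like the following – listing only the sender addresses that should be signed, so that a separate address reserved for list traffic goes out unsigned (both addresses here are hypothetical):

# mail from russell-lists@coker.com.au matches no entry and goes unsigned
russell@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008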