RPC and SE Linux

One ongoing problem with TCP networking is the combination of RPC services and port-based services on the same host. If you have an RPC service that uses a port less than 1024 then typically it will start at 1023 and try lower ports until it finds one that works. A problem that I have had in the past is that an RPC service used port 631 and I then couldn’t start CUPS (which uses that port). A similar problem can arise in a more insidious manner if you have strange networking devices such as a BMC [1] which uses the same IP address as the host and just snarfs connections for itself (as documented by pantz.org [2]). In that case the OS believes that the port in question is not in use, but connections to that port go to the hardware BMC and the OS never sees them.
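
To illustrate the problem, here is a minimal Python sketch of the sort of bind loop an RPC service performs when hunting for a reserved port. It is only an illustration of the behaviour, not the real bindresvport() implementation (and binding to ports below 1024 requires root):

# Sketch of how an RPC service hunts for a reserved port: start just below
# 1024 and keep trying lower ports until bind() succeeds.
import socket

def bind_reserved_port(sock, low=512, high=1023):
    for port in range(high, low - 1, -1):
        try:
            sock.bind(("0.0.0.0", port))
            return port
        except OSError:
            continue  # port in use (or bind denied by SE Linux) - try a lower one
    raise OSError("no reserved port available")

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("bound to port", bind_reserved_port(s))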

One solution is to give an SE Linux security context to the port which prevents the RPC service from binding to it. RPC applications seem to be happy to make as many bind attempts as necessary to get an available port (thousands of attempts if necessary) so reserving a few ports is not going to cause any problems. As far as I recall my problems with CUPS and RPC services were a motivating factor in some of my early work on writing SE Linux policy to restrict port access.
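
As a rough example of what the port labeling looks like from the command line (the type name in the second command is hypothetical, and the details depend on the policy you are running):

# Show the current SE Linux label for port 631 (on a refpolicy based system
# it is normally ipp_port_t, which is what stops other daemons binding to it):
semanage port -l | grep -w 631

# Assign a type to an otherwise unlabelled port so that only the intended
# service can bind to it - my_service_port_t is a hypothetical type name:
semanage port -a -t my_service_port_t -p tcp 999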

Of course the best thing to do is to assign IP addresses for IPMI that are different from the OS IP addresses. This is easy to do and merely requires an extra IP address for each port. As a typical server will have two Ethernet ports on the baseboard (one for the front-end network and one for the private network) that means an extra two IP addresses (you want to use both interfaces for redundancy in case the problem which cripples a server is related to one of the Ethernet ports). But for people who don’t have spare IP addresses, SE Linux port labeling could really help.

Getting Started with Amazon EC2

The first thing you need to do to get started using the Amazon Elastic Compute Cloud (EC2) [1] is to install the tools to manage the service. The service is run in a client-server manner. You install the client software on your PC to manage the EC2 services that you use.

There are the AMI tools to manage the machine images [2] and the API tools to launch and manage instances [3].

The AMI tools come as both a ZIP file and an RPM package and contain Ruby code, while the API tools are written in Java and only come as a ZIP file.

I have not seen any clear license documents for the software in question; I recall seeing one mention on one of the many confusing web pages of the code being “proprietary” but nothing else. While it seems most likely (but far from certain) that Amazon owns the copyright to the code in question, there is no information on how the software may be used – apart from an implied license that if you are a paying EC2 customer then you can use the tools (as there is no other way to use EC2). If anyone can find a proper license agreement for this software then please let me know.

To get software working in the most desirable manner it needs to be packaged for the distribution on which it is going to be used; as I prefer to use Debian that means packaging it for Debian. Also when packaging the software you can fix some of the silly things that get included in software that is designed for non-packaged release (such as demanding that environment variables be set to specify where the software is installed). So I have built packages for Debian/Lenny for the benefit of myself and some friends and colleagues who use Debian and EC2.

As I can’t be sure of what Amazon would permit me to do with their code I have to assume that they don’t want me to publish Debian packages for the benefit of all Debian and Ubuntu users who are (or might become) EC2 customers. So instead I have published the .diff.gz files from my Debian/Lenny packages [4] to allow other people to build identical packages after downloading the source from Amazon. At the moment the packages are a little rough, and as I haven’t actually got an EC2 service running with them yet they may have some really bad bugs. But getting the software to basically work took more time than expected. So even if there happen to be some bugs that make them unusable in their current state (the code for determining where it looks for PEM files at best needs a feature enhancement and at worst may be broken at the moment), it would still save people some time to use my packages and fix whatever needs fixing.
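
As an illustration of the PEM handling I would like to see (this is a sketch of the desired behaviour, not code from the Amazon tools; the ~/.ec2 directory is my assumption, while the EC2_PRIVATE_KEY/EC2_CERT variables and the pk-/cert- file name prefixes are what the API tools already use):

# Sketch of the desired PEM lookup for a packaged wrapper: honour the
# environment variables the API tools use, then fall back to ~/.ec2/.
import glob, os

def find_pem(env_var, prefix):
    path = os.environ.get(env_var)
    if path and os.path.exists(path):
        return path
    matches = sorted(glob.glob(os.path.expanduser("~/.ec2/%s*.pem" % prefix)))
    if matches:
        return matches[0]
    raise RuntimeError("no %s*.pem found and %s is not set" % (prefix, env_var))

private_key = find_pem("EC2_PRIVATE_KEY", "pk-")
certificate = find_pem("EC2_CERT", "cert-")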

DKIM and Mailing Lists

Currently we have a problem with the Debian list server and Gmail. Gmail signs all mail that it sends with both DKIM and DomainKeys (DomainKeys has been obsoleted by DKIM, so most mail servers implement only one of the two standards, although apart from the extra header space there is no reason not to use both). The Debian list servers change the message body without removing the signatures, and therefore send out mail with invalid signatures.

DKIM has an option to specify the length of the body part that it signs. If that option is used then an intermediate system can append data to the body without breaking the signature. This could be bad if a hostile party could intercept messages and append something damaging, but has the advantage that mailing list footers will not affect the signature. Of course if the list server modifies the Subject to include the list name in brackets at the start of the subject line then it will still break the signature. However Gmail is configured to not use the length field, and a Gmail user has no option to change this (AFAIK – if anyone knows how to make Gmail use the DKIM length field for their own account then please let me know).
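
For reference, the length option appears as an l= tag in the DKIM-Signature header. The values below are placeholders, but they show the tag that allows a list footer to be appended after the signed portion of the body without invalidating the signature:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com;
        s=mail; h=from:to:subject:date:message-id; l=1452;
        bh=...; b=...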

I believe that the ideal functionality of a sending mail server would be to have the configuration of its DKIM milter allow specifying which addresses should have the length field used. For example I would like to have all mail sent to an address matching @lists\. have the length field used (as well as mail to some other list servers that don’t match that naming scheme), and I would also like to be able to specify which recipient addresses should have no DKIM signatures (for example list servers that modify the subject line). I have filed Debian bug #500967 against the dkim-filter package requesting this feature [1].

The minimal functionality needed for correct operation of a list server – stripping off DKIM and DomainKeys signatures – is implemented in the Mailman package in Lenny. The ideal functionality would be, for lists that are not configured to modify the Subject line, to leave in place a DKIM signature that uses the length field, and otherwise to remove the DKIM header. I have filed Debian bug #500965 against lists.debian.org requesting that the configuration be changed to strip the headers in question in a similar manner [2] (the Debian list servers appear to use SmartList – I have not checked whether the latest version of SmartList does the right thing in this regard – if not it deserves another bug report).

I have also filed Debian bug report #500966 requesting that the list servers sign all outbound mail with DKIM [3]. I believe that protecting the integrity of the Debian mail infrastructure is important, preventing forged mail is a good thing, and that the small amount of CPU time needed for this is worth the effort.

Also the Debian project is in a position of leadership in the community. We should be among the first to adopt new technologies that benefit society, both to encourage others to do the same and to help find and fix bugs.

How Many Singularities?

There is a lot of discussion and speculation about The Singularity. The term seems to be defined by Ray Kurzweil’s book “The Singularity Is Near” [1] which focuses on a near-future technological singularity defined by significant increases in medical science (life extension and methods to increase mental capacity) and an accelerating rate of scientific advance.

In popular culture the idea that there will only be one singularity seems to be well accepted, so the discussion is based on when it will happen. One of the definitions for a singularity is that it is a set of events that change society in significant ways such that predictions are impossible – based on the concept of the Gravitational Singularity (black hole) [2]. Science fiction abounds with stories about what happens after someone enters a black hole, so the concept of a singularity not being a single event (sic) is not unknown, but it seems to me that based on our knowledge of science no-one considers there to be a black hole with multiple singularities – not even when confusing the event horizon with the singularity.

If we consider a singularity to merely consist of a significant technological change (or set of changes) that changes society in ways that could not have been predicted (not merely changes that were not predicted) then it seems that there have been several already. Here are the ones that seem to be likely candidates:

0) The development of speech was a significant change for our species (and a significant change OF our species). Maybe we should consider that to be singularity 0 as hominids that can’t speak probably can’t be considered human.

1) The adoption of significant tool use and training children in making and using tools (as opposed to just letting them learn by observation) made a significant change to human society. I don’t think that with the knowledge available to bands of humans without tools it would have been possible to imagine that making stone axes and spears would enable them to dominate the environment and immediately become the top of the food chain. In fact as pre-tool hominids were generally not near the top of the food chain they probably would have had difficulty imagining being rulers of the world. I’m sure that it led to an immediate arms race too.

2) The development of agriculture was a significant change to society that seems to have greatly exceeded the expectations that anyone could have had at the time. I’m sure that people started farming merely as a way of ensuring that the next time they migrated to an area there was food available (just sowing seeds along the traditional migration routes of a hunter-gatherer existence). They could not have expected that the result would be a significant increase in the ability to support children and a significant increase in the number of people who could be sustained by a given land area, massive population growth, new political structures to deal with greater population density, and then the wiping out of hunter-gatherer societies in surrounding regions. It seems likely to me that the mental processes needed to predict the actions of a domestic animal (in terms of making it a friend, worker, or docile source of food) differ from those needed to predict the actions of other humans (whose mental processes are similar) and from those needed to predict the actions of prey that is being hunted (you only need to understand enough to kill it).

3) The invention of writing allowed the creation of larger empires through better administration. All manner of scientific and political development was permitted by writing.

4) The work of Louis Pasteur sparked a significant development in biology which led to much greater medical technology [3]. This permitted much greater population densities (both in cities and in armies) without the limitation of significant disease problems. It seems that among other things the world wars depended on developments in preventing disease which were linked to Pasteur’s work. Large populations densely congregated in urban areas permit larger universities and a better exchange of knowledge, which permitted further significant developments in technology. It seems unlikely that a population suffering the health problems that were common in 1850 could have simultaneously supported large-scale industrial warfare and major research projects such as the Manhattan Project.

5) The latest significant change in society has been the development of the Internet and mobile phones. Mobile phones were fairly obvious in concept, but have made structural changes to society. For example I doubt that hand-writing is going to be needed to any great extent in the future [4], the traditional letter has disappeared, and “Dates” are now based on “I’ll call your mobile when I’m in the area” instead of meeting at a precise time – but this is the trivial stuff. Scientific development and education have dramatically increased due to using the Internet and business now moves a lot faster due to mobile phones. It seems that nowadays any young person who doesn’t want to be single and unemployed needs to have either a mobile phone or Internet access – and preferably both. When mobile phones were first released I never expected that almost everyone would feel compelled to have one, and when I first started using the Internet in 1992 I never expected it to have the rich collaborative environment of Wikipedia, blogging, social networking, etc (I didn’t imagine anything much more advanced than file exchange and email).

Of these changes the latest (Internet and mobile phones) seems at first glance to be the least significant – but let’s not forget that it’s still an ongoing process. The other changes became standard parts of society long ago. So it seems that we could count as many as six singularities, but it seems that even the most conservative count would have three singularities (tool use, agriculture, and writing).

It seems to me that the major factors for a singularity are an increased population density (through couples being able to support more children, through medical technology extending the life expectancy, through greater food supplies permitting more people to live in an area, or through social structures which manage the disputes that arise when there is a great population density) and increased mental abilities (which includes better education and communication). Research into education methods is continuing, so even without genetically modified humans, surgically connecting computers to human brains, or AI we can expect intelligent beings with a significant incremental advance over current humans in the near future. Communications technology is continually being improved, with some significant advances in the user-interfaces. Even if we don’t get surgically attached communications devices giving something similar to “telepathy” (which is not far from current technology), there are possibilities for significant increments in communication ability through 3D video-conferencing, better time management of communication (inappropriate instant communication destroys productivity), and increased communication skills (they really should replace some of the time-filler subjects at high-school with something useful like how to write effective diagrams).

It seems to me that the current situation is significantly fewer than one billion people with (poor) education and limited communications access that most of them don’t know how to use properly. Going from that to six billion people who have devices more user-friendly and powerful than today’s computers and mobile phones, combined with better education in how to use them, has the potential to increase the overall rate of scientific development by more than an order of magnitude. This in itself might comprise a singularity, depending on the criteria you use to assess it. Of course that would take at least a generation to implement; a significant advance in medical technology or AI could bring about a singularity much sooner.

But I feel safe in predicting that people who expect the world to remain as it is forever will be proven wrong yet again, and I also feel safe in predicting that most of them will still be alive to see it.

I believe that we will have a technological singularity (which will be nothing like the “rapture” which was invented by some of the most imaginative interpretations of the bible). I don’t believe that it will be the final singularity unless we happen to make our species extinct (in which case there will most likely be another species to take over the Earth and have its own singularities).

Solutions for the Housing Crisis

Currently we have a huge housing crisis in the US which involves significant political corruption, including the federal government preventing state governments from stopping predatory banking practices [1].

The corrupt plan to solve this is to simply give the banks a lot of taxpayer money, so the banking business model then becomes to do whatever it takes to make a short-term profit and then rely on federal funds for long-term viability. This rewards the bank employees who caused the problem by aggressively selling mortgages to people who could never repay them if housing prices stabilised, let alone if prices fell.

If the aim is to protect families, then the first requirement is that they not be evicted from their homes. The solution to this is to void the mortgage of any resident home owner who purchased a house on the basis of false advice from the bank, or who unknowingly entered into a mortgage that any reasonable person who is good at maths could recognise as impossible for them to repay. The bank would end up with clear title to the property and the ex-homeowner would end up with no debt. Then such properties could be set at a controlled rent for a reasonable period of time (say 5 years). The bank (or its creditors) would have the option of renting the property to the ex-mortgagee for a minimum of five years or selling the property to someone else who was willing to do so. Of course the ex-mortgagee (who would not be bankrupt) would have the option of seeking out a new mortgage at reasonable rates and buying their home again.

Also to benefit families the rent control period could be extended for as long as they have dependent children.

The losers in this would be the banks and the people who purchased multiple investment properties (the ones who caused all the problems).

Finally what is needed is a cultural shift towards austerity (as described by Juan Enriquez and Jorge Dominguez) [2].

Glen makes an interesting point about the irony of typical homeowners in the US demonstrating more financial literacy than the people who run banks [3].

Could we have an Open Computing Cloud?

One of the most interesting new technologies that has come out recently is Cloud Computing; the most popular instance seems to be Amazon EC2 (Elastic Compute Cloud). I think it would be good if there were some open alternatives to EC2.

Amazon charges $0.10 per compute hour for a virtual machine that has one Compute Unit (equivalent to a 1.0 to 1.2GHz 2007 Opteron core) and 1.7G of RAM. Competing with this will be hard, as it’s difficult to undercut 10 cents an hour ($876.60 per annum) by enough to compensate for the great bandwidth that Amazon has on offer.

The first alternative that seems obvious is a cooperative model. In the past I’ve run servers for the use of friends in the free software community. It would be easy for me to do such things in future, and Xen makes this a lot easier than it used to be. If anyone wants a DomU for testing something related to Debian SE Linux then I can set one up in a small amount of time. If there was free software to manage such things then it would be practical to have some sort of share system for community members.

The next possibility is a commercial model. If I could get Xen to provide a single Amazon Compute Unit to one DomU (neither less nor more) then I wouldn’t notice it on some of my Xen servers. 1.7G of RAM is a moderate amount, but as 3G seems to be typical for new desktop systems (Intel is still making chipsets that support a maximum of 4G of address space [2], and when you subtract the address space for video and PCI you only get about 3G) it would not be inconceivable to run 1.7G DomUs on idle desktop machines. But it’s probably more practical to have a model with less RAM. For my own use I run a number of DomUs with 256M of RAM for testing and development, and the largest server DomU I run is 400M (that is for ClamAV, SpamAssassin, and WordPress). While providing 1.7G of RAM and 1CU for less than 10 cents an hour may be difficult, providing an option of 256M of RAM and 0.2CU (burstable to 0.5CU) for 2 cents an hour would give the same aggregate revenue for the hardware while also offering a cheaper service for people who want that. 2 cents an hour is more than the cost of some of the Xen server plans that ISPs offer [3], but if you only need a server for part of the time then it would have the potential to save some money.
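
As a sketch of what such a small DomU might look like (all the names and paths here are placeholders, and the CPU cap would be set separately via the Xen credit scheduler, e.g. “xm sched-credit -d rental1 -c 20” for roughly 0.2 of a CPU):

# Example Xen DomU config for a small rental instance - names and paths
# are placeholders, not a definitive setup.
name    = "rental1"
memory  = 256
vcpus   = 1
kernel  = "/boot/vmlinuz-2.6-xen"
ramdisk = "/boot/initrd-2.6-xen"
disk    = ["phy:/dev/vg0/rental1,xvda,w"]
vif     = ["bridge=xenbr0"]
root    = "/dev/xvda ro"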

For storage Amazon has some serious bandwidth inside its own network for transferring the image to the machine for booting. To do things on the cheap the way to go would be to create a binary diff of a common image. If everyone who ran virtual servers had images of the common configurations of the popular distributions then creating an image to boot would only require sending a diff (maybe something based on XDelta [4]). Transferring 1GB of filesystem image over most network links is going to be unreasonably time consuming; transferring a binary diff between an up to date CentOS or Debian install and a usable system image based on CentOS or Debian with all updates applied is going to be much faster.
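
A minimal sketch of how the delta transfer could work, assuming the xdelta3 command line tool is available on both ends (the file names are just examples):

# Sketch of image distribution via binary diffs: both sides keep a common
# base image and only the (small) delta gets transferred over the network.
import subprocess

BASE = "debian-lenny-base.img"  # common image that both ends already have

def make_delta(target_image, delta_file):
    # Sender: encode the difference between the base image and the target.
    subprocess.check_call(["xdelta3", "-e", "-s", BASE, target_image, delta_file])

def apply_delta(delta_file, output_image):
    # Receiver: reconstruct the target image from the base plus the delta.
    subprocess.check_call(["xdelta3", "-d", "-s", BASE, delta_file, output_image])

make_delta("customer-image.img", "customer-image.vcdiff")
apply_delta("customer-image.vcdiff", "reconstructed.img")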

Of course something like this would not be suitable for anything that requires security. But there are many uses for servers that don’t require much security.

Links September 2008

RAM is the new Disk [1] – interesting post about using a distributed network of servers with RAM for main storage. The concept is that RAM on another machine can be accessed faster than local disk and that disk performance for contiguous IO has been increasing steadily at a greater rate than random seek performance. So disks can be used in a similar manner to tape drives (backing up data) and RAM can be used as main storage (2RU machines with 1TB of RAM are predicted soon). It is interesting to contrast this with the Rubik’s Cube solver which uses disk as RAM [2].

Interesting Sun Blog post about how ZFS uses flash based SSD (Solid State Disk) for read caching [3].

Yahoo has released a Firefox plugin named YSlow to tell you why your web site is slow [4]. This complements the Yahoo Best Practices for Speeding Up Your Web Site document [5].

Amsterdam now has cargo trams [6]. This reduces pollution and traffic congestion while allowing efficient transport of cargo. The cargo trams tend to follow passenger trams so they don’t provide any additional obstacle to traffic.

Interesting article by Cory Doctorow about how the standard formats limit creative content [7]. He makes the unfortunate mistake of claiming that “no one tries to make sitcoms about stories that take five minutes to tell” and “no one tries to make feature films about subjects that take 30 seconds to elucidate” – I think that the majority of sitcom episodes can be described in considerably less than 5 minutes and a significant number of movies (EG every one that stars Jean-Claude Van Damme [8]) can have their plot described in less than 30 seconds.

Never Trust a DRM Vendor

I was reading an interesting post about predicting the results of the invasion of Iraq [1]. One of the points made was that the author rejected every statement by a known liar (which includes all the world leaders who wanted the invasion). So basically regarding every statement by a known liar as potentially a lie led to a better than average prediction.

It seems to me that a similar principle can be applied to other areas. A vendor or advocate of DRM (Digital Restrictions Management) [2] has already proven that they are not concerned about the best interests of the people who own the computers in question (who paid for the hardware and software) – they are in fact interested in preventing the owner of the computer from using it in ways that they desire. A focus on security that works against the interests of the user cannot result in software that protects the user’s best interests.

I believe that any software with DRM features (other than those which are requested by people who purchase the software) should be considered to be malware. The best example of this is the Sony BMG copy protection scandal [3].

Note that I refer to the interests of the people who own the computers – not those who use them. It is quite acceptable to write software that serves the buyer not the end-user. Two examples are companies that want to prevent their employees doing “inappropriate” things and parents who want to restrict what their children can do. We can debate about what restrictions on the actions of employees and children are appropriate, but it seems clear that when software is purchased for their use the purchaser has the right to restrict their access.

Also note that when I refer to someone owning a computer I refer to an honest relationship (such as parent/child or employer/worker) not the abusive relationships that mobile phone companies typically have with their customers (give the customer a “free” phone but lock it down and claim that they don’t own it). Mobile phones ARE computers, just rather limited ones at this time.

My Prediction for the iPhone

I have previously written about how I refused an offer of a free iPhone [1] (largely due to its closed architecture). The first Google Android phone has just been announced, and the TechCrunch review is interesting – while the built-in keyboard is a nice feature the main thing that stands out is the open platform [2]. TechCrunch says “From now on, phones need to be nearly as capable as computers. All others need not apply”.

What I want is a phone that I control, and although most people don’t understand the issues enough to say the same, I think that they will agree in practice.

In the 80s the Macintosh offered significant benefits over PCs, but utterly lost in the marketplace because it was closed (less available software and less freedom). Due to being used in Macs and similar machines the Motorola 68000 CPU family also died out, and while it’s being used in games consoles and some other niche markets the PPC CPU family (the next CPU used by Apple) also has an uncertain future. The IBM PC architecture evolved along with its CPU from a 16bit system to a 64bit system and took over the market because it does what users want it to do.

I predict that the iPhone will be just as successful as the Macintosh OS and for the same reasons. The Macintosh OS still has a good share of some markets (it has traditionally been well accepted for graphic design and has always provided good hardware and software support for such use), and is by far the most successful closed computer system, but it has a small part of the market.

I predict that the iPhone will maintain only a small share of the market. There will be some very low-end phones that have the extremely closed design that currently dominates the market, and the bulk of the market will end up going with Android or some other open phone platform that allows users to choose how their phone works. One issue that I think will drive user demand for control over their own phones is the safety issues related to child use of phones (I’ve written about this previously [3]). Currently phone companies don’t care about such things – the safety of customers does not affect their profits. But programmable phones allow the potential for improvements to be made without involving the phone company – while with the iPhone you have Apple as the roadblock.

Now having a small share of the mobile phone market could be very profitable, just as the small share of the personal computer market is quite profitable for Apple. But it does mean that I can generally ignore them as they aren’t very relevant in the industry.

OpenID Delegation

I’ve just installed Eran Sandler’s OpenID Delegation Plugin [1]. This means that I can now use my blog URL for OpenID authentication. I’ve also included the plugin in my WordPress repository (which among other things has the latest version of WordPress). One thing that I consider to be a bug in Eran’s plugin is the fact that it only adds the OpenID links to the main URL. This means, for example, that if I write a blog comment and want to refer to one of my own blog posts on the same topic (which is reasonably common – after more than two years of blogging and almost 700 posts I’ve probably written a post that is related to every topic I might want to comment on) then I can’t put that post’s URL in the URL field. The problem here is that URLs in the body of a blog comment generally increase the spam-score (I use this term loosely to refer to a variety of anti-spam measures – I am not aware of anything like SpamAssassin being used on blog comments), and not having OpenID registration also does the same. So it seems that with the current functionality of Eran’s plugin I will potentially suffer in some way any time I want to enter a blog comment that refers to a particular post I wrote.
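
For reference, OpenID delegation just means serving link tags like the following in the page head (the URLs here are placeholders); the bug is that the plugin only emits them for the main blog URL rather than for every post URL:

<link rel="openid.server" href="https://openid.example.net/server" />
<link rel="openid.delegate" href="http://www.example.com/blog/" />
<link rel="openid2.provider" href="https://openid.example.net/server" />
<link rel="openid2.local_id" href="http://www.example.com/blog/" />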

deb http://www.coker.com.au etch wordpress

My WordPress packages are currently available from the above APT repository. While it specifies etch it works with Lenny too (my blog currently runs on Lenny). I will eventually change it to use lenny in the name.

For the OpenID server I am currently using the OpenID service provided by Yubico as part of the support for their Yubikey authentication token [2] (of which I will write more at a later date). I think that running their own OpenID server was a great idea; it doesn’t cost much to run such a service and it gives customers an immediate way of using their key. I expect that there are more than a few people who would be prepared to buy a Yubikey for the sole purpose of OpenID authentication and signing in to a blog server (which can also be via OpenID if you want to do it that way). I plan to use my Yubikey for logging in to my blog, but I still have to figure out the best way of doing it.

One thing that has been discussed periodically over the years is the topic of using smart-cards (or similar devices) for accessing Debian servers and for securing access to the GPG keys that developers use for Debian work while travelling. Based on recent events I would hazard a guess that such discussions are happening within the Fedora project and within Red Hat right now (if I worked for Red Hat I would be advocating such things). It seems that when such an idea is adopted a logical extension is to support services that users want, such as OpenID, at the same time; if nothing else it will make people more inclined to use such devices.

Disclaimer: Yubico gave me a free Yubikey for the purpose of review.

Update: The OpenIDEnabled.com tool to test OpenID is useful when implementing such things [3].