DKIM and Mailing Lists

Currently we have a problem with the Debian list server and Gmail. Gmail signs all mail that it sends with both DKIM and DomainKeys (DomainKeys has been obsoleted by DKIM, so most mail servers implement only one of the two standards, although apart from the extra header space there is no reason not to use both). The Debian list servers change the message body without removing the signatures, and therefore send out mail with invalid signatures.

DKIM has an option to specify the length of the body part that it signs. If that option is used then an intermediate system can append data to the body without breaking the signature. This could be bad if a hostile party could intercept messages and append something damaging, but has the advantage that mailing list footers will not affect the signature. Of course if the list server modifies the Subject to include the list name in brackets at the start of the subject line then it will still break the signature. However Gmail is configured to not use the length field, and a Gmail user has no option to change this (AFAIK – if anyone knows how to make Gmail use the DKIM length field for their own account then please let me know).
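To illustrate, a signature that uses this option includes an l= tag giving the number of body bytes covered by the hash (a made-up example, with the hash and signature values elided):

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; s=2008; h=from:to:subject:date; l=1452; bh=...; b=...

A verifier only hashes the first 1452 bytes of the body, so a footer appended after that point does not invalidate the signature.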

I believe that the ideal functionality of a sending mail server would be to have the configuration of its DKIM milter allow specifying which addresses should have the length field used. For example I would like to have all mail sent to an address matching @lists\. have the length field used (as well as mail to some other list servers that don’t match that naming scheme), and I would also like to be able to specify which recipient addresses should have no DKIM signatures (for example list servers that modify the subject line). I have filed Debian bug #500967 against the dkim-filter package requesting this feature [1].

For correct operation of a list server the minimal functionality is to strip off incoming DKIM and DomainKeys signatures, which is what the Mailman package in Lenny does. The ideal functionality of a list server would be that for lists that are not configured to modify the Subject line it would leave a DKIM header that uses the length field and otherwise remove the DKIM header. I have filed Debian bug #500965 against lists.debian.org requesting that the configuration be changed to strip the headers in question in a similar manner [2] (the Debian list servers appear to use SmartList – I have not checked whether the latest version of SmartList does the right thing in this regard – if not it deserves another bug report).

I have also filed Debian bug report #500966 requesting that the list servers sign all outbound mail with DKIM [3]. I believe that protecting the integrity of the Debian mail infrastructure is important, preventing forged mail is a good thing, and that the small amount of CPU time needed for this is worth the effort.

Also the Debian project is in a position of leadership in the community. We should adopt new technologies that benefit society first to help encourage others to do the same and also to help find and fix bugs.

How Many Singularities?

There is a lot of discussion and speculation about The Singularity. The term seems to be defined by Ray Kurzweil’s book “The Singularity Is Near” [1] which focuses on a near-future technological singularity defined by significant increases in medical science (life extension and methods to increase mental capacity) and an accelerating rate of scientific advance.

In popular culture the idea that there will only be one singularity seems to be well accepted, so the discussion is based on when it will happen. One of the definitions for a singularity is that it is a set of events that change society in significant ways such that predictions are impossible – based on the concept of the Gravitational Singularity (black hole) [2]. Science fiction abounds with stories about what happens after someone enters a black hole, so the concept of a singularity not being a single event (sic) is not unknown, but it seems to me that based on our knowledge of science no-one considers there to be a black hole with multiple singularities – not even when confusing the event horizon with the singularity.

If we consider a singularity to merely consist of a significant technological change (or set of changes) that changes society in ways that could not have been predicted (not merely changes that were not predicted) then it seems that there have been several already. Here are the ones that seem to be likely candidates:

0) The development of speech was a significant change for our species (and a significant change OF our species). Maybe we should consider that to be singularity 0 as hominids that can’t speak probably can’t be considered human.

1) The adoption of significant tool use and training children in making and using tools (as opposed to just letting them learn by observation) made a significant change to human society. I don’t think that with the knowledge available to bands of humans without tools it would have been possible to imagine that making stone axes and spears would enable them to dominate the environment and immediately become the top of the food chain. In fact as pre-tool hominids were generally not near the top of the food chain they probably would have had difficulty imagining being rulers of the world. I’m sure that it led to an immediate arms race too.

2) The development of agriculture was a significant change to society that seems to have greatly exceeded the expectations that anyone could have had at the time. I’m sure that people started farming as merely a way of ensuring that the next time they migrated to an area there was food available (just sowing seeds along traditional migration routes for a hunter-gatherer existence). They could not have expected that the result would be a significant increase in the ability to support children and a significant increase in the number of people who could be sustained by a given land area, massive population growth, new political structures to deal with greater population density, and then wiping out hunter-gatherer societies in surrounding regions. It seems likely to me that the mental processes needed to predict the actions of a domestic animal (in terms of making it a friend, worker, or docile source of food) differ from those needed to predict the actions of other humans (whose mental processes are similar) and from those needed to predict the actions of prey that is being hunted (you only need to understand enough to kill it).

3) The invention of writing allowed the creation of larger empires through better administration. All manner of scientific and political development was permitted by writing.

4) The work of Louis Pasteur sparked a significant development in biology which led to much greater medical technology [3]. This permitted much greater population densities (both in cities and in armies) without the limitation of significant disease problems. It seems that among other things the world wars depended on developments in preventing disease which were linked to Pasteur’s work. Large populations densely congregated in urban areas permitted larger universities and a better exchange of knowledge which permitted further significant developments in technology. It seems unlikely that a population suffering the health problems that were common in 1850 could have simultaneously supported large-scale industrial warfare and major research projects such as the Manhattan Project.

5) The latest significant change in society has been the development of the Internet and mobile phones. Mobile phones were fairly obvious in concept, but have made structural changes to society. For example I doubt that hand-writing is going to be needed to any great extent in the future [4], the traditional letter has disappeared, and “Dates” are now based on “I’ll call your mobile when I’m in the area” instead of meeting at a precise time – but this is the trivial stuff. Scientific development and education have dramatically increased due to using the Internet and business now moves a lot faster due to mobile phones. It seems that nowadays any young person who doesn’t want to be single and unemployed needs to have either a mobile phone or Internet access – and preferably both. When mobile phones were first released I never expected that almost everyone would feel compelled to have one, and when I first started using the Internet in 1992 I never expected it to have the rich collaborative environment of Wikipedia, blogging, social networking, etc (I didn’t imagine anything much more advanced than file exchange and email).

Of these changes the latest (Internet and mobile phones) seems at first glance to be the least significant – but let’s not forget that it’s still an ongoing process. The other changes became standard parts of society long ago. So it seems that we could count as many as six singularities, but it seems that even the most conservative count would have three singularities (tool use, agriculture, and writing).

It seems to me that the major factors for a singularity are an increased population density (through couples being able to support more children, through medical technology extending the life expectancy, through greater food supplies permitting more people to live in an area, or through social structures which manage the disputes that arise when there is a great population density) and increased mental abilities (which includes better education and communication). Research into education methods is continuing, so even without genetically modified humans, surgically connecting computers to human brains, or AI we can expect intelligent beings with a significant incremental advance over current humans in the near future. Communications technology is continually being improved, with some significant advances in the user-interfaces. Even if we don’t get surgically attached communications devices giving something similar to “telepathy” (which is not far from current technology), there are possibilities for significant increments in communication ability through 3D video-conferencing, better time management of communication (inappropriate instant communication destroys productivity), and increased communication skills (they really should replace some of the time-filler subjects at high-school with something useful like how to write effective diagrams).

It seems to me that going from the current situation of something significantly less than one billion people with current (poor) education and limited communications access (which most people don’t know how to use properly) to six billion people with devices that are more user-friendly and powerful than today’s computers and mobile phones combined with better education as to how to use them has the potential to increase the overall rate of scientific development by more than an order of magnitude. This in itself might comprise a singularity depending on the criteria you use to assess it. Of course that would take at least a generation to implement; a significant advance in medical technology or AI could bring about a singularity much sooner.

But I feel safe in predicting that people who expect the world to remain as it is forever will be proven wrong yet again, and I also feel safe in predicting that most of them will still be alive to see it.

I believe that we will have a technological singularity (which will be nothing like the “rapture” which was invented by some of the most imaginative interpretations of the bible). I don’t believe that it will be the final singularity unless we happen to make our species extinct (in which case there will most likely be another species to take over the Earth and have its own singularities).

Could we have an Open Computing Cloud?

One of the most interesting new technologies that has come out recently is Cloud Computing; the most popular instance seems to be Amazon EC2 (Elastic Compute Cloud). I think it would be good if there were some open alternatives to EC2.

Amazon charges $0.10 per compute hour for a virtual machine that has one Compute Unit (equivalent to a 1.0 to 1.2GHz 2007 Opteron core) and 1.7G of RAM. Competing with this will be hard, as it’s difficult to undercut 10 cents an hour ($0.10 × 24 hours × 365.25 days = $876.60 per annum) by enough to compensate for the great bandwidth that Amazon has on offer.

The first alternative that seems obvious is a cooperative model. In the past I’ve run servers for the use of friends in the free software community. It would be easy for me to do such things in future, and Xen makes this a lot easier than it used to be. If anyone wants a DomU for testing something related to Debian SE Linux then I can set one up in a small amount of time. If there was free software to manage such things then it would be practical to have some sort of share system for community members.

The next possibility is a commercial model. If I could get Xen to provide a single Amazon Compute Unit to one DomU (no less and no more) then I wouldn’t notice it on some of my Xen servers. 1.7G of RAM is a moderate amount, but as 3G seems to be typical for new desktop systems (Intel is still making chipsets that support a maximum of 4G of address space [2]; once you subtract the address space for video and PCI you effectively only get 3G) it would not be inconceivable to use 1.7G DomUs on idle desktop machines. But it’s probably more practical to have a model with less RAM. For my own use I run a number of DomUs with 256M of RAM for testing and development and the largest server DomU I run is 400M (that is for ClamAV, SpamAssassin, and WordPress). While providing 1.7G of RAM and 1CU for less than 10 cents an hour may be difficult, providing an option of 256M of RAM and 0.2CU (burstable to 0.5CU) for 2 cents an hour would give the same aggregate revenue for the hardware while also offering a cheaper service for people who want that. 2 cents an hour is more than the cost of some of the Xen server plans that ISPs offer [3] but if you only need a server for part of the time then it would have the potential to save some money.
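As a rough sketch of how such a slice might be carved out with Xen (the domain name, disk path, and numbers are all hypothetical and would need tuning), the credit scheduler can cap a DomU at a fraction of one core while still allowing bursts:

# /etc/xen/rental.cfg - a hypothetical 256M DomU (kernel/bootloader lines omitted)
name   = "rental"
memory = 256
vcpus  = 1
disk   = [ 'phy:/dev/vg0/rental,xvda,w' ]

# cap the domain at 50% of one physical core (the burst limit) and give it a low
# weight so that under contention with a default-weight domain it gets roughly a
# fifth of a core
xm sched-credit -d rental -w 64 -c 50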

For storage Amazon has some serious bandwidth inside its own network for transferring the image to the machine for booting. To do things on the cheap the way to go would be to create a binary diff of a common image. If everyone who ran virtual servers had images of the common configurations of the popular distributions then creating an image to boot would only require sending a diff (maybe something based on XDelta [4]). Transferring 1GB of filesystem image over most network links is going to be unreasonably time consuming; transferring a binary diff between an up to date CentOS or Debian install and a usable system image based on it (with all the updates applied) is going to be much faster.
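A sketch of how that could work with xdelta3 (the file names here are hypothetical, and both ends need an identical copy of the base image):

# on the customer's side: generate the small delta to upload
xdelta3 -e -s base-lenny.img custom.img custom.vcdiff

# on the hosting side: reconstruct the customised image from the locally held
# base image plus the uploaded delta
xdelta3 -d -s base-lenny.img custom.vcdiff custom.img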

Of course something like this would not be suitable for anything that requires security. But there are many uses for servers that don’t require much security.

Things you can do for your LUG

A Linux Users Group, like most volunteer organisations, will often have a small portion of the membership making most of the contributions. I believe that every LUG has many people who would like to contribute but don’t know how, so here are some suggestions for what you can do.

Firstly, offer talks. Many people seem to believe that giving a talk for a LUG requires expert knowledge. While it is desirable for experts in an area to share their knowledge, it is definitely not a requirement that you be an expert to give a talk. The only requirement is that you know more than the audience – and a small amount of research can achieve that goal.

One popular talk that is often given is “what’s new in Linux”. This is not a talk that requires deep knowledge, but it does require spending some time reading the news (which lots of people do for fun anyway). So if you spend an average of 30 minutes every weekday reading about new developments in Linux and other new technology, you could spend another minute a day (20 minutes a month) making notes and the result would be a 10 to 15 minute talk that would be well received. A talk about what’s new is one way that a novice can give a presentation that will get the attention of all the experts (who know their own area well but often don’t have time to see the big picture).

There are many aspects of Linux that are subtle, tricky, and widely misunderstood. Often mastering them is a matter that is more related to spending time testing than anything else. An example of this is the chmod command (and all the Unix permissions that are associated with it). I believe that the majority of Linux users don’t understand all the subtleties of Unix permissions (I have even seen an employee of a Linux vendor make an error in this regard while running a formal training session). A newbie who spent a few hours trying the various combinations of chmod etc and spoke about the results could give a talk that would teach something to almost everyone in the audience. I believe that there are many other potential talk topics of this nature.
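As one example of the sort of material such a talk could cover (worth verifying in a scratch directory before presenting), the interaction between directory permissions and file deletion surprises a lot of people:

mkdir /tmp/demo && cd /tmp/demo
touch secret && chmod 600 secret    # only the owner can read or write the file
chmod 777 .                         # but the directory is world-writable...
# ...so any local user can delete or rename "secret", because deletion is
# controlled by the directory's permissions, not the file's
chmod +t .                          # the sticky bit (as on /tmp) restricts deletion to each file's owner
ls -ld .                            # the mode now ends in "t": drwxrwxrwt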

One thing that is often overlooked when considering how to contribute to LUGs is the possibility of sharing hardware. We all have all the software we need for free but hardware still costs money. If you have some hardware that hasn’t been used for a year then consider whether you will ever use it again, if it’s not likely to be used then offer it to your LUG (either via a mailing list or by just bringing it to a meeting). Also if you see some hardware that is about to be discarded and you think that someone in your LUG will like it then grab it! In a typical year I give away a couple of car-loads of second-hand hardware, most of it was about to be thrown out by a client so I grab it for my local LUG. Taking such hardware reduces disposal costs for my clients, prevents computer gear from poisoning landfill (you’re not supposed to put it in the garbage but most people do), and helps random people who need hardware.

One common use for the hardware I give away is for children. Most people are hesitant to buy hardware specifically for children as it only takes one incident of playing with the switch labeled 240V/110V (or something of a similar nature) to destroy it. Free hardware allows children to get more access to computers at an early age.

Finally one way to contribute is by joining the committee. Many people find it difficult to attend meetings, so attending both a regular meeting and a committee meeting every month is a significant commitment. So if you have no problems attending meetings then please consider contributing in this way.

An Update on DKIM Signing and SE Linux Policy

In my previous post about DKIM [1] I forgot to mention one critical item, how to get Postfix to actually talk to the DKIM milter. This wasn’t a bad thing because it turned out that I hadn’t got it right.

I had configured the DKIM milter on the same line as the milters for ClamAV and SpamAssassin – in the smtpd_milters section. This was fine for relaying outbound mail via my server but didn’t work for locally generated mail. For locally generated mail Postfix has a directive named non_smtpd_milters which you need to use. So it seems that a fully functional Postfix DKIM milter configuration requires adding the following two lines to /etc/postfix/main.cf:

smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock
non_smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock

This also required an update to the SE Linux policy. When I was working on setting up DKIM I wrote SE Linux policy to allow it, along with policy for the ClamAV milter. That policy is now in Debian/Unstable and has been approved for Lenny. So I now need to build a new policy package that allows the non-smtpd parts of Postfix access to the DKIM milter socket and apply for it to be included in Lenny.
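For anyone who hits the same denial before an updated policy package is available, a local module can be generated in the usual way (a sketch only – the module name is arbitrary and the generated rules should be reviewed before loading):

grep dkim /var/log/audit/audit.log | audit2allow -m localdkim > localdkim.te
checkmodule -M -m -o localdkim.mod localdkim.te
semodule_package -o localdkim.pp -m localdkim.mod
semodule -i localdkim.pp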

SE Linux in Lenny is going to be really good. I think that I’ve already made the SE Linux support in the pre-release (*) of Lenny significantly better than Etch plus all my extra updates. More testers would be appreciated, and more people joining the coding would be appreciated even more.

(*) I use the term pre-release to refer to the fact that the Lenny repository is available for anyone to download packages.

Installing DKIM and Postfix in Debian

I have just installed DomainKeys Identified Mail (DKIM) [1] on my mail server. In summary the purpose is to allow public-key signing of all mail that goes out from your domain so that the recipient can verify its authenticity (and optionally reject forgeries). It also means that you can verify inbound mail. A weakness of DKIM is that it is based on the DNS system (which has many issues and will continue to have them until DNSSEC becomes widely adopted). But it’s better than nothing and it’s not overly difficult to install.

The first thing to do before installing DKIM is to get a Gmail account. Gmail gives free accounts and does DKIM checks. If you use Iceweasel or another well supported browser then you can click on “Show Details” from the message view which then gives fields “mailed-by” and “signed-by” which indicate the DKIM status. If you use a less supported browser such as Konqueror then you have to click on the “Show Original” link to see the headers and inspect the DKIM status there (you want to see dkim=pass in the Authentication-Results header). Also Gmail signs outbound mail so it can be used to test verification of received mail.
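For reference, a passing check in the raw headers looks roughly like this (a made-up example, Gmail’s exact formatting varies):

Authentication-Results: mx.google.com; dkim=pass header.i=@example.com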

The next thing to do is to install the DKIM Milter. To make things exciting this is packaged for Debian under the name dkim-filter so that reasonable searches for such functionality (such as a search for milter or dkim-milter – the upstream project name) will fail.
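Once you know the name, the installation itself is the usual one-liner:

apt-get install dkim-filter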

After installing the package you must generate a key, I used the command “dkim-genkey -d coker.com.au -s 2008” to generate a key for my domain. It seems that the domain is currently only used as a comment but I prefer to use all reasonable parameters for such things. The -s option is for a selector, which is a way of specifying multiple valid signing keys. It’s apparently fairly common to use a different key every year. But other options include having multiple mail servers for a domain and giving each one a selector. The dkim-genkey command produces two files, one is named 2008.txt and can be copied into a BIND zone file. The other is named 2008.private and is used by the DKIM signing server.
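The contents of 2008.txt will be something like the following (the public key is truncated here, and the exact set of tags depends on the dkim-milter version):

2008._domainkey IN TXT "v=DKIM1; g=*; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKB..."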

Here is a sample of the most relevant parts of the config file /etc/dkim-filter.conf for signing mail for a single domain:

Domain coker.com.au
KeyFile /etc/dkim/2008.private
Selector 2008

The file /etc/default/dkim-filter needs to be modified to specify how it will listen for connections from the MTA; I uncommented the line SOCKET="local:/var/run/dkim-filter/dkim-filter.sock".

One issue is that the Unix domain socket file will by default not be accessible to Postfix; I devised a work-around for this and documented it in Debian bug report #499364 [2] (I’ve hacked a chgrp command into the init script, ideally the GID would be an option in a config file).
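The hack amounts to something like the following after the daemon has created its socket (assuming Postfix runs with group postfix, and with a chmod in case the socket is not already group writable):

chgrp postfix /var/run/dkim-filter/dkim-filter.sock
chmod g+rw /var/run/dkim-filter/dkim-filter.sock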

A basic configuration of dkim-milter will sign mail for one domain. If you want to sign mail for more than one domain you have to comment out the configuration for a single domain in /etc/dkim-filter.conf and instead use the KeyList option to specify a file with a list of domains (the dkim-filter.conf(5) man page documents this). The one confusing issue is that the selector is taken to be the basename of the file which contains the secret key (they really should have added an extra field). This means that if you have an obvious naming scheme for selectors (such as the current year) then you need a different directory for each domain to contain the key.

As an example here is the line from the KeyList file for my domain:
*@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008
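If I signed for a second domain the file would gain another line pointing at a key stored under its own directory, for example (a hypothetical domain reusing the same selector name):

*@example.org:example.org:/etc/dkim/example.org/2008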

Now one problem that we have is that list servers will usually append text to the body of a message and thus break the signature. The correct way of solving this is to have the list server sign the mail it sends out and have a header indicating the signature status of the original message. But there are a lot of list servers that won’t be updated for a long time.

The work-around is to put the following line in /etc/dkim-filter.conf:
BodyLengths yes

This means that the signature will cover a specified number of bytes of body data, and any extra which is appended will be ignored when it comes time to verify the message. This means of course that a hostile third party could append some bogus data without breaking the signature. In the case of plain text this isn’t so bad, but when the recipient defaults to having HTML email it could have some interesting possibilities. I wonder whether it would be prudent to configure my MUA to always send both HTML and plain-text versions of my mail so that an attacker can’t append hostile HTML.

It’s a pity that Gmail (which appears to have the most popular implementation of DKIM) doesn’t allow setting that option. So far the only message I have received that failed DKIM checks was sent from a Gmail account to a Debian mailing list.

Ideally it would be possible to have the messages sent to mailing lists not be signed or have the length field used. That would require signing based on the recipient address, which is functionality that is not available in dkim-milter (but which could possibly be implemented via Postfix configuration, although I don’t know how). Implementing this would not necessarily require knowing all the lists that mail might be sent to; it seems that a large portion of the world’s list traffic is sent to addresses that match the string “@lists.” which can be easily recognised. For a service such as Gmail it would be easy to recognise list traffic from the headers of received messages and then treat messages sent to those domains differently.

As the signature status is based on the sending address it would be possible for me to use different addresses for sending to mailing lists to avoid the signatures (the number of email addresses in use in my domain is small enough that having a line for each address to sign will not be a great inconvenience). Most MUAs have some sort of functionality to automatically choose a sender address that is in some way based on the recipient address. I’ll probably do this eventually, but for the moment I’ll just use the BodyLengths option, while it does reduce the security a bit it’s still a lot better than having no DKIM checks.
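Under that scheme the KeyList would drop the wildcard and list each signing address explicitly (hypothetical addresses shown), so that mail from a lists-only alias matches no entry and goes out unsigned:

user1@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008
user2@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008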

Software has No Intrinsic Value

In a comment on my Not All Opinions Are Equal [1] post AlphaG said “Anonymous comments = free software, no intrinsic value as you got it for nothing”.

After considering the matter I came to the conclusion that almost all software has no intrinsic value (unless you count not being sued for copyright infringement as intrinsic value). When you buy software you generally don’t get a physical item (maybe a CD or DVD); to increase profit margins manuals aren’t printed for most software (it used to be that hefty manuals were shipped to give an impression that you were buying a physical object). Software usually can’t be resold (both due to EULA provisions and sites such as eBay not wanting to accept software for sale) and recently MS has introduced technical measures that prevent even using it on a different computer (which force legitimate customers to purchase more copies of Windows when they buy new hardware but don’t stop pirates from using it without paying). Even when software could be legally resold there were always new versions coming out which reduced the sale price to almost zero in a small amount of time.

The difference between free software and proprietary software in terms of value is that when you pay for free software you are paying for support. This therefore compels the vendor to provide good support that is worth the money. Vendors of proprietary software have no incentive to provide good support – at least not unless they are getting paid a significant amount of money on top of the license fees. This is why Red Hat keeps winning in the CIO Vendor Value Studies from CIO Insight [2]. Providing value is essential to the revenue of Red Hat, they need to provide enough value in RHEL support that customers will forgo the opportunity to use CentOS for free.

Thinking of software as having intrinsic value leads to the error of thinking of software purchases as investments. Software is usually outdated in a few years, as is the hardware that is used to run it. Money spent on software and hardware should be considered as being a tax on doing business. This doesn’t mean that purchases should be reduced to the absolute minimum (if systems run slowly they directly decrease productivity and also cause a loss of morale). But it does mean that hardware purchases should not be considered as investments – the hardware will at best be on sale cheap at an auction site in 3-5 years, and purchases of proprietary software are nothing but a tax.

Laptop Computer Features

It’s not easy to choose a laptop, and part of the problem is that most people don’t seem to start from how the laptop will be used. I believe that the following four categories cover the vast majority of the modern use of mobile computers.

  1. PDA [1] – can be held in one hand and generally uses a touch-screen. They generally do not resemble a laptop in shape or design.
  2. Subnotebook [2] AKA Netbook [3] – a very small laptop that is designed to be very portable. Typically weighing 1KG or less and having multiple design trade-offs to give light weight (such as a small screen and a slow CPU) while still being more powerful than a PDA. The EeePC [4] is a well known example.
  3. Laptop [5] – now apparently defined to mean a medium size portable computer, light enough to be carried around but with a big enough screen and keyboard that many people will be happy to use them for 8 hours a day. The word is also used to mean all portable computers that could possibly fit on someone’s lap.
  4. Desktop Replacement [6] – a big heavy laptop that is not carried much.

There is some disagreement about the exact number of categories and which category is most appropriate for each machine. There is a range of machines between the Subnotebook and Laptop categories. There is some amount of personal preference involved in determining which category a machine might fall in. For example I find a Thinkpad T series to fit into the “Laptop” category (and I expect that most people would agree with me). But when comparing the weight and height of an average 10yo child to an adult it seems that a 10yo would find an EeePC to be as much effort to carry as a T series Thinkpad is for an adult.

It seems to me that the first thing that you need to do when choosing a laptop is to decide which of the above categories is most appropriate. While the boundaries between the categories are blurry and to some extent a matter of personal preference, once you have made a firm decision on the category it’s an easy second step to determine which machines (in your opinion) fit it. It’s also possible to choose a half-way point; for example if you wanted something on the border of the “Laptop” and NetBook categories then a Thinkpad X series might do the job.

The next step of course is to determine which OSs and applications you want to run. There are some situations where the choice of OS and/or applications may force you to choose a category that has more powerful hardware (a CPU with more speed or features, more RAM, or more storage). For example a PDA generally won’t run a regular OS well (if at all) due to the limited options available for input devices and the very limited screen resolution. Even a NetBook has limitations as to what software runs well (for example many applications require a minimum resolution of 800×600 and don’t work well on an EeePC 701). Also Xen can not be used on the low-end CPUs used in some NetBooks which lack PAE.
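For reference, PAE support can be checked from the CPU flags before planning to use such a machine for Xen:

grep -qw pae /proc/cpuinfo && echo "PAE supported" || echo "PAE not supported"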

Once you have chosen a category you have to look for features which make sense for that category. A major criterion for a PDA is how fast you can turn it on; it should be possible to go from standby to full use in less than one second. Another major criterion is how long the battery lasts; it should compare to a mobile phone (several days on standby and 8 hours of active use). A criterion that is important to some people is the ability to use both portrait and landscape views for different actions (I use portrait for editing and landscape for reading).

A NetBook is likely to be used in many places and needs to have a screen that will work well in adverse lighting conditions (a shiny reflective screen is a really bad thing), it also needs to be reasonably resilient as it is going to get bumped if it is transported a lot (a solid state disk is a good feature). It should also be as light as possible while having enough hardware to run a regular OS (an EeePC 701 with 512M of RAM and 4G of storage is about the minimum hardware for running a regular distribution of Linux).

A desktop replacement needs to have all the features: lots of RAM, a fast CPU and video hardware, and a big screen – it also needs a good keyboard (test by typing your name several times). The “Laptop” category is much the same as the desktop replacement, but a bit smaller, a lot lighter, and with better battery life.

It seems very difficult to give any specific advice as to which laptop to buy when the person who wants the advice has not chosen a category (which is often the case).

Not All Opinions Are Equal

It seems to be a common idea among non-bloggers that the comment they enter on a blog is somehow special and should be taken seriously by the author of the blog (everyone is a legend in their own mind). In a recent discussion one anonymous commentator seemed offended that I didn’t take his comments seriously and didn’t understand why I would take little notice of an anonymous comment while taking note of a later comment on the same issue by the author of the project in question.

In most forums (and I use the term in the broadest way) an anonymous comment is taken with a weight that is close to zero. That doesn’t mean that it will be ignored, it just means that the requirement for providing supporting evidence or of having a special insight and explaining it is much greater.

One example of this is the comment weighting system used by Slashdot.org (AKA “/.”). The /. FAQ has a question “Why should I log in?” with the answer including “Posting in Discussions at Score:1 instead of Score:0 means twice as many people will see your comments” [1]. /. uses the term “Anonymous Coward” as the identification of users who are not logged in, this gives an idea of how they are regarded.

Advogato uses a rating method for blog posts which shows you only posts from blogs that you directly rank well or which match the trust metric (based on rankings of people you rank) [2].

I believe that the automated systems developed by /. and other online forums emulate to some extent the practices that occur off-line. For any discussion in a public place a comment from someone who does not introduce themself (or gives an introduction that gives no reason to expect quality) will be treated with much less weight than one from someone who is known. When someone makes a comment their background will be considered by people who hear it. If a comment is entirely a matter of opinion and can not be substantiated by facts and logical analysis then the acceptance of the comment is solely based on the background of the author (and little things like spelling errors can count against the author).

Therefore if you want your blog comments to be considered by blog authors and readers you need to make sure that you are known. Using your full name is one way of not being as anonymous but most names are not unique on the Internet (I’ve previously described some ways of ensuring that you beat other people with the same name in Google rankings [3]). The person who owns the blog can use the email address that is associated with the comment to identify the author (if it’s a real email address and it’s known by Google). But for other readers the only option is the “Website” field. The most common practice is to use the “Website” field in the comment to store the URL of your blog (most blog comments are written by bloggers). But there is nothing stopping you from using any other URL, if you are not a blogger and want to write comments on blogs you could create a personal web page to use for the comments. If the web page you use for such purposes gives links to references as to your relevant experience then that would help. Someone who has skills in several areas could create a web page for each one and reference the appropriate page in their comment.

One problem we face is that it is very easy to lie on the net. There is no technical obstacle to impersonation on the net, while I haven’t seen any evidence of people impersonating others in an attempt to add credibility to blog comments I expect it’s only a matter of time before that happens (I expect that people do it already but the evidence of them getting caught has not been published anywhere that I’ve read). People often claim university education to add weight to their comments (usually in email but sometimes in blog comments too). One problem with this is that anyone could falsely claim to have a university degree and no-one could disprove their claim without unreasonable effort; another is that a university degree actually doesn’t mean much (lots of people remain stupid after graduating). One way in which adding a URL to a comment adds weight is that for a small web site the author will check a reasonable portion of the sites that link to them, so if someone impersonates me and has a link to my web site in the comment then there’s a good chance that I will notice this.

OpenID [4] has the potential to alleviate this by making it more difficult to forge an association with a web site. One thing that I am working on is enabling OpenID on all the web sites that are directly associated with me. I plan to use a hardware device to authenticate myself with the OpenID server (so I can securely enter blog comments from any location). I expect that it will become the standard practice that comments will not be accepted by most blogs if they are associated with a URL that is OpenID enabled unless the author of the comment authenticates themself via OpenID.

Even when we get OpenID enabled everywhere there is still the issue of domain specific expertise. While I am well enough known for my work on SE Linux that most people will accept comments about it simply because I wrote them, the same can not be said for most topics that I write about. When writing about topics where I am not likely to be accepted as an expert I try and substantiate my main points with external web pages. Comments are likely to be regarded as spam if they have too many links so it seems best to only use one link per comment – which therefore has to be on an issue that is important to the conclusion and which might be doubted if evidence was not provided. The other thing that is needed is a reasonable chain of deduction. Simply stating your opinion means little, listing a series of logical steps that led you to the opinion and are based on provable facts will hold more weight.

These issues are not only restricted to blog comments, I believe that they apply (to differing degrees) to all areas of online discussion.

Execmod and SE Linux – i386 Must Die

I have previously written about the execmod permission check in SE Linux [1] and in a post about SE Linux on the desktop I linked to some bug reports about it [2] (which probably won’t be fixed in Debian).

One thing I didn’t mention is the proof of the implication of this. When running a program in the unconfined_t domain on a SE Linux system (the domain for login sessions on a default configuration), if you set the boolean allow_execmod then the following four tests from paxtest will be listed as vulnerable:

Executable bss (mprotect)
Executable data (mprotect)
Executable shared library bss (mprotect)
Executable shared library data (mprotect)

This means that if you have a single shared object which uses text relocations and therefore needs the execmod permission then the range of possible vectors for attack against bugs in the application has just increased by four. This doesn’t necessarily require that the library in question is actually being used either! If a program is linked against many shared objects that it might use, then even if it is not going to ever use the library in question it will still need execmod access to start and thus give extra possibilities to the attacker.
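One way to find out which of the shared objects a program links against actually have text relocations (mplayer is just used as an example binary here):

for lib in $(ldd /usr/bin/mplayer | awk '$3 ~ /\//{print $3}'); do
    readelf -d "$lib" 2>/dev/null | grep -q TEXTREL && echo "$lib has text relocations"
done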

For reference, when comparing a Debian system that doesn’t run SE Linux (or has SE Linux in permissive mode) to a SE Linux system with execmod enabled, the following additional tests fail (are reported as vulnerable) on the system without SE Linux protection:

Executable anonymous mapping (mprotect)
Executable heap (mprotect)
Executable stack (mprotect)
Writable text segments

If you set the booleans allow_execstack and allow_execheap then you lose those protections. But if you use the default settings of all three booleans then a program running in the unconfined_t domain will be protected against 8 different memory based attacks.
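To see the effect on a running system the booleans can be queried and toggled and paxtest re-run (add -P to setsebool to make the change persistent across reboots):

getsebool allow_execmod allow_execstack allow_execheap
setsebool allow_execmod=0 allow_execstack=0 allow_execheap=0
paxtest blackhat | grep -i vulnerable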

Based on discussions with other programmers I get the impression that fixing all the execmod issues on i386 is not going to be possible. The desire for a 15% performance boost (the expected result of using an extra register) is greater than the desire for secure systems among the people who matter most (the developers).

Of course we could solve some of these issues by using statically linked programs and have statically linked versions of the libraries in question which can use the extra register without any issues. This does of course mean that updates to the libraries (including security updates) will require rebuilding the statically linked applications in question – if a rebuild was missed then this could reduce the security of the system.

To totally resolve that issue we need to have i386 machines (the cause of the problem due to their lack of registers) go away. Fortunately in the mainstream server, desktop, and laptop markets that is already pretty much done. I’m still running a bunch of P3 servers (and I know many other people who have similar servers), but they are not used for tasks that involve running programs that are partially written in assembly code (mplayer etc).

One problem is that there are still new machines being released with the i386 ISA as the only instruction set. For example the AMD Geode CPU [2] is used by the One Laptop Per Child (OLPC) project [3] and the new Intel Atom line of CPUs [4] apparently only supports the AMD64 ISA on the “desktop” models and the versions used in ultra-mobile PCs are i386 only.

I think that these issues are particularly difficult in regard to the OLPC. It’s usually not difficult to run “yum upgrade” or “apt-get dist-upgrade” on an EeePC or similar ultra-mobile PC. But getting an OLPC machine upgraded in some of the remote places where they may be deployed might be more difficult. Past experience has shown that viruses and trojans can get to machines that are supposed to be on isolated networks, so it seems that malware can get access to machines that can not communicate with servers that contain security patches… One mitigating factor however is that the OLPC OS is based on Fedora, and Fedora seems to be making the strongest efforts to improve security of any mainstream distribution; a choice between 15% performance and security seems to be a no-brainer for the Fedora developers.