An Update on DKIM Signing and SE Linux Policy

In my previous post about DKIM [1] I forgot to mention one critical item: how to get Postfix to actually talk to the DKIM milter. That turned out to be just as well, because I hadn’t got it right.

I had configured the DKIM milter on the same line as the milters for ClamAV and SpamAssassin – in the smtpd_milters section. This was fine for relaying outbound mail via my server but didn’t work for locally generated mail, for which Postfix has a separate directive named non_smtpd_milters. So it seems that a fully functional Postfix DKIM milter configuration requires adding the following two lines to /etc/postfix/main.cf:

smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock
non_smtpd_milters = unix:/var/run/dkim-filter/dkim-filter.sock
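
After editing main.cf it’s worth checking that the new values took effect and telling Postfix to re-read its configuration. Something like the following should do it (a reload should be enough, a full restart isn’t needed):

postconf smtpd_milters non_smtpd_milters
postfix reload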

This also required an update to the SE Linux policy. When I was setting up DKIM I wrote SE Linux policy to allow it, along with policy for the ClamAV milter. That policy is now in Debian/Unstable and has been approved for Lenny. So I now need to build a new policy package that allows the processes covered by non_smtpd_milters to access the DKIM milter socket and apply for it to be included in Lenny.
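
Until the new policy package is available anyone who hits this will see AVC denial messages, which can be turned into a local policy module in the usual way – roughly like this (the module name localdkim is just an example, and if auditd isn’t running the messages will be in the kernel log instead):

grep avc /var/log/audit/audit.log | audit2allow -M localdkim
semodule -i localdkim.pp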

SE Linux in Lenny is going to be really good. I think that I’ve already made the SE Linux support in the pre-release (*) of Lenny significantly better than Etch plus all my extra updates. More testers would be appreciated, and more people joining the coding would be appreciated even more.

(*) I use the term pre-release to refer to the fact that the Lenny repository is available for anyone to download packages.

Play Machine Update

My Play Machine [1] was offline for most of the past 48 hours (it’s up again now). I have upgraded the hardware for the Dom0 used to run it so that it now has the ability to run more DomUs. I can now run at least 5 DomUs while previously I could only run 3. I have several plans that involve running multiple Play Machines with different configurations and using them for SE Linux training.

The upgrade didn’t need to take two days, but I had some other things that diverted me during the middle of the job (running the Play Machine isn’t my highest priority). I’ve been doing some significant updates to the SE Linux policy for Lenny including some important changes to the policy related to mail servers. Among other things I created a new domain for DKIM (which I previously wrote about) [2]. The chain of dependencies was that a client wanted me to do some urgent DKIM work and I needed my own mail server to be a test-bed. I installed DKIM and then of course I had to write the SE Linux policy. Now that my client’s network is running the way it should be I’ve got a little more time for other SE Linux work.

Installing DKIM and Postfix in Debian

I have just installed DomainKeys Identified Mail (DKIM) [1] on my mail server. In summary the purpose is to allow public-key signing of all mail that goes out from your domain so that the recipient can verify its authenticity (and optionally reject forgeries). It also means that you can verify inbound mail. A weakness of DKIM is that it is based on the DNS system (which has many issues and will continue to have them until DNSSEC becomes widely adopted). But it’s better than nothing and it’s not overly difficult to install.

The first thing to do before installing DKIM is to get a Gmail account. Gmail gives free accounts and does DKIM checks. If you use Iceweasel or another well supported browser then you can click on “Show Details” from the message view which then gives fields “mailed-by” and “signed-by” which indicate the DKIM status. If you use a less supported browser such as Konqueror then you have to click on the “Show Original” link to see the headers and inspect the DKIM status there (you want to see dkim=pass in the Authentication-Results header). Also Gmail signs outbound mail so it can be used to test verification of received mail.
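
For reference, the header added by Gmail looks roughly like this (the exact fields vary):

Authentication-Results: mx.google.com; dkim=pass header.i=@coker.com.au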

The next thing to do is to install the DKIM Milter. To make things exciting this is packaged for Debian under the name dkim-filter so that reasonable searches for such functionality (such as a search for milter or dkim-milter – the upstream project name) will fail.

After installing the package you must generate a key. I used the command “dkim-genkey -d coker.com.au -s 2008” to generate a key for my domain. It seems that the domain is currently only used as a comment but I prefer to use all reasonable parameters for such things. The -s option is for a selector, which is a way of specifying multiple valid signing keys. It’s apparently fairly common to use a different key every year, but other options include having multiple mail servers for a domain and giving each one a selector. The dkim-genkey command produces two files: one is named 2008.txt and can be copied into a BIND zone file, the other is named 2008.private and is used by the DKIM signing server.
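
The contents of 2008.txt are a TXT record roughly like the following (the p= value is the base64 encoded public key, truncated here), which goes in the zone for the signing domain:

2008._domainkey IN TXT "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3..."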

Here is a sample of the most relevant parts of the config file /etc/dkim-filter.conf for signing mail for a single domain:

Domain coker.com.au
KeyFile /etc/dkim/2008.private
Selector 2008

The file /etc/default/dkim-filter needs to be modified to specify how the milter will listen for connections from the MTA. I uncommented the line SOCKET=”local:/var/run/dkim-filter/dkim-filter.sock”.

One issue is that the Unix domain socket file will by default not be accessible to Postfix. I devised a work-around for this and documented it in Debian bug report #499364 [2] (I’ve hacked a chgrp command into the init script – ideally the GID would be an option in a config file).
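
For illustration, the work-around amounts to something like the following being run by the init script after the daemon has created its socket (the exact commands in the bug report may differ slightly):

chgrp postfix /var/run/dkim-filter/dkim-filter.sock
chmod g+w /var/run/dkim-filter/dkim-filter.sock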

A basic configuration of dkim-milter will sign mail for one domain. If you want to sign mail for more than one domain you have to comment out the configuration for a single domain in /etc/dkim-filter.conf and instead use the KeyList option to specify a file with a list of domains (the dkim-filter.conf(5) man page documents this). The one confusing issue is that the selector is taken to be the basename of the file which contains the secret key (they really should have added an extra field). This means that if you have an obvious naming scheme for selectors (such as the current year) then you need a different directory for each domain to contain the key.

As an example here is the line from the KeyList file for my domain:
*@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008
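
If I was signing for a second domain (example.com below is just a placeholder) the file would look something like this, with a separate directory per domain so that the selector/basename can stay the same:

*@coker.com.au:coker.com.au:/etc/dkim/coker.com.au/2008
*@example.com:example.com:/etc/dkim/example.com/2008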

Now one problem that we have is that list servers will usually append text to the body of a message and thus break the signature. The correct way of solving this is to have the list server sign the mail it sends out and have a header indicating the signature status of the original message. But there are a lot of list servers that won’t be updated for a long time.

The work-around is to put the following line in /etc/dkim-filter.conf:
BodyLengths yes

This means that the signature will cover a specified number of bytes of body data, and anything appended later will be ignored when it comes time to verify the message. Of course a hostile third party could then append some bogus data without breaking the signature. In the case of plain text this isn’t so bad, but when the recipient defaults to having HTML email it could have some interesting possibilities. I wonder whether it would be prudent to configure my MUA to always send both HTML and plain-text versions of my mail so that an attacker can’t append hostile HTML.
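
The visible effect of that option is an l= tag in the DKIM-Signature header giving the number of body bytes that were signed, roughly like this (the length and the signature values are placeholders):

DKIM-Signature: v=1; a=rsa-sha256; d=coker.com.au; s=2008; l=1234; bh=...; b=...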

It’s a pity that Gmail (which appears to have the most popular implementation of DKIM) doesn’t allow setting that option. So far the only message I have received that failed DKIM checks was sent from a Gmail account to a Debian mailing list.

Ideally it would be possible to have the messages sent to mailing lists not be signed or have the length field used. That would require signing practices based on the recipient, which is functionality that is not available in dkim-milter (but which could possibly be implemented in the Postfix configuration, although I don’t know how). Implementing this would not necessarily require knowing all the lists that mail might be sent to; it seems that a large portion of the world’s list traffic is sent to addresses that match the string “@lists.” which can be easily recognised. For a service such as Gmail it would be easy to recognise list traffic from the headers of received messages and then treat messages sent to those addresses differently.

As the signature status is based on the sending address it would be possible for me to use different addresses for sending to mailing lists to avoid the signatures (the number of email addresses in use in my domain is small enough that having a line for each address to sign will not be a great inconvenience). Most MUAs have some sort of functionality to automatically choose a sender address that is in some way based on the recipient address. I’ll probably do this eventually, but for the moment I’ll just use the BodyLengths option – while it does reduce the security a bit it’s still a lot better than having no DKIM checks.

Debugging as a Demonstration Sport

I was watching So You Think You Can Dance [1] and thinking about the benefits that it provides to the dancing industry. The increase in public appreciation for the sport will increase the amount of money that is available to professionals, and getting more people interested in dancing as a profession will increase the level of skill in the entire industry. While the show hasn’t interested me much (I prefer to watch dancing in the context of music videos and avoid the reality TV aspect) I appreciate what it is doing. On a more general note I think that anything which promotes interest in the arts is a good thing.

I have been wondering whether similar benefits can be provided to the IT industry through competitions. There are some well established programming contests aimed at university level students and computer conferences often have contests. But the down-side of them in terms of audience interest is that they are either performed in exam conditions or performed over the course of days – neither of which makes for good viewing. The audience interaction is generally limited to the award ceremony and maybe some blog posts by the winners explaining their work.

There are a number of real-world coding tasks that can be performed in a moderate amount of time. One example is debugging certain classes of bugs – memory leaks, SEGVs, and certain types of performance and reliability problems. Another is fixing man pages.

A way of running such a contest might be to have a dozen contestants on stage with their laptops connected to a KVM switch. They could choose tasks from the bug list of their favorite distribution, and when they completed a task (built a deb or rpm package with the bug fixed and updated the bug report with a patch) they could request to have their KVM port selected and their microphone enabled so that they could explain to the audience what they did.

Points would be awarded based on the apparent difficulty of the bug and the clarity of the explanation to the audience. A major aim of such an exercise would be to encourage members of the audience to spend some of their spare time fixing bugs!

Basically it would be a public Bug Squashing Party (BSP) but with points awarded and some minor prizes (it would be best to avoid significant prizes as that can lead to hostility).

RSS Aggregation Software

The most commonly installed software for aggregating RSS feeds seems to be Planet and Venus (two forks of the same code base). The operation is that a cron job runs the Python program which syndicates a list of RSS feeds and generates a static web page. Of course the problems start if you have many feeds as polling each of them (even the ones that typically get updated at most once a week) can take a while. My experience with adding moderate numbers of feeds (such as all the feeds used by Planet Debian [1]) is that it can take as much as 30 minutes to poll them all – which will be a problem if you want frequent updates.

Frequent polling is not always desirable – it means more network load and a greater incidence of transient failures. Any error in updating a feed will (in a default configuration) result in an error message from Planet, which in turn results in cron sending an email to the sysadmin. Even with an RSS feed being checked every four hours (which is what I do for my personal Planet installations) it can still be annoying to get the email when someone’s feed is offline for a day.

Now while there is usually no benefit in polling every 15 minutes (the most frequent poll time that is commonly used) there is one good reason for doing it if polling is your only mechanism. The fact that some people want to click reload on the Planet web page every 10 minutes to look for new posts is not a good reason (it’s like looking in the fridge every few minutes and hoping that something tasty will appear). The good reason for polling frequently is to allow timely retraction of posts. It’s not uncommon for bloggers to fail to adequately consider the privacy implications of their posts (let’s face it – professional journalists have a written code of ethics about this, formal training, and an editorial board, and they still get it wrong on occasion – it’s not easy). So when a mistake is made about what personal data should be published in a blog post it’s best for everyone if the post can be amended quickly. The design of Planet is that when a post disappears from the RSS feed it also disappears from the Planet web page – I believe that this was deliberately done for the purpose of removing such posts.

The correct solution to the problem of amending or removing posts is to use the “Update Services” part of the blog server configuration to have it send an XML-RPC ping to the syndication service. That can give an update rapidly (in a matter of seconds) without any polling.
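
For reference, such a ping is a trivial XML-RPC request, roughly the following POSTed to the aggregator’s RPC URL (the blog name and URLs here are placeholders):

<?xml version="1.0"?>
<methodCall>
<methodName>weblogUpdates.ping</methodName>
<params>
<param><value>Example Blog</value></param>
<param><value>http://blog.example.com/</value></param>
</params>
</methodCall>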

I believe that a cron job is simply the wrong design for a modern RSS syndication service. This is no criticism of Planet (which has been working well for years for many people) but is due to the more recent requirements of more blogs, more frequent posting, and greater importance attached to blogs.

I believe that the first requirement for a public syndication service is that every blogger gets to specify the URL of their own feed to save the sysadmin the effort of doing routine URL changes. It should be an option to have the server act on HTTP 301 codes and record the new URL in the database. Then the sysadmin would only have to manage adding new bloggers (approving them after they have created an account through a web-based interface) and removing bloggers.

The problem of polling frequency can be mostly solved by using RPC pings to inform the server of new posts, provided that the RPC mechanism also supports removing posts. If removing posts is not supported then every blog which has an active post would have to be polled frequently, but even that would reduce the amount of polling considerably. For example there are 319 blogs that are currently syndicated on Planet Debian, there are 60 posts in the feed, and those posts were written by 41 different people. So if frequent polling to detect article removal was only performed for blogs with active articles (you poll the blogger’s feed URL, not the individual article) that would mean 41 polls instead of 319 – reducing the polling by a factor of more than 7!

Now even with support for RPC pings there is still a need to poll feeds. One issue is that blog servers may experience temporary technical difficulties in sending the RPC, and we don’t want to compel the authors of blog software to try to make the ping as reliable a process as sending email (if that was the requirement then a ping via email might be the best solution). The polling frequency could be implemented on a per-blog basis based on the request of the blogger and on the blog’s availability and posting frequency. Someone whose blog has been down for a day (which is not uncommon when considering a population of 300 bloggers) could have their blog polled on a daily basis. Apart from that the polling frequency could be based on the time since the last post. It seems to be a general pattern that hobby bloggers (who comprise the vast majority of bloggers syndicated in Planet installations) often go for weeks at a time with no posts and then release a series of posts when they feel inspired.

In terms of software which meets these requirements, the nearest option seems to be the Advogato software mod_virgule [2]. Advogato [3] supports managing accounts with attached RSS feeds and also supports ranking blogs for a personalised view. A minor modification of that code to limit who gets to have their blog archived, and to fix it so that a modified post only has the latest version stored (not both versions as Advogato does), would satisfy some of these requirements. One problem is that Advogato’s method of syndicating blogs is to keep an entire copy of each blog (and all revisions). This goes against the demands of many bloggers who insist that Planet installations not keep copies of their content for a long period and not have any permanent archives. Among other things if there are two copies of a blog post then Google might get the wrong idea as to which is the original.

Does anyone know of a system which does better than Advogato in meeting these design criteria?

Software has No Intrinsic Value

In a comment on my Not All Opinions Are Equal [1] post AlphaG said “Anonymous comments = free software, no intrinsic value as you got it for nothing”.

After considering the matter I came to the conclusion that almost all software has no intrinsic value (unless you count not being sued for copyright infringement as intrinsic value). When you buy software you generally don’t get a physical item beyond perhaps a CD or DVD – to increase profit margins manuals aren’t printed for most software (it used to be that hefty manuals were shipped to give the impression that you were buying a physical object). Software usually can’t be resold (both due to EULA provisions and sites such as eBay not wanting to accept software for sale) and recently MS has introduced technical measures that prevent even using it on a different computer (which force legitimate customers to purchase more copies of Windows when they buy new hardware but don’t stop pirates from using it without paying). Even when software could be legally resold there were always new versions coming out which reduced the sale price to almost zero in a small amount of time.

The difference between free software and proprietary software in terms of value is that when you pay for free software you are paying for support. This therefore compels the vendor to provide good support that is worth the money. Vendors of proprietary software have no incentive to provide good support – at least not unless they are getting paid a significant amount of money on top of the license fees. This is why Red Hat keeps winning in the CIO Vendor Value Studies from CIO Insight [2]. Providing value is essential to the revenue of Red Hat, they need to provide enough value in RHEL support that customers will forgo the opportunity to use CentOS for free.

Thinking of software as having intrinsic value leads to the error of thinking of software purchases as investments. Software is usually outdated in a few years, as is the hardware that is used to run it. Money spent on software and hardware should be considered as being a tax on doing business. This doesn’t mean that purchases should be reduced to the absolute minimum (if systems run slowly they directly decrease productivity and also cause a loss of morale). But it does mean that hardware purchases should not be considered as investments – the hardware will at best be on sale cheap at an auction site in 3-5 years, and purchases of proprietary software are nothing but a tax.

A Revolution Done Right

Amaya writes about the fact that the political process in many countries is extremely flawed and is failing their citizens [1] (although she doesn’t actually express it in that way). She asks how a revolution can be done right.

If we look at the historical record: after the French Revolution came the Reign of Terror [2]; after the English Civil War Oliver Cromwell [3] took power, and his actions are widely regarded as genocidal; and as for the Chinese and Russian revolutions, it seems that the majority of the population didn’t benefit much (if at all) from them. Generally the only times that a revolution seems to give a good result are when the situation was really bad before AND when the government failed at basic measures such as ensuring food supplies.

The independence of the Indian sub-continent, which derived from Gandhi’s work, can be used as a counter-example. However the ongoing low-level warfare between India and Pakistan is due to a failure of the process.

It seems to me that the required first step towards changing a rotten political system with a minimum of bloodshed is to improve communications. If the majority of the citizens know what is really happening in their own country, how their standards of living compare with those in other countries, and what deals are made between their government and the governments of other countries then they can attempt to work out the best way to improve things.

The free software community is already doing a good job of facilitating communications. The key areas are to have computers that act on behalf of their users (not using proprietary implementations and Digital Restrictions Management to make them act on behalf of corporations and the state), to support strong encryption with public implementations, to be generally as secure as possible, and to run on the cheapest possible hardware so that everyone gets access.

Update: Corrected the spelling of Gandhi [4] – thanks Rick Moen.

The Problem is Too Many Remote Controls

I am often asked for advice about purchasing TVs and consumer electronics. Not that I am any great expert in those areas, but my general experience in specifying and purchasing electronic goods related to my computer work does translate to other areas (and I know where to find advice on the net).

As part of this I get to closely observe what happens when people install new home entertainment systems. I have not observed anyone using anywhere near the full functionality of their system – even use of the bare minimum functionality in all areas is rare.

I believe that the first problem with this is the input devices. A remote control is designed around the idea of one button for each high-level command, which can be compared to languages such as Chinese and Japanese that use roughly one character per word. In terms of language evolution it seems that the benefits of building words from a small set of characters became apparent and widely accepted thousands of years ago.

Now for a simple input device having one button for each high-level operation makes sense. For the basic functions of a VCR you have PLAY, STOP, Fast-Forward, Rewind, Record, and Eject – 6 buttons is quite reasonable. But then you want slightly advanced features such as “record from 8PM to 9PM on channel A and then record from 10PM to 11PM on channel B” and things become unreasonably difficult. More than 10 years ago I was recommending that people just buy 5 hour tapes and press record before leaving home – getting the full functionality out of a VCR was just too hard. Jokes are often made about people who leave their VCR flashing 12:00 (because it’s too difficult to set the time); I only set the time on a VCR to stop the flashing (flashing lights annoy me).

Since programmable VCRs became popular things have only continued to develop. Now it’s not uncommon to have separate remote controls for the TV, VCR, DVD player, and the Cable TV box – a total of four remote controls! This is obviously a problem, and the solution that some people are adopting is to have a single more complex remote control – this is an example of problems not being solved by the same type of thinking that caused them.

One of the attempts to solve this is to have everything controlled by a PVR [1]. This means that you have one device recording content (maybe from multiple sources or multiple channels), playing recorded content or live content, and maybe playing DVDs and CDs. Of course then you have a complex remote control for that device which just shifts the problem.

To solve these problems we need to consider the entire problem space from a high-level. We start with the signal that is displayed by the TV, it can come from cable TV, digital TV, analogue TV, VCR, or DVD – these input sources have names such as “Composite”, “RCA1”, “RCA2”, “RCA3”, “DTV”, and “ATV”. Often people have written instructions near their TV to map these names to what they do.

Obviously the TV needs to be able to be programmed with human friendly names, and as these names are not of much use on their own it should be possible to use compound names and abbreviations. If I want to watch ABC (the Australian Government sponsored TV channel) then I would rather type “ABC” on a keyboard and then have the entertainment system determine whether that maps to “cable:abc”, “dtv:abc”, or “atv:abc” depending on what options are available. The current process of choosing an input source (such as RCA1 mapping to cable TV) and then choosing a channel (102 mapping to ABC) means among other things that it is essentially impossible for a guest to control the TV.

A further problem is the lack of available output. While it might seem that a large-screen HDTV has adequate output available, it’s often the case that you don’t want to stop watching one show while trying to find another. When something of wide interest is happening (another war or an election) it’s common for several people in one room to want to watch the news. Having everyone stop watching while someone goes through menus to find a different news channel is not well accepted. It seems to me that we need a separate output mechanism for controlling the entire system from that which is used for the main display.

This of course requires integration between all parts of the entertainment system, which shouldn’t be that difficult given the complexity of all the components (every one of which has more compute power than a typical server of 20 years ago). It is currently quite common for PVRs and DVD players to support playing videos from USB and SD devices, so the next logical step is to get an Ethernet port in each device (and maybe have Ethernet switches built in to some of the high-end entertainment hardware). Then XML transferred over HTTP could be used for controlling the different components from a single web server which provides a web-based GUI. While a random guest would not get much functionality out of my TV configuration (or that of most homes where I have assisted in the configuration of new hardware), they should be able to use a web-based GUI with ease.
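
As a rough illustration of what that could look like, a remote control device might send something like the following (the host name, path, and XML are all made up for the sake of the example – no current product implements this):

curl -H 'Content-Type: text/xml' -d '<command><source>dtv:abc</source></command>' http://pvr.local/control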

For controlling the entire system a NetBook [2] computer such as the EeePC should do well. As high-end TVs cost over $5,000, an EeePC which costs $500 (list price – surely less in bulk) could easily be bundled without much impact on the overall price, in the same way that companies selling computers in the $1,000,000 price range used to bundle $10,000 workstations to manage them. An EeePC has about the same size and mass as four remote controls (so it wouldn’t be inconvenient to use). Also the built-in wifi could be used for the remote control (wires are inconvenient and Infra-Red has limited bandwidth so probably wouldn’t provide a good web based GUI). Also someone who wanted to save some money could instead choose to use an old laptop on a coffee table (any web browser would do). I have deployed Linux desktop machines for some people who had no trouble using them, but who then had trouble using TVs that I configured for them – so I conclude that a modern Linux distribution is significantly more user friendly than the equipment that you find in a modern lounge room.

Cable TV companies all seem to be ISPs and often provide compelling value as an ISP service if you want to use the TV service. So it seems that the “Cable Modem” could be built in to the Cable TV box for added savings in hardware purchase and fewer boxes to manage in the home. This of course would increase the value of a NetBook [2] as a remote control as it could also be used for general Internet access at the same time. TV shows often promote their web sites to their customers and TV advertising also often includes a URL. If the URLs were distributed by VideoText then that would provide more information for the viewers, a better reach for advertisers and the people who create the TV shows, and when it became popular it would save us from those stupid scrolling news tickers that appear at the bottom of most cable news shows!

Some of these benefits are currently achieved by people running MythTV. The first problem with Myth is that it is still largely stuck in the bad old UI paradigm (which of course can be fixed as it’s free software). The next problem is that getting a result of comparable quality to the dedicated hardware is expensive, difficult, or both. A regular PC makes more noise than you want in your lounge room and configuring a Myth machine for your choice of DTV hardware and the frequencies used in your location is a pain. You can buy a preconfigured machine which solves these problems, but it will be more expensive. For most lounge rooms the cheap Chinese hardware is what will be used.

Ultimately I believe that TV will be killed by the Internet. The range of content on the net is increasing and the rate of increase is accelerating. TV seems to have not made any significant changes since the introduction of cable and satellite TV (both of which happened a long time ago and were not significant changes anyway). But I don’t expect it to happen soon. I predict another 10 years of the current TV business model. I believe that better integration of home entertainment hardware, so that it can obey simpler commands from the user in a more friendly manner while not getting in the way of the main purpose (displaying the TV picture), has the potential to extend the life of the current TV business model.

Not that I care whether the TV industry dies sooner or later. I just want to escape from having to provide phone-support to people who can’t get their TV to work correctly!

Laptop Computer Features

It’s not easy to choose a laptop, and part of the problem is that most people don’t seem to start from the use of the laptop. I believe that the following four categories cover the vast majority of the modern use of mobile computers.

  1. PDA [1] – can be held in one hand and generally uses a touch-screen. They generally do not resemble a laptop in shape or design.
  2. Subnotebook [2] AKA Netbook [3] – a very small laptop that is designed to be very portable. Typically weighing 1kg or less and having multiple design trade-offs to give light weight (such as a small screen and a slow CPU) while still being more powerful than a PDA. The EeePC [4] is a well known example.
  3. Laptop [5] – now apparently defined to mean a medium size portable computer, light enough to be carried around but with a big enough screen and keyboard that many people will be happy to use them for 8 hours a day. The word is also used to mean all portable computers that could possibly fit on someone’s lap.
  4. Desktop Replacement [6] – a big heavy laptop that is not carried much.

There is some disagreement about the exact number of categories and which category is most appropriate for each machine. There is a range of machines between the Subnotebook and Laptop categories. There is some amount of personal preference involved in determining which category a machine might fall in. For example I find a Thinkpad T series to fit into the “Laptop” category (and I expect that most people would agree with me). But when comparing the weight and height of an average 10yo child to an adult it seems that a 10yo would find an EeePC to be as much effort to carry as a T series Thinkpad is for an adult.

It seems to me that the first thing you need to do when choosing a laptop is to decide which of the above categories is most appropriate. While the boundaries between the categories are blurry and to some extent a matter of personal preference, once you have made a firm decision on the category it’s an easy second step to determine which machines fit it (in your opinion). It’s also possible to choose a half-way point, for example if you wanted something on the border of the “Laptop” and NetBook categories then a Thinkpad X series might do the job.

The next step of course is to determine which OSs and applications you want to run. There are some situations where the choice of OS and/or applications may force you to choose a category that has more powerful hardware (a CPU with more speed or features, more RAM, or more storage). For example a PDA generally won’t run a regular OS well (if at all) due to the limited options available for input devices and the very limited screen resolution. Even a NetBook has limitations as to what software runs well (for example many applications require a minimum resolution of 800×600 and don’t work well on an EeePC 701). Also Xen can not be used on the low-end CPUs used in some NetBooks which lack PAE.
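
A quick way of checking is to look for the pae flag in /proc/cpuinfo – no output means the CPU lacks PAE:

grep pae /proc/cpuinfo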

Once you have chosen a category you have to look for features which make sense for that category. A major criterion for a PDA is how fast you can turn it on – it should be possible to go from standby to full use in less than one second. Another major criterion is how long the battery lasts – it should compare to a mobile phone (several days on standby and 8 hours of active use). A criterion that is important to some people is the ability to use both portrait and landscape views for different actions (I use portrait for editing and landscape for reading).

A NetBook is likely to be used in many places and needs to have a screen that will work well in adverse lighting conditions (a shiny reflective screen is a really bad thing), it also needs to be reasonably resilient as it is going to get bumped if it is transported a lot (a solid state disk is a good feature). It should also be as light as possible while having enough hardware to run a regular OS (an EeePC 701 with 512M of RAM and 4G of storage is about the minimum hardware for running a regular distribution of Linux).

A desktop replacement needs to have all the features: lots of RAM, a fast CPU and video hardware, and a big screen – it also needs a good keyboard (test by typing your name several times). The “Laptop” category is much the same as the desktop replacement, but a bit smaller, a lot lighter, and with better battery life.

It seems very difficult to give any specific advice as to which laptop to buy when the person who wants the advice has not chosen a category (which is often the case).

Not All Opinions Are Equal

It seems to be a common idea among non-bloggers that the comment they enter on a blog is somehow special and should be taken seriously by the author of the blog (everyone is a legend in their own mind). In a recent discussion one anonymous commenter seemed offended that I didn’t take his comments seriously and didn’t understand why I would take little notice of an anonymous comment while taking note of a later comment on the same issue by the author of the project in question.

In most forums (and I use the term in the broadest way) an anonymous comment is taken with a weight that is close to zero. That doesn’t mean that it will be ignored, it just means that the requirement for providing supporting evidence or of having a special insight and explaining it is much greater.

One example of this is the comment weighting system used by Slashdot.org (AKA “/.”). The /. FAQ has a question “Why should I log in?” with the answer including “Posting in Discussions at Score:1 instead of Score:0 means twice as many people will see your comments” [1]. /. uses the term “Anonymous Coward” as the identification of users who are not logged in, this gives an idea of how they are regarded.

Advogato uses a rating method for blog posts which shows you only posts from blogs that you directly rank well or which match the trust metric (based on rankings of people you rank) [2].

I believe that the automated systems developed by /. and other online forums emulate to some extent the practices that occur off-line. For any discussion in a public place a comment from someone who does not introduce themselves (or gives an introduction that gives no reason to expect quality) will be treated with much less weight than one from someone who is known. When someone makes a comment their background will be considered by people who hear it. If a comment is entirely a matter of opinion and can not be substantiated by facts and logical analysis then the acceptance of the comment is solely based on the background of the author (and little things like spelling errors can count against the author).

Therefore if you want your blog comments to be considered by blog authors and readers you need to make sure that you are known. Using your full name is one way of not being as anonymous but most names are not unique on the Internet (I’ve previously described some ways of ensuring that you beat other people with the same name in Google rankings [3]). The person who owns the blog can use the email address that is associated with the comment to identify the author (if it’s a real email address and it’s known by Google). But for other readers the only option is the “Website” field. The most common practice is to use the “Website” field in the comment to store the URL of your blog (most blog comments are written by bloggers). But there is nothing stopping you from using any other URL, if you are not a blogger and want to write comments on blogs you could create a personal web page to use for the comments. If the web page you use for such purposes gives links to references as to your relevant experience then that would help. Someone who has skills in several areas could create a web page for each one and reference the appropriate page in their comment.

One problem we face is that it is very easy to lie on the net. There is no technical obstacle to impersonation on the net, while I haven’t seen any evidence of people impersonating others in an attempt to add credibility to blog comments I expect it’s only a matter of time before that happens (I expect that people do it already but the evidence of them getting caught has not been published anywhere that I’ve read). People often claim university education to add weight to their comments (usually in email but sometimes in blog comments too). One problem with this is that anyone could falsely claim to have a university degree and no-one could disprove their claim without unreasonable effort, another is that a university degree actually doesn’t mean much (lots of people remain stupid after graduating). One way in which adding a URL to a comment adds weight is that for a small web site the author will check a reasonable portion of the sites that link to them, so if someone impersonates me and has a link to my web site in the comment then there’s a good chance that I will notice this.

OpenID [4] has the potential to alleviate this by making it more difficult to forge an association with a web site. One thing that I am working on is enabling OpenID on all the web sites that are directly associated with me. I plan to use a hardware device to authenticate myself with the OpenID server (so I can securely enter blog comments from any location). I expect that it will become the standard practice that comments will not be accepted by most blogs if they are associated with a URL that is OpenID enabled unless the author of the comment authenticates themself via OpenID.

Even when we get OpenID enabled everywhere there is still the issue of domain specific expertise. While I am well enough known for my work on SE Linux that most people will accept comments about it simply because I wrote them, the same can not be said for most topics that I write about. When writing about topics where I am not likely to be accepted as an expert I try and substantiate my main points with external web pages. Comments are likely to be regarded as spam if they have too many links so it seems best to only use one link per comment – which therefore has to be on an issue that is important to the conclusion and which might be doubted if evidence was not provided. The other thing that is needed is a reasonable chain of deduction. Simply stating your opinion means little, listing a series of logical steps that led you to the opinion and are based on provable facts will hold more weight.

These issues are not only restricted to blog comments, I believe that they apply (to differing degrees) to all areas of online discussion.