Integrity and Mailing Lists

One significant dividing factor between mailing lists is the difference between summary lists (where the person who asks a question receives replies off-list and then sends a summary to the list) and the majority of mailing lists which are discussion lists (where every reply goes to the list by default).

I have seen an argument put forward that trusting the answers on a mailing list that operates under the summary list model is inherently risky and that peer review is required.

It could be argued that the process of sending a summary to the list is the peer review. I’m sure that if someone posts a summary which includes some outrageously bad idea then there will be some commentary in response. Of course the down-side to this is that it takes a few days for responses to the question to arrive and as it’s common that computer problems need to be solved in hours not days the problem will be solved (one way or another) before the summary message is written. But the idea of peer review in mailing lists seems to fall down in many other ways.

The first problem with the idea of peer review is that the usual aim of mailing lists is that most people will first ask google and only ask the list if a reasonable google search fails (probably most mailing lists would fall apart under the load of repeated questions otherwise). Therefore I expect that the majority of such problems to be solved by reading a web page (with no peer review that is easily accessible). Some of those web pages contain bad advice and part of the skill involved in solving any problem relates to recognising which advice to follow. Also it’s not uncommon for a question on a discussion list to result in a discussion with two or more radically different points of view being strongly supported. I think that as a general rule there is little benefit in asking for advice if you lack any ability to determine whether the advice is any good, and which of the possible pieces of good advice actually apply to your situation. Sometimes you can recognise good advice by the people who offer it, in a small community such as a mailing list it’s easy to recognise the people who have a history of offering reasonable advice. It seems that the main disadvantage of asking google when compared to asking a mailing list is that the google results will in most cases contain links to web sites written by people who you don’t know.

Sometimes the advice is easy to assess, for example if someone recommends a little-known and badly documented command-line option for a utility it’s easy to read the man page and not overly difficult to read the source to discover whether it is a useful solution. Even testing a suggested solution is usually a viable option. Also it’s often the case that doing a google search on a recommended solution will be very informative (sometimes you see web pages saying “here’s something I tried which failed”). Recommendations based on personal experience are less reliable due to statistical issues (consider the the regular disagreements about the reliability of hard disks where some people claim that RAID is not necessary due to not having seen failures while others claim that RAID-5 is inadequate because it has failed them). There are also issues of different requirements, trivial issues such as the amount of money that can be spent will often determine which (if any) of the pieces of good advice can be adopted.

The fact that a large number of people (possibly the majority of Internet users) regularly forward as fact rumors that are debunked by Snopes.com (the main site for debunking urban legends) seems to indicate that it is always going to be impossible to increase the quality of advice beyond a certain level. A significant portion of the people on the net are either unwilling to spend a small amount of effort in determining the accuracy of information that they send around or are so gullible that they believe such things beyond the possibility of doubt. Consider that the next time you ask for advice on a technical issue, you may receive a response from someone who forwarded a rumor that was debunked by Snopes.

Sometimes technical advice is just inherently dangerous because it is impossible to verify the integrity of some code that is being shared, or because it may be based on different versions of software. In a previous blog post I analyse some issues related to security of the Amazon EC2 service [1]. While the EC2 service is great in many ways (and implements a good well-documented set of security features on the servers) the unsigned code for managing it and the old versions of the images that they offer to customers raise some serious issues that provide avenues for attack. Getting the EC2 management tools to work correctly on Debian is not trivial, I have released patches but will not release packages for legal reasons. It seems most likely to me that someone will release packages based on my patches (either because they don’t care about the legal issues or they have legal advice suggesting that such things are OK – maybe due to residing in a different jurisdiction). Then people who download such packages will have to determine whether they trust the person who built them. They may also have the issue of Amazon offering a newer version of the software than that which is packaged for Debian (for all I know Amazon released a new version yesterday).

The term integrity when applied to computers refers to either accidental or malicious damage to data [2]. In the context of mailing list discussions this means both poorly considered advice and acts of malice (which when you consider spam and undisclosed conflicts of interest are actually quite common).

If you ask for advice in any forum (and I use the term in it’s broadest sense to cover web “forums”, IRC, twitter, etc) then getting a useful result will depend on having the majority of members of the forum possessing sufficient integrity and skill, being able to recognise the people whose advice should be followed, or being able to recognise good advice on it’s own.

I can think of few examples of forums of which I have been involved where the level of skill was sufficient to provide quality answers (and refutations for bad answers) for all areas of discussion that were on topic. People whose advice should generally be followed will often offer advice on areas where their skills are less well developed, someone whose advice can be blindly followed in regard to topic A may not be a reliable source for advice on topic B – which can cause confusion if the topics in question are closely related.

Finally a fundamental difference between “peer review” (as applied to conferences and academic journals) is that review for conferences and journals is conducted before the presentation. Not only does the work have to be good enough to pass the review, but the people doing it will never be sure what the threshold is (and will generally want to do more than a minimal effort) so the quality will be quite high. While peer review in mailing lists is mostly based around the presence or absence of flames. A message which doesn’t attract flames will either have some minimal quality or be related to a topic that is not well known (so no-one regards it as being obviously wrong).

Update: The “peer review” process of publishing a post on my blog revealed that I had incorrectly used who’s instead of whose.

etbe – Russell Coker

Archives

Categories

Integrity and Mailing Lists

Archives

Email and RSS

etbe – Russell Coker

Archives

Categories

Tags

Integrity and Mailing Lists

Archives

Email and RSS