Increasing Efficiency through Less Work

I have just read an interesting article titled Why Crunch Mode Doesn’t Work [1] which documents the research on efficiency versus the amount of time spent working (and, by inference, the amount of time spent on leisure and sleep). It shows that the 40 hour working week was chosen by people who ran factories (such as Henry Ford) not out of kindness to workers but because of the costs of inefficient work practices and of errors that damage products and equipment.

Now these results can only be an indication of what works best by today’s standards. The military research is good, but only military organisations get to control workers to that degree (few organisations try to control how much sleep their workers get, or are even legally permitted to do so). Companies can only give their employees enough spare time to get adequate sleep and hope for the best.

Much of the research dates from 80+ years ago. I suspect that modern living conditions, where every house has electric lights and entertainment devices such as a TV that encourage staying awake later at night, will change things, as would ubiquitous personal transport by car. It could be that for modern factory workers the optimum amount of work is not 40 hours a week; at a guess it could be as little as 30 or as much as 50.

Also the type of work being done certainly changes things. The article notes that mental tasks are affected more than physical tasks by lack of sleep (in terms of the consequences of being over-tired), but no mention is made about whether the optimum working hours change. If the optimum amount of work in a factory is 40 hours per week might the optimum for a highly intellectual task such as computer programming be less, perhaps 35 or 30?

The next factor is the issue of team-work. On an assembly line it’s impossible to have one person finish work early while the rest keep working, so the limit will be based on the worker who can handle the fewest hours. Determining which individuals will work more slowly when they work longer hours is possible (although it would be illegal to refuse to hire such people in many jurisdictions), and determining which individuals might be more likely to cause industrial accidents may be impossible. So it seems to me that the potential for each employee to work their optimal hours is much greater in the computer industry than in most sectors. I have heard a single anecdote of an employee who determined that their best efficiency came from 5 hours of work a day and arranged with their manager to work 25 hours a week. Apart from that I have not heard any reports of anyone trying to tailor working hours to the worker.

Some obvious differences in capacity for working long hours without losing productivity seem related to age and general health, obligations outside work (e.g. looking after children or sick relatives), and enjoyment of work (the greater the amount of work time that can be regarded as “fun”, the less need there is for recreation time outside work). It seems likely to me that parts of the computer industry that are closely related to free software development could sustain longer working hours due to the overlap between recreation and paid work.

If the amount of time spent working was to vary according to the capacity of each worker then the company structures for management and pay would need to change. Probably the first step towards this would be to pay employees according to the amount of work that they do. One problem with this is that managers are traditionally considered to be superior to workers and therefore inherently worthy of more pay. As long as the pay of engineers is restricted to less than the pay of middle-managers, the range between the lowest and highest salaries among programmers is going to be a factor of at most five or six, while the productivity difference between the least and most skilled programmers will be a factor of 20 for some boring work and more than 10,000 for more challenging work (assuming that the junior programmer can even understand the task). I don’t expect that a skillful programmer will get a salary of $10,000,000 any time soon (even though it would be a bargain compared to the cost of the junior programmers needed to do the same work), but a salary in excess of $250,000 would be reasonable.

If pay was based on the quality and quantity of work done (which as the article mentions is difficult to assess) then workers would have an incentive to do what is necessary to improve their work – and with some guidance from HR could adjust their working hours accordingly.

Another factor that needs to be considered is that ideally the number of working hours would vary according to the life situation of the worker. Having a child probably decreases the work capacity for the next 8 years or so.

These are just some ideas, please read the article for the background research. I’m going to bed now. ;)

Load Average

Other Unix systems apparently calculate the load average differently from Linux. According to the Wikipedia page about Load (computing) [1], most Unix systems calculate it based on the average number of processes that are using a CPU or available for scheduling on a CPU, while Linux also includes processes that are blocked on disk IO (uninterruptible sleep).

There are three load average numbers, the first is for the past minute, the second is for the past 5 minutes, and the third is for the past 15 minutes. In most cases you will only be interested in the first number.

What is a good load average depends on the hardware. For a system with a single CPU core a load average of 1 or greater from CPU use will indicate that some processes may perform badly due to lack of CPU time – although a long-running background process with a high “nice” value can increase the load average without interfering with system performance in most cases. As a general rule if you want snappy performance then the load average component from CPU use should be less than the number of CPU cores (not hyper-threads). For example a system with two dual-core CPUs can be expected to perform really well with a load average of 3.5 from CPU use but might perform badly with a load average of 5.
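To make the above concrete, here is a minimal sketch (in Python, assuming a Linux system with /proc mounted) that compares the 1-minute load average against the number of CPUs. Note that os.cpu_count() reports logical CPUs, which may include hyper-threads rather than physical cores, so treat the comparison as a rough guide only.

import os

def load_averages():
    # /proc/loadavg looks like: "0.42 0.30 0.25 1/123 4567"
    # The first three fields are the 1, 5, and 15 minute load averages.
    with open("/proc/loadavg") as f:
        one, five, fifteen = f.read().split()[:3]
    return float(one), float(five), float(fifteen)

if __name__ == "__main__":
    one, five, fifteen = load_averages()
    cpus = os.cpu_count() or 1  # logical CPUs, may include hyper-threads
    print(f"load averages: {one} {five} {fifteen}  logical CPUs: {cpus}")
    if one >= cpus:
        print("the 1-minute load is at or above the CPU count - worth a closer look")

Of course this doesn’t separate the CPU component of the load average from the disk IO component, so it is only a first-glance check.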

The component of the load average that is due to disk IO is much more difficult to interpret in a sensible manner. A common situation is to have the load average increased by an NFS server with a network problem. A user accesses a file on the NFS server and gets no response (thus contributing 1 to the load average); they then open another session and use “ls” to inspect the state of the file – the ls is also blocked, bringing the load average to 2. A single user may launch 5 or more processes before they realise that they are not going to succeed. If there are 20 active users on a multi-user system then a load average of 100 from a single NFS server with a network problem is not uncommon. While this is happening the system will perform very well for all tasks that don’t involve the NFS server, as the processes that are blocked on disk IO can be paged out so they don’t use any RAM or CPU time.

For regular disk IO you can have load average incremented by 1 for each non-RAID disk without any significant performance problems. For example if you have two users who each have a separate disk for their home directory (not uncommon with certain systems where performance is required and cooperation between users is low) then each could have a single process performing disk IO at maximum speed with no performance problems for the entire system. A system which has four CPU cores and two hard drives used for separate tasks could have a load average slightly below 6 and the performance for all operations would be quite good if there were four processes performing CPU intensive tasks and two processes doing disk intensive tasks on different disks. The same system with six CPU intensive programs would under-perform (each process would on average get 2/3 of a CPU), and if it had six disk intensive tasks that all use the same disk then performance would be terrible (especially if one of the six was an interactive task).

The fact that a single load average number can mean that the system is busy but performing well, under moderate load, or totally overloaded means that the load average number is of limited utility in diagnosing performance problems. It is useful as a quick measure: if your server usually has a load average of 0.5 and it suddenly gets a load average of 10 then you know that something is wrong. The typical procedure for diagnosing it starts with either running “ps aux|grep D” (to get a list of D state processes – processes that are blocked on disk IO) or running top to see the percentages of CPU time in the idle and IO-wait states.
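As a rough sketch of that first step, the following Python fragment (Linux-specific, reading /proc/<pid>/stat directly) lists processes in the D state, which is slightly more precise than grepping ps output for a capital D.

import os

def d_state_processes():
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is in parentheses and may contain spaces, so take
        # everything up to the last ")" as the name and the next field as the
        # single-character process state.
        comm = stat[stat.index("(") + 1:stat.rindex(")")]
        state = stat[stat.rindex(")") + 1:].split()[0]
        if state == "D":
            yield int(pid), comm

if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)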

Cpu(s): 15.0%us, 35.1%sy,  0.0%ni, 49.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
7331 rjc      25  0  2868  640  312 R  100  0.0  0:21.57 gzip

Above is a section of the output of top showing a system running gzip -9 < /dev/urandom > /dev/null. Gzip is using one CPU core (100% CPU means 100% of one core – a multi-threaded program can use more than one core and therefore more than 100% CPU) and the overall system statistics indicate 49.9% idle (the other core is almost entirely idle).

Cpu(s):  1.3%us,  3.2%sy,  0.0%ni, 50.7%id, 44.4%wa,  0.0%hi,  0.3%si,  0.0%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
7425 rjc      17  0  4036  872  588 R    4  0.1  0:00.20 find

Above is a section of the output of top showing the same system running find /. The system is registering 44% IO wait and 50.7% idle. The IO wait figure is the percentage of time that a CPU core is waiting on IO, so 44% of the total system CPU time (or 88% of one CPU core) is idle while the system waits for disk IO to complete. A common mistake is to think that if the IO was faster then more CPU time would be used. In this case, with the find program using 4% of one CPU core, if all the IO was instantaneous (e.g. served from cache) then the command would complete 25 times faster with 100% CPU use. But if the disk IO performance was merely doubled (a realistic possibility given that the system has a pair of cheap SATA disks in a RAID-1) then find would probably use about 8% of one CPU core.
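The arithmetic can be spelled out in a few lines of Python (a back-of-the-envelope sketch using the 4% figure from the top output above):

cpu_fraction = 0.04                   # find's share of one CPU core, from top
speedup_if_cached = 1 / cpu_fraction  # all IO instantaneous: ~25 times faster
# Halve the IO time and recompute find's share of the remaining wall time.
share_with_faster_disk = cpu_fraction / (cpu_fraction + (1 - cpu_fraction) / 2)
print(f"speedup with all data in cache: about {speedup_if_cached:.0f} times")
print(f"CPU share with twice the disk throughput: about {share_with_faster_disk:.0%}")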

Really the only use for load average is for getting an instant feel for whether there are any performance problems related to CPU use or disk IO. If you know what the normal number is then a significant change will stand out.

Dr. Neil Gunther has written some interesting documentation on the topic [2], which goes into more technical detail including kernel algorithms used for calculating the load average. My aim in this post is to educate Unix operators as to the basics of the load average.

His book The Practical Performance Analyst gives some useful insights into the field. One thing I learned from his book is the basics of queueing theory. One important aspect of this is that as the rate at which work arrives approaches the rate at which work can be done, the queue length starts to grow very rapidly (much faster than linearly), and if work keeps arriving faster than it can be done the queue will grow without end. This means that as the load average approaches the theoretical maximum, the probability of the system dramatically increasing its load average increases. A machine that’s bottlenecked on disk IO for a task with a huge number of independent clients (such as a large web server) may have its load average jump from 3 to 100 in the space of a minute. Of course this doesn’t mean that you need to be able to serve 30 times the normal load, merely slightly more than the normal load to keep the queues short. I recommend reading the book, he explains it much better than I do.
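As an illustration only (the book covers more general models), here is the textbook M/M/1 result for the mean number of jobs in the system, N = ρ/(1−ρ) where ρ is the utilisation. A short Python sketch shows how quickly the queue grows as utilisation approaches 100%:

def mean_jobs_in_system(utilisation):
    # M/M/1 queue: mean number of jobs in the system is rho / (1 - rho),
    # where rho is utilisation (arrival rate divided by service rate).
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1) for a stable queue")
    return utilisation / (1 - utilisation)

if __name__ == "__main__":
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"utilisation {rho:.0%}: about {mean_jobs_in_system(rho):.0f} jobs in the system")

With those numbers, going from around 9 queued jobs to around 99 only requires utilisation to creep from 90% to 99%, which matches the kind of load average jump described above.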

Update: Jon Oxer really liked this post.

Halloween

Yesterday I received an unsigned notice in the mail from some residents of an area comprising my street and an adjacent one. They advised me that their children were going to do the Halloween thing and that if I wanted to be involved I should leave my porch light on. This is a really good idea: people who like that sort of thing can leave their light on and give lollies to children (whatever happened to “don’t take lollies from strangers”?), and people who don’t like pagan festivals can leave their porch light off and not be bothered.

One thing that occurred to me is that the anonymous flyer might have been posted by someone who doesn’t like Halloween to provide a way for themselves and like-minded people to opt-out of it. I’m sure that everyone told their children not to knock on a door unless there is a porch light on.

It’s almost 10PM and no-one has rung my doorbell, so it seems to have worked.

SecureCon 2007

I am running a tutorial and giving a talk about SE Linux at SecureCon 2007 [1].

The tutorial will go for 3 hours on Wednesday the 7th of November and will cover using SE Linux on CentOS 5 and Debian Etch. It will be a hands-on tutorial where every delegate gets ssh access to their own Xen DomU.

The lecture is on Thursday the 8th of November and will be a 45 minute talk giving an overview of SE Linux. It will be similar to my speech at the AUUG conference [2] but will probably cover more of the features. The AUUG talk was driven by audience questions into spending a lot of time justifying SE Linux design decisions, which took time away from other material. That wasn’t inherently a problem (I provided the information the audience seemed to want and everyone seemed happy), but I would like to cover more of the features and new developments.

New SE Linux Play Machine Online

After over a year I have finally got a SE Linux Play Machine online again.

The details for logging in are at this link [1]. I’ve created T-shirt and mug designs with the login details too; they are available on cafepress.com [2]. For fun, wear such a shirt to a conference (or even when shopping at your local electronics store). ;)

Xen and Security

I have previously posted about the difference between using a chroot and using SE Linux [1].

Theo de Raadt claims that virtualisation does not provide security benefits [2] based on the idea that the Xen hypervisor may have security related bugs.

From my understanding of Xen, a successful exploit of a Xen system with a Dom0 that is strictly used for running the DomUs would usually start by gaining local root on one of the DomU instances. From there it is possible to launch an attack on the Xen Dom0. One example of this is the recent Xen exploit (CVE-2007-4993) [3] where hostile data in a grub.conf in a DomU could be used to execute privileged commands in the Dom0. Another possibility would be to gain root access to a DomU and then exploit a bug in the Xen API to take over the hypervisor (I am not aware of an example of this being implemented). A final possibility exists when QEMU code is used to provide virtual hardware, where an attacker could exploit QEMU bugs; an example of this is CVE-2007-0998, where a local user in a guest VM could read arbitrary files on the host [4] – it’s not clear from the advisory what level of access is required to exploit it (DomU user, DomU root, or remote VNC access). VNC is different from other virtual hardware in that the sys-admin of the virtual machine (who might be untrusted) needs to access it. Virtual block devices etc are only accessed by the DomU, and Xen manages the back-end.

The best reference in regard to these issues seems to be Tavis Ormandy’s paper about hostile virtualised environments [5]. Tavis found some vulnerabilities in the QEMU hardware emulation, and as QEMU code is used for a fully virtualised Xen installation it seems likely that Xen has some vulnerabilities in this regard. I think that it is generally recommended that for best security you don’t run fully virtualised systems.

The remote-console type management tools are another potential avenue of attack for virtualised servers in the case where multiple users run virtual machines on the same host (hardware). I don’t think that this is an inherent weakness of virtualisation systems. When security is most important you have one sys-admin running all virtual machines – which incidentally seems to be the case for most implementations of Xen at the moment (although for management reasons rather than security reasons). In ISP hosting type environments I doubt that a remote console system based on managing Xen DomUs is going to be inherently any less secure than a typical remote console system for managing multiple discrete computers or blades.

I have just scanned the Xen hypervisor source: the file include/asm-x86/hypercall.h has 18 entries for AMD64 and 17 for i386, while include/xen/hypercall.h has 18 entries. So it seems that there are 35 or 36 entry points for calling the hypervisor, compared to 296 system calls in the i386 version of Linux (a count which includes sys_socketcall, which itself expands to many system calls). This seems to be one clear indication that the Linux kernel is inherently more complex (and therefore likely to have a higher incidence of security flaws) than the Xen hypervisor.

Theo’s main claim seems to be that Xen is written by people who aren’t OpenBSD developers and who therefore aren’t able to write secure code. While I don’t agree with his strong position, I have to note that OpenBSD seems to have a better security history than any other multi-user kernel for which data is available. But consider a system running Xen with Linux in the Dom0 and multiple para-virtualised OpenBSD DomUs. If the Linux Dom0 has OpenSSH as the only service being run then the risk of compromise would come from OpenSSH, from IP based bugs in the Linux kernel (either through the IP address used for SSH connections or through routing/bridging to the OpenBSD instances), and from someone who has cracked root on one of the OpenBSD instances and is attacking the hypervisor directly.

Given that OpenSSH comes from the OpenBSD project, it seems that the above scenario would only add the additional risk of an IP based Linux kernel attack. A root compromise of an OpenBSD instance (consider that a typical OpenBSD system will run a lot of software that doesn’t come from the OpenBSD project – much of which won’t have a great security history) would only lose that instance unless the attacker can also exploit the hypervisor (which would be a much more difficult task than merely cracking some random daemon running as root that the sys-admin is forced to install). Is the benefit of losing only one instance of OpenBSD to a bad daemon enough to outweigh the risk of the Linux IP stack?

I’m sure that the OpenBSD people would consider a better option to be OpenBSD in both the Dom0 and the DomUs, in which case the risk of damage from a root compromise due to one badly written daemon that didn’t come from OpenBSD is limited to a single DomU unless the attacker also compromises the hypervisor. When working as a sys-admin I have been forced by management to install some daemons as root which were great risks to the security of the system; if I had been able to install them in separate DomUs I would have been able to significantly improve the security of the system.

Another flaw in Theo’s position is that he seems to consider virtual machines purely as a replacement for multiple physical machines – which would be an obvious decrease in security. However in many cases no more or less hardware is purchased; it is just used differently. If instead of a single server running several complex applications you have a Xen server running multiple DomUs which each have a single application, then things become much simpler and more secure. Upgrades can be performed on one DomU at a time, which decreases the scope of failure (and often means that you only need one business unit to sign off on the upgrade), and upgrades can be performed on an LVM snapshot (and rolled back with ease if they don’t succeed). A major problem with computer security is when managers fear problems caused by upgrades and prohibit their staff from applying security fixes. This, combined with the fact that on a multiple DomU installation one application can be compromised without immediate loss of the others (which run in different DomUs and require further effort by the attacker for a Xen compromise), provides a significant security benefit.

It would be nice for security if every application could run on separate hardware, but even with blades this is not economically viable – not even for the biggest companies.

I have converted several installations from a single overloaded and badly managed server to a Xen installation with multiple DomUs. In all cases the DomUs were easier to upgrade (and were upgraded more often) and the different applications and users were more isolated.

Finally there is the possibility of using virtualisation to monitor the integrity of the system. Bill Broadley’s presentation from the 2007 IT Security Symposium [6] provides some interesting ideas about what can be done. It seems that having a single OpenBSD DomU running under a hypervisor (maybe Xen audited by the OpenBSD people) with an OpenBSD Dom0 would offer some significant benefits over a single OpenBSD instance.

Senator Online

I’ve been asked for my opinion of senatoronline.org.au which claims to be Australia’s only internet-based political party. The claim may be correct depending on what you consider to be “Internet based“. Here is a copy of their platform from their web site:
Senator On-Line is not aligned to any other political party… it is neither Liberal nor Labor.
Senator On-Line (‘SOL’) is a truly democratic party which will allow everyone on the Australian Electoral roll who has access to the internet to vote on every Bill put to Parliament and have its Senators vote in accordance with a clear majority view.
We will be running candidates for the upcoming federal Upper House (Senate) elections.
When a SOL senator is elected a web site will be developed which will provide:

  • Accurate information and balanced argument on each Bill and important issues
  • The vast majority of those registered on the Australian Electoral roll the chance to have their say by voting on bills and issues facing our country
  • A tally of all votes which will then count in Parliament

Each person on the Australian Electoral roll will be entitled to one vote and only be allowed to vote once on each bill or issue.
SOL senators will have committed in writing to voting in line with the clear majority view of the SOL on-line voters.
Senator On-Line will enable broader community involvement in the political process and the shaping of our country.
If you like the concept, please register your details and tell others about SOL.

Now at first glance it sounds like a good idea; the Liberal party (which is similar in all the bad ways to the US Republican party) has demonstrated what happens when a party gets away with entirely ignoring the wishes of the voters.

But there are three significant problems with the Senator Online idea. The first is the issue of providing “Accurate information and balanced argument on each Bill”. We have seen many media outlets claiming that there is a debate about global warming (the debate ended years ago, with the vast majority of scientists agreeing that global warming has been happening for a long time), and now the same media outlets are claiming that there is a debate about whether it will cause any harm to us in the near future (ignoring all the dams that are running low). One of the aims of the democratic process is that representatives who spend all their time working on politics can fully analyse the issues and also gain access to resources that are not available to the average citizen, thus being able to make better informed decisions. The next problem is that it can degenerate into mob rule; the idea of a tabloid TV show being able to directly control votes in the senate is not an exciting one. The final problem is how to ensure that each citizen gets the opportunity to cast exactly one vote. Solving this requires good security and identity checks involving viewing photo ID. The checks used for voting in person (merely asserting your name and residential address) might be considered adequate for an election, but are grossly inadequate for online voting, where one false registration allows many false votes.

I think it would be more interesting to have a populist party started that campaigns for the issues that have the most impact on the majority of citizens. Issues such as the fact that a couple on the median income can’t afford the median house price [1], the lack of job security that is caused by recent industrial relations legislation, the price of health-care, and the fact that any car which is affordable on a single median income (after money is spent on rent and other basic living expenses) will lack basic safety features such as air-bags. While the Green party has decent policies on these issues they have many other policies to consider. A party that solely concentrated on allowing more people to have decent health care, not risk losing their job, own their own home, and drive a safe car would get a lot of interest from people who don’t normally take much notice of politics.

A Traditional Approach to an IT Career

I have just read Career Development for Geeks [1] by Erik de Castro Lopo [2]. It makes some interesting points about a traditional approach to an IT career. The path I followed for most of my career (after I had a few years experience) was to work as a contractor and happily leave jobs without having anything else lined up.

Erik suggests getting engineers rather than managers to give references, which is an interesting idea. Engineers can give better references about the quality of work, while managers can speak on behalf of their employer (in theory). In practice a sane manager won’t give a bad reference for legal reasons, so the value of a reference from a manager is probably limited. Of course one problem with references is that I have never heard of a recruiting agent or employer actually verifying the identity of a referee. I could be listed on a friend’s CV as a senior manager in a multi-national company (one which doesn’t have a major Australian presence) and give a glowing reference, and it seems unlikely that the typical recruiter would work it out.

For someone with plenty of spare time and no significant assets (no risk of being sued) it could be entertaining to apply for a bunch of positions that they are not qualified for with friends using pre-paid mobile phones to give references. This could be done as a documentary on corporate hiring practices, or to simply try and get the highest paid job possible. Based on observing some former colleagues it seems that little skill is required to get a job and that when people do really badly they get paid for at least a few months. I am constantly amazed when reading reports about so-called “con artists” who commit crimes for what are often small amounts of money. Getting an income significantly greater than average without knowing anything about how to do the work is very common and is never treated as fraud (the classic example was a former colleague who wanted to write his own encryption algorithm but who didn’t even know about binary and therefore couldn’t use operations such as XOR).

Erik’s main suggestion for dealing with recruiting agents is to talk about project management issues (recruiters don’t understand technology). My way of dealing with them has been to assure them that I know it all and tell them to forward my CV to the client.

Another suggestion is to learn new skills and diversify your skills. I don’t support this because I believe that the majority of people who read my blog are significantly more skillful than the typical programmer. If an area of technology starts to go out of fashion then it’s the people with the least skills who suffer the most. If you are good at your work and enjoy it then it shouldn’t matter much if people around you are being laid off. Of course to rely on this you have to be working in a reasonably large field. For example if you develop software in a language which has hundreds of programmers then you may be at risk, but if there are tens of thousands of programmers using the language then you only need to be in the most skillful 10% to be assured of employment for the next decade or two.

That said there are benefits to moving around, meeting new people, and working on new contracts. One thing you don’t want is to have 10 years of experience which are the same year repeated 10 times!

Update: Here is a paper Erik submitted to OSDC on the same topic [3]. Mostly the advice is the same but with more detail and in a better format for reading.

0wned a DVD Player

[Photo: a DVD player on display showing the word “root”]

Above is a picture of a DVD player I saw on sale in Dick Smith Electronics [1] (a chain store that used to sell mostly electronics hobbyist gear but now mostly sells consumer electronics). I asked one of the staff why it said “root”; tests revealed that the DVD caused any player to display “root” once it was inserted. The DVD in question was from the $2 box (the DVDs that didn’t sell well at other stores) and for some reason had the string “root” in its title (or some other field that gets picked up by the player).

I wonder if an ex-employee of a movie company is laughing about millions of DVD players all around the world saying “root”.

Update: I’ve been told by private mail that “It means it’s displaying the “root” menu. (As opposed to the title menu or any submenu’s)” and “most just display ‘menu’ or similar“. So apparently every time my DVD player says “menu” the new ones from Dick Smith will say “root” (I have yet to test this theory).

Blog Ethics

Reporters Sans Frontières (AKA RSF AKA Reporters Without Borders) has an interesting document about blogging [1]. They are specifically focussed on blogging as a way of reporting news. Their definition of a blog states that it is “a personal website” (there are many corporate blogs run by teams) and that it contains “mostly news” (most blogs that I read contain mostly technical information about computers, and any “news” is mostly about the author). They also describe a blog as being set up with an “interactive tool” – most blogs are written using web-based tools, but some are written with plain text and a compilation system.

Some of their technical information is simply wrong. For example they say that RSS “alerts users whenever their favourite blogs are updated” (this can be done through RPC notification which then requires RSS to pull the data – but a user will almost always have a tool polling their favourite RSS feeds). Their section listing the various options for blogging platforms mentions LiveJournal, Blogger, and MSNSpaces but doesn’t mention WordPress.com which seems to be a serious omission (although it does mention civiblog.org which is a useful hosting resource for grassroots political campaigning). But these are minor problems, they are reporters primarily not programmers.

Their document gets interesting when it gives links to pages about blogging ethics. One page that was not directly linked (presumably because it mainly concerns non-journalistic blogging) is Rebecca Blood’s document about blogging ethics [2]. She gives a brief coverage to conflicts of interest and gives most space to the topic of maintaining a record of changes. One thing I have been considering is having a separate instance of WordPress for documents that change [3]. This way regular blog posts (such as this one) can be relied on to preserve a list of changes (any change other than correcting a typo will be documented) so if you link to a post you can rely on the original intent being preserved. But for posts which have ongoing value as live documents they will be kept current without a significant change-log. Items that go into the main blog would include commentary on news and predictions, and the document blog would contain technical information about Linux, the science-fiction stories that I will eventually write, etc. When I wrote my previous blog post about this issue I was mainly considering technical issues, but when considering the ethical issues it becomes clear that I need a separate blog (or other CMS) for such things. The site etbe.coker.com.au needs to be a reliable record of what I wrote while another site such as docs.coker.com.au could contain live documents with no such record. The essential factor here is the ability of the user to know what type of document they are seeing.

Another really interesting point that Rebecca makes is in regard to known bad sources. Sometimes a known bad source can produce worthy data and is worth referencing, but you have to note the context (which among other things means that if the worthy data gets any reasonable number of hits it may get replaced by something entirely different). If a blogger cites a reference on a known bad site and explains the reason for the reference (and the fact that the nature of the site is known) then the reader reaction will be quite different to just a reference with no explanation.

As a contrast to Rebecca Blood’s document the cyberjournalist.net code of ethics [4] covers the traditional issues of journalistic integrity, “public interest” etc. One interesting issue that they raise (but apparently don’t address) is the definition of “private people“. While it is generally agreed that movie stars and similar people deserve less protection as they have chosen to be in the public eye the situation regarding bloggers is less clear. Is a blogger with a high ranking on technorati.com in a similar position to a Hollywood star when it comes to privacy?

A claim is sometimes made that blogs are unreliable because they can change (based on the actions of some bloggers who change posts with no notification and turn sections of the blogosphere into a swamp). Magazines and newspapers are held up as an example of unchanging content, and people who publish online magazines sometimes express horror at the idea of changing a past issue. The fact is that newspapers and magazines never changed past issues because it is impossible to recall hundreds of thousands of printed documents that are distributed around a country or the world. When it is possible to fix old issues (as it is online) then the requirement is to document the changes, not to keep bad versions.

It probably would be a useful feature for a blogging package to have the ability to display (and link to) old versions of a page. This would allow the users to easily see the changes and anyone who references a post can reference a particular version of it. In some situations it may make sense to use a Wiki server instead of a Blog server for some data to preserve the change history. Maybe I should consider a wiki in which only I have write access for my documents repository.

No-one seems to cover the issue of how to deal with fiction. Obviously for a blog such as 365tomorrows.com (a science-fiction blog) it’s assumed that all content is fictional unless stated otherwise. The problem comes when someone includes fiction amongst the non-fiction content of a blog. I recently unsubscribed from a blog feed when the author revealed that what appeared to have been a diary entry was actually a fictional story (and a poor one at that). If your blog is mostly non-fiction then any fiction must be clearly marked; including the word “fiction” in the title and the permalink would probably be required to avoid all potential for confusion. For my documents CMS/blog I am considering having a category of “fiction” and using the category name in the permalink address of each post. Of course science-fiction is sometimes obviously fiction, but to avoid problems with fiction that is set in current times I think it’s best to clearly mark all such posts. The old-fashioned media seems to have dealt with this: tabloid women’s magazines (which almost everyone reads occasionally as they are often the only reading material in a waiting room) traditionally have a clearly marked fiction story in every issue.

Another type of blogging is corporate blogging. I wonder whether that needs to be covered separately in an ethics code.

One thing that the three documents about ethics have in common (as a side-note in each) is the fact that the personal reputation of the writer depends on their ethics. If you are known for writing truthfully and fairly then people will treat your posts more seriously than those of writers who lie or act unfairly. There are direct personal benefits in acting ethically. The RSF document claims that the sole purpose of ethics standards is to instill trust in the readership, which directly benefits the writer – not to serve any abstract purpose. Whenever ethics is mentioned in terms of writing there is always someone who claims that it can’t be enforced; well, many people in your audience will figure out whether you are acting ethically and decide whether they want to continue reading your material. I wonder whether Planet installations should have ethics codes and remove blogs that violate them.

In conclusion I think that a complete code of ethics for blogging needs to have some IF clauses to cover the different types of blog. I may have to write my own. Please let me know if you have any suggestions.