Unparliamentary Language and Free Software

I’ve just read the Wikipedia page about Unparliamentary Language [1]. I recommend that everyone read it, if only for the amusement value; among other things it links to incidents where elected representatives acted in a way that would be expected of primary school children. The general concept of having rules about Unparliamentary Language is that MPs are permitted to say anything in Parliament without the risk of being sued or prosecuted, but certain things are considered inappropriate – the most common example is directly accusing another MP of lying. One of the main aims of rules against Unparliamentary Language is to prevent attacks on the honor of other members.

Having just witnessed a mailing list discussion go wildly off track when a free software project was denigrated, it seems to me that we could do with some similar guidelines for mailing list discussions. The aim would be not just to prevent excessive attacks on the honor of other members but also to protect the honor of free software projects. For example one might recommend against using a particular program because of design decisions that seem dubious or a bad security history, but saying “it’s crap” would be considered inappropriate. Not that rejecting a program based on design decisions or a history of security flaws would be uncontroversial, but at least that gives objective issues to discuss, so if there is a debate it will educate some of the lurkers.

Note that I’m not claiming to be better than other people in this regard; I’ve described software as crap on more than a few occasions. But I will try to avoid such things in future.

Finally, does anyone have a good suggestion for a Free Software equivalent to the term “Unparliamentary Language”? It seems that to a large extent the support for certain ideas depends on having a catchy name, and I can’t think of one.

Communication Shutdown and Autism

The AEIOU Foundation

The AEIOU Foundation [1] is a support and advocacy organisation for people on the Autism Spectrum. Note that they clearly say Autism Spectrum Disorder (ASD) on their About page; some of what they write would be less wrong if it was claimed to apply only to non-verbal Autistics or to people claimed to be Low Functioning Autistic (LFA). But in regard to the Autism Spectrum as a whole they just don’t seem to know much about it: a lot of their web pages seem to be based on the assumption that anyone who is on the Spectrum will be lucky to ever live independently. However it seems that most people who could be diagnosed with an ASD have typical social skills by the standards of the IT industry and can get by without any special assistance. The entire site seems to be written about people on the Spectrum by people who know little of their experiences, and it contains hardly any information that matches what I’ve read from various people on the Spectrum (of course there is a wide range of experiences that differ greatly).

They have a link to “Autism Related Sites” which starts with “Autism Speaks” (the Wikipedia page about Autism Speaks is worth reading – note the section about immunisation research which has been repeatedly debunked and the section about legal action against a young autistic blogger). There are many good reasons why Autism Speaks is so widely hated among people on the Spectrum. I think that recommending Autism Speaks is a sign of willful ignorance of almost everything related to Autism.

In their page about describing Autism to NTs they say “Imagine if you suddenly woke up in a foreign country, did not speak the language and had no way of effectively communicating with the people around you”. I’ve been diagnosed with Asperger Syndrome by a psychologist who considers it to be the same as High Functioning Autism (HFA), and I’ve visited more than a few countries. I find the comparison of the Autistic experience to visiting another country so strange that I don’t even know where I would begin if I were to comment on where it went wrong.

Finally, they have a scrolling bar listing their advertisers at the bottom of every single page on their site. If someone were going to design a web site specifically to annoy people on the Spectrum then such a scrolling banner would be a good place to start.

Now they probably do some good things to help families with children on the Spectrum. But their ability to do good is really hindered by the lack of input from people on the Spectrum. Rachel Cohen-Rottenberg (leader of the Vermont Chapter of the Autistic Self Advocacy Network) wrote an interesting post about AEIOU and noted that none of the people who run AEIOU are Autistic [2].

The Communication Shutdown

Someone got the idea that Neuro-Typical people (NTs) should try to understand what it’s like to be on the Spectrum – which is a reasonable idea. But they decided that the way to do so is to have them refrain from Internet based socialisation and not use Facebook and Twitter for one day. It seems to me that most people on the Spectrum primarily socialise via the Internet, so ceasing Internet based socialisation is likely to make the NT experience less like that of people on the Spectrum, not more. I’m getting a mental image of a bunch of NTs deciding to go to a night-club for their Internet-free evening and then imagining that they are somehow empathising with the experience of people who can never enjoy a night-club.

As an aside, a web site which has anything at all related to disabilities shouldn’t rely on Flash – the Communication Shutdown site totally fails in this regard.

No Stereotypes Here has an interesting analysis of this situation; among other things they comment on the irony of having someone ask them to stop using Twitter as part of this campaign [3]. One thing that they suggest is for NTs to have a day without any communication at all.

Some Suggestions for People who Want to Understand

As a communication exercise, try going shopping without speaking, just use hand gestures. For bonus points try doing so in a foreign country where you don’t know the language so you need bidirectional non-verbal communication – with some luck you can complete a transaction without the shopkeeper realising that you are a foreigner. This won’t actually give you much of the Autistic experience, but it’s a good exercise in understanding how communication works.

Someone who wanted to know the down-side of being on the Spectrum could find a sports bar where most patrons support one team and then enter the bar while wearing a jersey indicating support for an opposing team. I don’t recommend doing this because it really wouldn’t be fun, but for a quick approximation of the experience it would probably work well.

It seems to me that paying $5 to a charity and then boasting about doing so on your Facebook page for a day is an easy thing to do. A harder task would be to spend a day reading about the experiences of the people in question and then giving $5 to a charity that is well regarded by the target group.

Another possible way of gaining some understanding would be to have a party where everyone brings their laptop and uses only electronic communication – no speaking at all. This is in fact fairly close to what some of the Geekier (possibly Autistic) members of the IT community do.

Parkour in Melbourne

I was walking past Southbank when I saw some Parkour being practiced. I watched for a while and spoke to the instructor after the informal lesson was finished. He’s a professional instructor with the Melbourne branch of the Australian Parkour Association [1] and he sometimes gives free advice to newbies that he meets on the street (in this case a group of 6 teenage boys).

From the web site it seems that the standard lesson fee is $15 for an indoor lesson or $10 for an outdoor lesson – with a $5 discount for members of the Australian Parkour Association [2], which is really cheap for a 2 hour lesson! APA membership costs $10 to join plus a $50 annual membership fee.

It’s worth reading the ParkourPedia information about the “spirit/philosophy” that is Parkour [3]. It’s interesting to note that there can be no official Parkour shows, because if you do it for anyone else then it’s not Parkour – so much for all the Parkour videos on Youtube. Another issue with the Youtube videos is that Parkour isn’t about doing the most dangerous things you can possibly survive in an urban environment; it can be practiced in country areas and isn’t supposed to be unreasonably dangerous.

The outdoor Parkour lessons start near the Arts Centre in the middle of Melbourne, and presumably some of them go past Southbank as it has some interesting things to jump over. So it’s worth watching out for people jumping over various obstacles instead of walking around them. They may not be doing a Parkour show, but it’s in public and anyone can watch.

Links October 2010

Bruce Schneier wrote an insightful post about why designing products for wiretapping is a bad idea [1]. It seems that large parts of the Internet will be easy to tap (for both governments and criminals) in the near future unless something is done. The bad results of criminal use will outweigh any benefits of government use.

Sam Watkins wrote an informative post about Android security [2]. Among other things, any application can read all stored data including all photos – that’s got to be a problem for anyone who photographs themselves naked…

Rebecca Saxe gave an interesting TED talk about how brains make moral judgements [3]. Somehow she managed to speak about the Theory of Mind without mentioning Autism once.

The Guardian has an amusing article by Cory Doctorow about security policies in banks [4]. He advocates promoting statistical literacy (or at least not promoting a lack of it) as a sound government policy. He also suggests allowing regulators to fine banks that get it wrong.

Steven Johnson gave an interesting TED talk about Where Good Ideas Come From [5]. It’s a bit slow at the start but gets good at the end.

Adam Grosser gave an interesting TED talk about a fridge that was designed for use in Africa [6]. The core of the Absorption Refrigerator is designed to be heated in a pot of water in a cooking fire and it can then keep food cool for 12 hours. It’s a pity that they couldn’t design it to work on solar power to avoid the fuel use for the cooking fire.

Josh Silver gave an interesting TED talk about liquid-filled spectacles [7]. The glasses are shipped with a liquid-filled syringe at each side which is used to inflate the lenses to the desired optical power. The wearer simply adjusts the syringes until they reach the right magnification; as there are separate syringes, the glasses work well for people whose eyes aren’t identical (which is most people). Once the syringes are at the right spots the user can tighten some screws to prevent further transfer of liquid and cut the syringes off, giving glasses that aren’t overly heavy but which can’t be adjusted any more. I guess that a natural extension to this would be to allow the syringes to be re-attached so that the user could adjust them every year to match declining vision. One thing this wouldn’t do is correct for astigmatism (where the lens of the eye doesn’t focus light to a point), but I guess they could make lenses to deal with a few common varieties of astigmatism so that most people who have that problem could get a reasonable approximation. The current best effort is to make the glasses cost $19, which is 19 days’ salary for some of the poorest people in the world. Glasses in Australia cost up to $650 for a pair (or a more common cost of $200, or about $100 after Medicare), which would be about one day’s salary.

Eben Bayer gave an inspiring TED talk about one of the ways that mushrooms can save the planet [8]. He has designed molds that can be filled with Pasteurised organic waste (seed husks etc) and then seeded with fungal spores. The fungus then grows mycelium (thin fungal root fibers) through the organic waste, making it into a solid structure which fits the shape of the mold. This is currently being used to replace polystyrene foam for packaging and can apparently be used for making tiles that are fire-retardant and soundproof for constructing buildings. The main benefits of the material are that it can be cheaply made without petrochemicals and that it is bio-degradable. I’m not sure how the bio-degradable part would work with constructing buildings – maybe they would just replace the panels every few years.

Annie Lennox gave a TED talk about her Sing foundation to care for women and children who are affected by AIDS [9]. She describes the effect of AIDS in Africa as Genocide.

Robert Ballard gave a very informative TED talk about exploring the oceans [10]. This was one of the most informative TED talks I’ve seen, and Robert is also one of the most enthusiastic speakers I’ve seen – it’s really worth watching! We really need more money spent on exploring the oceans.

Jessa Gamble gave an interesting TED talk which suggests that the best thing to do is to go to bed at about sunset and then have a couple of hours of relaxing time during the middle of the night [11]. Apparently the subjects of body-clock experiments who live for a month in a bunker without natural light or access to a clock get better sleep in this manner than they ever had in their lives, and feel fully awake for the first time.

World Changing is a blog that has a lot of interesting articles about climate change and related issues [12]. It’s worth a read.

Cynthia Schneider gave an interesting TED talk about how reality competition TV is affecting reality [13]. Shows that are derived from the American Idol concept are driving a resurgence in some traditional forms of performance art while also promoting equality – among other things it’s apparent that winning is more important than misogyny.

The Ritual of the Calling of an Engineer is an interesting concept [14]. I think it would be good to have something similar for Computer Science.

Benjamin Mako Hill wrote an interesting and insightful essay about Piracy and Free Software [15].

Web Video, Global Innovation, and Free Software

Web Video and Global Innovation

Chris Anderson (the curator of TED) gave an insightful TED talk about Web Video and Global Innovation [1]. Probably most people who have used the Internet seriously have an intuitive knowledge of the basic points of this talk; Chris had the insight to package it together in a clear manner.

He describes how the printing press decreased the importance of verbal communication skills and services such as Youtube have caused a resurgence in the popularity and importance of speeches. He has some interesting theories on how this can be leveraged to improve education and society.

Lectures for Developers vs Users

Now how can we use these principles to advance the development of Free Software?

It seems to me that a good lecture about Free Software will achieve some of the following goals:

  1. Promoting projects to new developers.
  2. Teaching developers some new aspects of software development related to the system.
  3. Promoting projects to new users.
  4. Teaching users (and prospective users) how to use the software.

The talks aimed at developers need to be given by technical experts, but talks aimed at users don’t need to be given by experts on the technology – and someone who has less knowledge of the software but better public speaking skills could probably do a better job when speaking to users. Would it do some good to encourage people to join Free Software projects for the purpose of teaching users? It seems that there are already some people doing such work, but there seems little evidence of people being actively recruited for such work – which is a stark contrast to the effort that is sometimes put in to recruiting developers.

One problem in regard to separating the user-training and developer-training parts of Free Software advocacy and education is that most conferences seem to appeal to developers and the more Geeky users. Talks for such conferences tend to be given by developers but the audience is a mix of developers and users. Would it be better to have streams in conferences for developers and users with different requirements for getting a talk accepted for each stream?

Publishing Videos

It has become a standard feature of Free Software related conferences to release videos of all the talks so that anyone anywhere in the world can watch them, but it seems that this isn’t used as much as we would like. Free Software developers seem to cite TED talks in blog posts more often than they cite lectures by their peers; while TED talks are world leading in terms of presentation quality, the talks by peers are more relevant to the typical Free Software developer who blogs. This seems to be an indication that there is a problem in getting the videos of talks to the audience.

Would it help this to make it a standard feature to allow comments (and comments that are rated by other readers) on every video? Would having a central repository (or multiple repositories) of links to Free Software related talks help?

Would it help to have a service such as Youtube or Blip.tv used as a separate repository for such talks? Instead of having each conference just use its own servers, if multiple conferences uploaded talks to Youtube (or one of its competitors) then users could search for relevant talks (including conference content and videos made by individuals not associated with conferences). What about “video replies”?

What if after each conference there was an RSS feed of links to videos that had one video featured per day, in a similar manner to the way TED dribbles the talks out? If you publish 40 videos of 45-minute lectures in one week you can be sure that almost no-one will watch them all and very few people will watch even half of them. But if you had an RSS feed that gave a summary of one talk per day for 6 weeks then maybe many people would watch half of them.

Defining Success

Chris cites as an example of the success of online video the competition by amateur dancers to create videos of their work and the way that this was used in selecting dancers for The LXD (Legion of eXtraordinary Dancers) [2]. I think that we need a similar culture in our community. Apart from people who give lectures at conferences and some of the larger user group meetings there are very few people giving public video talks related to Free Software. There is also a great lack of instructional videos.

This is something that anyone could start doing at home: the basic video mixing that you need can be done with ffmpeg (it’s not very good for that purpose, but for short videos it’s probably adequate) and Istanbul is good for making videos of X sessions. If we had hundreds of Free Software users making videos of what they were doing then I’m sure that the quality would increase rapidly. I expect that some people who made such videos would find themselves invited to speak at major conferences – even if they hadn’t previously considered themselves capable of doing so (the major conferences can be a bit intimidating).

How do we Start?

Publishing videos requires some significant bandwidth. A cheap VPS has a bandwidth quota of 200GB per month; if short videos are used with an average size of 30MB (which seems about typical for Youtube videos) then that allows more than 6000 video views per month. That is OK, but as my blog averages about 2000 visits per day (according to Webalizer) it seems that 6000 views per month isn’t enough for any serious vlogging. Not to mention the fact that videos in higher resolution or a sudden spike in popularity can drive the usage a lot higher.
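The arithmetic above can be sketched as a quick estimate; the 200GB quota and 30MB average video size are the assumptions taken from the text, and the function name is mine:

```python
def monthly_views(quota_gb=200, avg_video_mb=30):
    """Estimate how many full video downloads fit in a monthly bandwidth quota."""
    return (quota_gb * 1000) // avg_video_mb

print(monthly_views())  # 6666 views per month with the above assumptions
```

At 2000 blog visits per day even a modest click-through rate would exhaust that quota, which is the point being made.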

It seems that a site like Youtube or blip.tv is necessary, which one is best?

There are lots of things that can be changed along the way, but a hosting service is difficult to change when people link to it.

Conclusion

I don’t claim to have many answers to these questions. I’m planning to start vlogging soon so I will probably learn along the way.

I would appreciate any suggestions. Also if anyone has a long suggestion then a blog post will be best (I’ll link to any posts that reference this one). If anyone has a long suggestion that is worthy of a blog post but they don’t have a blog then I would be happy to post it on my blog.

Choosing an Android Phone

My phone contract ends in a few months, so I’m looking at getting a new Android phone. I want a big Android phone (in both physical size and resolution) that has a physical keyboard, a digital compass, A-GPS and at least a 5MP camera with geo-tagging.

I want to be able to read PDF files and run ssh sessions, so a big screen is required, and a physical keyboard avoids wasting screen space on a soft-keyboard. My pockets will fit something about 10.5cm wide by 17cm high, but I don’t expect anyone to manufacture such a large phone. High resolution is a good thing too; it seems that the best available at the moment is 854*480 (with 800*480 being reasonably common).

I want Wifi and all the 3G and GSM data transfer standards. It would be ideal to have a phone with the dual networking stack needed to do both voice and data at the same time.

I’m not interested in anything that runs a version of Android older than 2.2 as native tethering is important. An option to upgrade to post 2.2 would be a really good thing.

Here are the nearest options I could find:

Phone              | Resolution | Screen Size (inches) | Camera Resolution | Notes
-------------------|------------|----------------------|-------------------|-------------------
Motorola Milestone | 854*480    | 3.7                  | 5MP               |
Motorola Droid     | 854*480    | 3.7                  | 5MP               |
LG VS 740          | 800*480    | 3.2                  | 3.2MP             | no GPS or compass
Lenovo LePhone     | 800*480    | 3.7                  | 3MP               | no GPS or compass

It seems that Motorola makes the phones that best suit my needs, does anyone know of any better options?

Open Source Learning

Richard Baraniuk gave an interesting TED talk about Open Source Learning [1]. His project Connexions, which is dedicated to creating Creative Commons free textbooks, is a leader in this space [2].

He spoke about Catherine Schmidt-Jones who wrote 197 modules and 12 courses on music [3], that’s a very significant amount of work!

He also mentioned the translation of the work into other languages. I wonder how well the changes get merged back across the language divide. We have ongoing disputes in the free software community about whether various organisations do enough work to send patches back upstream; this seems likely to be more of a problem in situations where most of the upstream authors can’t even understand the language in which the changes are written, and where the changes involve something a lot more subtle than a change to an algorithm. This would be particularly difficult for Chinese and Japanese, as those languages seem to lack quality automatic translation.

He mentioned Teachers Without Borders [4] in passing. Obviously an organisation that wants to bring education to some of the poorer parts of the world can’t have a curriculum that involves $250 of text books per year for a high school student (which was about what my parents paid when I was in year 12) or $500 of text books per year for a university student (which might be a low estimate for some courses, as a single text can cost more than $120). Free content and on-demand printing (or viewing PDF files on an OLPC system) can dramatically lower the cost of education.

It’s widely believed that free content without the ability to remix is cultural imperialism. This is apparently one of the reasons that the connexions project is based on the Creative Commons Attribution license [5]. So anyone anywhere can translate it, make a derivative work, or collate parts of it with other work. I expect that another factor is the great lack of success of all the various schemes that involve people contributing content for a share of the profits, the profits just don’t match the amount of work involved. Philanthropy and reputation seem to be the only suitable motivating factors for contributing to such projects.

One of the stated benefits of the project is having computer based content with live examples of equations. Sometimes it is possible to just look at an equation and know what it means, but often more explanation is required. The ability to click on an equation, plug in different values and have them automatically calculated (and possibly graphed if appropriate) can make things a lot easier. Even if the result is merely what would be provided by reading a text book and spending a few minutes with a scientific calculator, the result should be a lot better in terms of learning, as the time required to operate a calculator can break the student’s concentration. Better still, it’s possible to have dynamic explanations tailored to the user’s needs. To try this out I searched for Ohm’s Law (something that seems to be unknown by many people on the Internet who claim to understand electricity). I was directed to an off-site page which used Flash to display a tutorial on Ohm’s Law. The tutorial was quite good, but it does seem to depart from the free content mission of the project to direct people off-site to proprietary content which uses a proprietary delivery system. I think that the Connexions project could do without links to sites such as college-cram.com.
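As a trivial sketch of the kind of live equation calculator described above (a Python illustration, not anything from the Connexions site; the function name is mine):

```python
def ohms_law(voltage=None, current=None, resistance=None):
    """Given any two of V (volts), I (amps) and R (ohms), return the third via V = I * R."""
    if voltage is None:
        return current * resistance
    if current is None:
        return voltage / resistance
    if resistance is None:
        return voltage / current
    raise ValueError("supply exactly two of the three values")

print(ohms_law(voltage=12, resistance=4))  # 3.0 amps through a 4 ohm load at 12V
```

A live textbook would attach something like this to the displayed equation, so the reader can plug in values without breaking concentration to reach for a calculator.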

One of the most important features of the project is peer review “lenses”. The High Performance Computing Lens [6] has some good content and will be of interest to many people in the free software community – but again it requires Flash.

One final nit: the search engine is slow and not very useful. A search for “engine” returned lots of hits about “engineering”, which isn’t useful if you want to learn about how engines work. But generally this is a great project; it seems to be doing a lot of good and it’s got enough content to encourage other people and organisations to get involved. It would be good to get some text books about free software on there!

Links September 2010

Kevin Stone gave an interesting TED talk about the biological future of joint replacement [1]. Using stem cells and animal tissue that has been prepared to destroy the chemicals which trigger immune responses, the tissues can regrow. Replacing joints with titanium and ceramic lets people walk again; regrowing them with Kevin’s methods allows them to compete at the highest levels of sporting contests!

Derek Sivers gave a brief and interesting TED talk advising people to keep their goals secret if they want to achieve them [2].

The Parrot AR.Drone is an interesting device [3]; it’s a four-propeller helicopter that is controlled by WiFi. Apparently there is little security (it binds to the first WiFi client it sees), which is a significant down-side. It returns a video feed to the controlling iPhone as it flies and can hover when it loses its connection. It will be interesting to see when people write software for other devices (Android etc). Also I wonder whether there will be open source weapons kits for it. If you could have those devices use either a nerf gun or a lance to attack each other’s turbines then you could have drone jousting.

Don Marti had an interesting new idea for a crime game [4]. The main concept is to use the element of mistrust that is present in real criminal gangs. The new guy you invite to join a job might inform the police, and you won’t know for sure. Sometimes a heist will be discovered by the police through bad luck (or good police work) and you will wonder whether there was an informant. The aim is a simple game design, with the complexity residing in the email discussions between the players.

The C64 isn’t dead, it’s even on the Internet [5], an original C64 is running a web site!

Tan Li gave an interesting TED talk about a new headset to read brain-waves [6]. The device in question can be applied in minutes, requires no gel or abrasion of the scalp, connects to the computer wirelessly and is relatively cheap at $300US. The developer’s kit (which I think includes a headset) is $500US. I wonder if the community can develop a cheaper version of this which is more open.

Lisa Margonelli gave an interesting TED talk about the politics of oil [7]. One of her insightful points is that the subsidies for oil should be shifted from the oil industry to middle-class consumers. But she goes off track a bit by suggesting gradual oil tax increases until 2020; according to the best estimates of groups such as the CSIRO, no taxes will be needed to give a high oil price in 2020! She is aiming for a 20% reduction in petrol use by 2020, but I’m not aware of any serious group of scientists who have evidence to suggest that the production capacity in 2020 will be close to 80% of what it is now.

Slate has a good article about The Innocence Project, which uses DNA tests to overturn false convictions [8]; it’s scary how badly the justice system works.

Rachel Sussman gave an interesting talk about the World’s Oldest Living Things [9], nothing less than 2000 years old is included.

Nicholas Negroponte gave an interesting EG 2007 talk about the OLPC project [10]. While some of the content about OLPC production is a little dated the OLPC history is relevant and his personal opinions regarding the benefits that children receive from computers are insightful.

Jane Poynter gave an interesting TED talk about life in Biosphere 2 [11]. Her work on mini-biospheres is also interesting. Let’s hope we get a permanent Biosphere running outside the Earth sometime soon.

Sheryl WuDunn gave an informative TED talk titled “Our Century’s Greatest Injustice” about the plight of women in most of the world [12].

Daniel Kahn Gillmor wrote a good article about the use of ssh-agent [13]. You really should use it to secure your ssh keys.

Mark Shuttleworth has described the development process for the Ubuntu font [14]. This is a very interesting project and IMHO one of the most significant things that Ubuntu has done. Prior to this an entirely closed development process has been used for font development. Now we are getting a major new font family developed with a free and open process, and with some new legal developments for a font license! One thing to note is that this project appears to have involved a lot of work from professional font designers; it sounds like Canonical spent a significant amount of money on it.

Is Pre-Forking any Good?

Many Unix daemons use a technique known as “pre-forking”. To save the time taken to fork a child process for each request, they keep a pool of processes waiting for work to come in. When a job arrives, one of the existing processes is used and the overhead of the fork() system call is saved. I decided to write a little benchmark to see how much overhead a fork() really has. I wrote a small program (released under the GPL 3.0 license) to test this. It measures the performance of a fork() operation followed by a waitpid() operation, in fork()s per second, and also the performance of running a trivial program via system(), which uses /bin/sh to execute the given command.
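A rough sketch of this kind of benchmark follows (a Python illustration rather than the original program, so the absolute numbers will be lower than the C figures below; the function names are my own):

```python
import os
import time

def benchmark_fork(iterations=2000):
    """Measure fork() + waitpid() pairs per second."""
    start = time.time()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            os._exit(0)          # child does no work and exits immediately
        os.waitpid(pid, 0)       # parent reaps the child before forking again
    return iterations / (time.time() - start)

def benchmark_fork_system(iterations=200):
    """Measure fork() where the child runs a trivial command via system()."""
    start = time.time()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            os.system("true")    # spawns /bin/sh -c true, the expensive part
            os._exit(0)
        os.waitpid(pid, 0)
    return iterations / (time.time() - start)

if __name__ == "__main__":
    print("fork/s: %.2f" % benchmark_fork())
    print("fork+system/s: %.2f" % benchmark_fork_system())
```

Note that measuring fork()+waitpid() in a tight loop serialises parent and child, which is not quite how a real daemon behaves, but it gives an upper bound on the per-request cost.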

On my Thinkpad T61 with an Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz I could get 2429.85 forks per second when running Linux 2.6.32 in 64bit mode. On a Thinkpad T20 with a 500MHz P3 CPU I could get 341.74 forks per second. In both cases it seems that the number of forks per second is significantly greater than the number of real-world requests per second that such systems serve. If each request on average took one disk seek then neither system would have fork performance as any sort of bottleneck. Also if each request took more than a couple of milliseconds of CPU time on the T7500, or 10ms of CPU time on the 500MHz P3, then the benefits of pre-forking would be very small. Finally it’s worth noting that the overhead of fork() + waitpid() in a loop will not be the same as the overhead of just fork()ing off processes and calling waitpid() when there’s nothing else to do.

I had a brief look at some of my servers to see how many operations they perform. One busy front-end mail server has about 3,000,000 log entries in mail.log per day, that is about 35 per second. These log entries include calling SpamAssassin and ClamAV, which are fairly heavy operations. The system in question averages one Intel(R) Xeon(R) CPU L5420 @ 2.50GHz core being used 24*7. I can’t do a good benchmark run on that system as it’s always busy, but I think it’s reasonable to assume for the sake of discussion that it’s about the same speed as the T7500 (it may be 5 times faster, but that won’t change things much). At 2429 forks per second (or 0.4ms per fork/wait), even if that time were entirely reduced to zero it wouldn’t make any noticeable difference to a system where the average operation takes 1000/35 = 28ms!

Now if a daemon was to use fork() + system() to launch a child process (which is a really slow way of doing it) then the T7500 gets 248.51 fork()+system() operations per second with bash and 305.63 per second with dash. The P3-500 gets 24.48 with bash and 33.06 with dash.

So it seems that if every log entry on my busy mail server involved a fork()+system() operation, and that was replaced with pre-forked daemons, then it might be possible to save almost 10% of the CPU time on the system in question.

Now it is theoretically possible for the setup of a daemon process to take more CPU time than fork()+system(), e.g. a daemon could have some really complex data structures to initialise. If the structures in question are initialised in the same way for each request then a viable design is to have the master process initialise all the data, which is then inherited by the children. The only way I can imagine for a daemon child process to take any significant amount of time on modern hardware is for it to generate a session encryption key, and there’s really nothing stopping a single master process from generating several such keys in advance and then passing them to child processes as needed.

In conclusion I think that the meme about pre-forking is based on hardware that was used at a time when a 500MHz 32bit system (like my ancient Thinkpad T20) was unimaginably fast and when operating systems were less efficient than a modern Linux kernel. The only corner case might be daemons which perform relatively simple CPU-bound operations – such as serving static files from a web server where the data all fits into the system cache, but even then I expect that the benefit is a lot smaller than most people think and the number of pre-forked processes is probably best kept very low.

One final thing to note is that if you compare fork()+exec() with instructing a running daemon (via Unix domain sockets perhaps) to provide access to a new child (which may be pre-forked or forked on demand) then you have the potential to save a moderate amount of CPU time. The initialisation of a new process has overhead beyond the fork() call itself, and when you fork() a new process there are usually lots of data structures which are never written after that point, which means that on Linux they remain as shared memory, reducing system memory use (and improving cache efficiency when they are read).

#include <unistd.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>

#define NUM_FORKS 10000
#define NUM_SHELLS 1000

int main()
{
  struct timeval start, end;
  if(gettimeofday(&start, NULL) == -1)
  {
    fprintf(stderr, "Can't get time of day\n");
    return 1;
  }

  int i = 0;
  /* time NUM_FORKS fork()+waitpid() cycles */
  while(i < NUM_FORKS)
  {
    pid_t pid = fork();
    if(pid == 0)
      return 0;
    if(pid > 0)
    {
      int status;
      pid_t rc = waitpid(-1, &status, 0);
      if(rc != pid)
      {
        fprintf(stderr, "waitpid() failed\n");
        return 1;
      }
    }
    else
    {
      fprintf(stderr, "fork() failed\n");
      return 1;
    }
    i++;
  }

  if(gettimeofday(&end, NULL) == -1)
  {
    fprintf(stderr, "Can't get time of day\n");
    return 1;
  }

  printf("%.2f fork()s per second\n", (double)NUM_FORKS/((double)(end.tv_sec - start.tv_sec) + (double)(end.tv_usec - start.tv_usec) / 1000000.0));

  if(gettimeofday(&start, NULL) == -1)
  {
    fprintf(stderr, "Can't get time of day\n");
    return 1;
  }

  i = 0;
  /* time NUM_SHELLS fork()+system() cycles */
  while(i < NUM_SHELLS)
  {
    pid_t pid = fork();
    if(pid == 0)
    {
      if(system("id > /dev/null") == -1)
        fprintf(stderr, "system() failed\n");
      return 0;
    }
    if(pid > 0)
    {
      int status;
      pid_t rc = waitpid(-1, &status, 0);
      if(rc != pid)
      {
        fprintf(stderr, "waitpid() failed\n");
        return 1;
      }
    }
    else
    {
      fprintf(stderr, "fork() failed\n");
      return 1;
    }
    i++;
  }

  if(gettimeofday(&end, NULL) == -1)
  {
    fprintf(stderr, "Can't get time of day\n");
    return 1;
  }

  printf("%.2f fork() and system() calls per second\n", (double)NUM_SHELLS/((double)(end.tv_sec - start.tv_sec) + (double)(end.tv_usec - start.tv_usec) / 1000000.0));
  return 0;
}

My Squeeze SE Linux Repository

deb http://www.coker.com.au squeeze selinux

I have an Apt repository for Squeeze SE Linux packages at the above URL. Currently it contains a modified version of ffmpeg that doesn’t need execmod access on i386, and a fix for the labeling of /dev/xen on systems that use devtmpfs, as reported in bug #597403. I will keep updating this repository for any SE Linux related bugs that won’t get fixed in Squeeze.

Is there any interest in architectures other than i386 and AMD64?