LUV Meeting July 2008

At the last two meetings of LUV [1] I’ve given away old hardware. This month I gave away a bunch of old PCI and AGP video cards, a heap of PC power cables, and some magnets (which I received for free because they were in defective toys that could seriously injure or kill children). One new member was particularly happy that at the first meeting he attended he received some free hardware (I hope it works – most of that stuff hasn’t been tested for over a year and I expect that some would fail). Also there was another guy giving away hardware, so I might have started a trend of giving away unused hardware at meetings (he was giving away some new stuff in the original boxes, mostly USB and firewire cables).

For a long time (many years) free textbooks have been given away at LUV meetings. One member reviews books and then gives them away after he has read them.

At the meeting Ralph Becket gave a presentation on the Mercury functional language. It was interesting to note that Mercury can give performance that is close to C (within 80%) on LZW compression (which is apparently used as a benchmark for comparing languages). Given the number of reasonably popular languages which don’t give nearly that level of performance I think that this is quite a good result.

After the meeting Richard Keech demonstrated his electric car. It’s a Hyundai Getz which has had the engine replaced by an electric motor but which still uses the manual gearbox. Richard did a bit of driving around with various LUV members as passengers to demonstrate what the car can do. Unfortunately I didn’t get a chance to be involved in that, so I’ll have to do so next time I meet him. One thing to note is that Richard’s car was not built that way by Hyundai, it was a custom conversion job. The down-side to this of course is that it would have cost significantly more than a factory-built vehicle with the same technology. One design trade-off is that Richard had batteries installed in the place for a spare tire. Last year the RACV magazine published a letter I wrote suggesting that small cars should be designed without a spare tire and that owners of such cars should rely on the RACV to support them if they get a flat tire [2]. My opinion has not changed in the last year: I still think that cars which are driven in urban areas don’t really need spare tires, so I don’t think that Richard is losing anything in this regard.

The motor driving Richard’s car runs on three-phase AC and a solid-state inverter is used to convert 185V DC to about the same voltage at three phase AC (I didn’t write notes so I’m running from memory). Apparently on long drives the inverter gets cooler rather than hotter – I had expected that there would be enough inefficiency in the process of converting DC to AC that it would get hot.

In a previous conversation Richard told me that he can drive his car 75Km on one charge and that it takes him 8 hours to charge when using an Australian mains (240V) plug rated at 10A. When designing such a vehicle it would be trivial to make it use a 20A plug for a 4 hour charge or even a two-phase plug for even shorter charging (I’m sure that Richard could have requested these options if he wanted them). But an 8 hour charge allows the vehicle to be completely charged during a working day and the use of the most common type of plug (the type used in every home and office) means that it can be charged almost anywhere (the standard mains circuit used in Australia is rated at 15A so special wiring is needed for a 20A socket). There is such a power point mounted on the outside of my house not too far from where a visitor could park their car. I anticipate that in a few years time it will not be uncommon for people who visit me to charge their car during their visit. Richard’s ratio of an hour of charge to almost 10Km of driving means that someone who visits for dinner could get enough charge into their car to allow for 30Km of driving before they leave. 30Km is about the driving distance to go from my house to a location on the other side of the city that is just outside the main urban area, so probably at least half of Melbourne’s population lives within a 30Km driving distance from my house. Not that I expect friends to arrive at my house with their car battery almost flat, but it does make it easier to plan a journey if you know that at point A you will be able to get enough charge to get you to point B.
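The charging arithmetic above can be checked with a short sketch. It uses only the figures quoted in this post (75km range, 8 hour charge, 240V, 10A); the function names are made up for illustration, and the energy figure is an upper bound that assumes the socket delivers its full rating for the whole charge and ignores charger losses.

```python
# Figures quoted in the post (recalled from conversation, so approximate).
RANGE_KM = 75.0          # driving range on a full charge
FULL_CHARGE_HOURS = 8.0  # time for a full charge from a 10A socket
MAINS_VOLTS = 240.0
SOCKET_AMPS = 10.0


def km_per_charge_hour(range_km=RANGE_KM, hours=FULL_CHARGE_HOURS):
    """Kilometres of driving gained per hour on the charger."""
    return range_km / hours


def max_energy_kwh(volts=MAINS_VOLTS, amps=SOCKET_AMPS, hours=FULL_CHARGE_HOURS):
    """Upper bound on energy drawn from the socket, ignoring charger losses."""
    return volts * amps * hours / 1000.0


def hours_for_distance(distance_km):
    """Charging time needed to add a given driving distance."""
    return distance_km / km_per_charge_hour()


if __name__ == "__main__":
    print(f"{km_per_charge_hour():.1f} km per hour of charge")
    print(f"at most {max_energy_kwh():.1f} kWh per full charge")
    print(f"{hours_for_distance(30):.1f} hours of charging for 30 km")
```

So the 30Km figure corresponds to a visit of a little over three hours, which is plausible for a dinner.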

I think it’s a good thing to have members of LUGs give things away to other people and to demonstrate technology that is of wide interest. I hope to see more of it.

The History of MS

Jeff Bailey writes about the last 26 years of Microsoft [1]. He gives Microsoft credit for “saving us from the TRS 80”, however CP/M-86 was also an option for the OS on the IBM PC [2]. If MS hadn’t produced MS-DOS for a lower price then CP/M would have been used (in those days CP/M and MS-DOS had the same features and essentially the same design). He notes the use of the Terminate and Stay Resident (TSR) [3] programs. As far as I recall the TSR operation was undocumented and was discovered by disassembling DOS (something that the modern MS EULAs forbid).

Intel designed the 8086 and 80286 CPUs to permit code written for an 8086 to run unchanged in “protected mode” on an 80286 (as noted in the Wikipedia page about the 80286 [4]). Basically all that you needed to do to write a DOS program with the potential of being run directly in protected mode (or easily ported) was to allocate memory by requesting it from the OS (not just assuming that every address above your heap was available to write on) and by addressing memory only by the segment register returned from the OS when allocating memory (IE not assuming that incrementing a segment register is equivalent to adding 16 to the offset). There were some programs written in such a manner which could run on both DOS and text-mode OS/2 (both 1.x and 2.x), I believe that such programs were linked differently. The term Fat Binary [5] is often used to refer to an executable which has binary code for multiple CPUs (EG PPC and M68K CPUs on the Macintosh), I believe that a similar concept was used for DOS / OS/2 programs but the main code of the application was shared. Also compilers which produce object code which doesn’t do nasty things could have their object code linked to run in protected mode. Some people produced a set of libraries that allowed linking Borland Turbo Pascal code to run as OS/2 16bit text-mode applications.
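The segment register assumption mentioned above can be illustrated with a small sketch. In real mode the 8086 forms a 20-bit address as segment × 16 + offset, so adding one to a segment register is the same as adding 16 to the offset. In 80286 protected mode the segment register holds a selector that indexes a descriptor table, so that arithmetic no longer holds. The function name here is made up for illustration.

```python
def real_mode_linear(segment, offset):
    """Linear address an 8086 computes for segment:offset (wraps at 1MB)."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


if __name__ == "__main__":
    a = real_mode_linear(0x1234, 0x0010)
    b = real_mode_linear(0x1235, 0x0000)
    # Both forms name the same byte in real mode; a program relying on
    # this would break when the segment value is a protected-mode selector.
    print(hex(a), hex(b), a == b)
```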

The fact that OS/2 (the protected-mode preemptively multi-tasking DOS) didn’t succeed in the market was largely due to MS. I never used Windows/386 (a version of Windows 2.x) but used Windows 3.0 a lot. Windows 3.0 ran in three modes, “Real Mode” (8086), “Standard Mode” (80286), and “Enhanced Mode” (80386). Real Mode was used for 8086 and 8088 CPUs, for 80286 systems if you needed to run one DOS program (there was no memory for running more than one), and for creating or adjusting the swap-file size for an 80386 system (if your 80386 system didn’t have enough swap you had to shut everything down, start Real Mode, adjust the swap file, and then start it again in Enhanced Mode). Standard Mode was the best mode for running Windows programs (apart from the badly written ones which only ran on Real Mode), but due to the bad practices implemented by almost everyone who wrote DOS programs MS didn’t even try to run DOS programs in 286 protected mode and thus Standard Mode didn’t support DOS programs. Enhanced Mode allowed multitasking DOS programs but as hardly anyone had an 80386 class system at that time it didn’t get much use.

It was just before the release of Windows 3.1 that I decided to never again use Windows unless I was paid to do so. I was at an MS presentation about Windows 3.1 and after the marketing stuff they had a technical Q/A session. The questions were generally about how to work around bugs in MS software (mainly Windows 3.0) and the MS people had a very detailed list of work-arounds. Someone asked “why don’t you just fix those bugs” and we were told “it’s easier to teach you how to work around them than to fix them”. I left the presentation before it finished, went straight home and deleted Windows from my computer. I am not going to use software written by people with such a poor attitude if given a choice.

After that I ran the DOS multi-tasker DesqView [6] until OS/2 2.0 was released. DesqView allowed multitasking well written DOS programs in real mode, as well as multitasking less well behaved DOS programs with more memory use on an 80386 or better CPU. Quarterdeck was the first company to discover that almost 64K of address space could be used above the 1MB boundary from real mode on an 80286 (a significant benefit when you were limited to 640K of RAM).
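The “almost 64K” figure follows from real-mode address arithmetic. With segment 0xFFFF the offsets from 0x0010 to 0xFFFF land above the 1MB boundary, provided the CPU has more than 20 address lines and the A20 line is enabled (on an 8086 these addresses wrap around to zero instead). A minimal sketch, with a made-up function name:

```python
ONE_MB = 1 << 20


def linear_no_wrap(segment, offset):
    """Real-mode address on an 80286 or later with A20 enabled (no 1MB wrap)."""
    return (segment << 4) + offset


if __name__ == "__main__":
    first = linear_no_wrap(0xFFFF, 0x0010)  # first byte at or above 1MB
    last = linear_no_wrap(0xFFFF, 0xFFFF)   # highest reachable byte
    usable = last - ONE_MB + 1
    print(hex(first), hex(last), usable)
```

That gives 65,520 bytes, which is 16 bytes short of 64K.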

OS/2 [7] 2.x was described as “A Better DOS than DOS, a Better Windows than Windows”. That claim seemed accurate to me. I could run DOS VM86 sessions under OS/2 which could do things that even DesqView couldn’t manage (such as having a non-graphical DOS session with 716K of base memory in one window and a graphical DOS session in another). I could also run combinations of Windows programs that could not run under MS Windows (such as badly written Windows programs that needed Real Mode as well as programs that needed the amount of memory that only Standard or Enhanced mode could provide).

Back to Bill Gates, I recently read a blog post Eight Years of Wrongness [5] which described how Steve Ballmer has failed MS stockholders by his poor management. It seems that he paid more attention to fighting Linux, implementing Digital Restrictions Management (DRM), and generally trying to avoid compatibility with other software than to actually making money. While this could be seen as a tribute to Bill Gates (Steve Ballmer couldn’t do the job as well), I think that Bill would have made the same mistakes for the same reasons. MS has always had a history of treating its customers as the enemy.

Jeff suggests that we should learn from MS that the freedom to tinker is important as is access to our data. These are good points but another important point is that we need to develop software that does what users want and acts primarily in the best interests of the users. Overall I think that free software is quite well written in regard to acting on behalf of the users. The issue we have is in determining who the “user” is, whether it’s a developer, sys-admin, or someone who wants to just play games and do some word-processing.

The New DNS Mess

The Age has an interesting article about proposed DNS changes [1].

Apparently ICANN is going to sell top level DNS names and a prediction has been made that they will cost more than $100,000 each. A suggestion for a potential use of this would be to have cities as top level names (a .paris TLD was given as an example). The problem with this is that they are not unique. Countries that were colonised in recent times (such as the US and Australia) have many names copied from Europe. It will be interesting to see how they plan to determine which of the cities registers names, for the .paris example I’m sure that the council of Paris Illinois [2] would love to register it. Does the oldest city win an international trademark dispute over a TLD?

The current situation is that French law unambiguously determines who gets to register paris.fr and someone who sees the URL will have no confusion as to what it means (providing that they know that fr is the ISO country code for France).

As well as city names there are region names which are used for products. Australian vineyards produce a lot of sparkling wine that they like to call Champagne and a lot of fortified wine that they like to call Port. There are ongoing battles about how these names can be used and it seems likely to me that the Australian wine industry will change to other terms. But in the mean-time it would be interesting if .champagne and .port were registered by Australian companies. The fuss that would surely cause would probably give enough free publicity to the Australian wine industry to justify an investment of $200,000 on TLDs.

The concern that is cited by business people (including the client who forwarded me the URL and requested my comments) is that of the expense of protecting a brand. Currently if you have a company named “Example” you can register example.com, example.net, and example.org if you are feeling enthusiastic. Then if you have a significant presence in any country you could register your name in the DNS hierarchy for that country (large companies try to register their name in every country – for a multinational registering ~200 domains is not really difficult or expensive). But if anyone can create a new TLD (and therefore if new ones are liable to be created at any time) it becomes much more difficult. For example if a new TLD was created every day then a multi-national corporation would need to assign an employee to work full-time on investigating the new TLDs and deciding which ones to use. A small company that has an international presence (IE an Internet company) would just lose a significant amount of control over their name.

I don’t believe that this is as much of a concern as some people (such as my client) do. Currently I could register a phone line with a listed name indicating that it belongs to the Melbourne branch of a multi-national corporation. I don’t expect that Telstra would stop me, but the benefit from doing this would be minimal (probably someone who attempted fraud using such means would not gain much and would get shut down quickly). I don’t think that a DNS name registered under a .melbourne TLD would cause much more harm than a phone number listed in the Melbourne phone book. Incidentally for readers from the US, I’m thinking of Melbourne in Australia not a city of the same name in the US – yet another example of a name conflict.

Now I believe that it would be better if small companies didn’t use .com domains. The use of country specific names relevant to where they work is more appropriate and technically easier to implement. I don’t regret registering coker.com.au instead of some name in another country or in the .com hierarchy. Things would probably be working better right now if a .com domain name had always cost $100,000 and there were only a few dozen companies that had registered them. But we have to go with the flow sometimes, so I have registered RussellCoker.com.

Now when considering the merit of an idea we should consider who benefits and who (if anyone) loses. Ideally we would choose options that provide benefits for many people and losses for few (or none). In this case it seems that the suggested changes would be a loss for corporations that want to protect their brand, a loss for end-users who just want to find something without confusion, and provide more benefits for domain-squatters than anyone else.

Maybe I should register icann.port and icann.champagne if those TLDs are registered in Australia and impersonate ICANN. ;)

Kernel Security vs Uptime

For best system security you want to apply kernel security patches ASAP. For an attacker, gaining root access to a machine is often a two step process: the first step is to exploit a weakness in a non-root daemon or take over a user account, and the second step is to compromise the kernel to gain root access. So even if a machine is not used for providing public shell access or any other task which involves giving user access to potentially hostile people, having the kernel be secure is an important part of system security.

One thing that gets little consideration is the effect of applying security updates on overall uptime. Over the last year there have been 14 security related updates (I count a silent data loss along with security issues) to the main Debian Etch kernel package. Of those 14, it seems that if you don’t use DCCP, NAT for CIFS or SNMP, IA64, or the dialout group, then you will only need to patch for issues 2, 3 (for SMP machines), 4, 5, 7 (sound drivers get loaded on all machines by default), 9, 10, 11, 12, 13, and 14.

This means 11 reboots a year for SMP machines and 10 a year for uni-processor machines. If a reboot takes three minutes (which is an optimistic assumption) then that would be 30 or 33 minutes of downtime a year due to kernel upgrades. In terms of uptime we talk about the number of “nines”, where the ideal is generally regarded as “five nines” or 99.999% uptime. 33 minutes of downtime a year for kernel upgrades means that you get 99.993% uptime (which is “four nines”). If a reboot takes six minutes (which is not uncommon for servers) then it’s 99.987% uptime (“three nines”).
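The uptime figures above can be reproduced with a short sketch. It assumes a 365 day year and counts only downtime from reboots; the function names are made up for illustration.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_fraction(reboots_per_year, minutes_per_reboot):
    """Fraction of the year spent rebooting."""
    return reboots_per_year * minutes_per_reboot / MINUTES_PER_YEAR


def uptime_percent(reboots_per_year, minutes_per_reboot):
    """Uptime as a percentage of the year."""
    return 100.0 * (1.0 - downtime_fraction(reboots_per_year, minutes_per_reboot))


def nines(reboots_per_year, minutes_per_reboot):
    """Number of leading nines in the uptime figure (99.99x% gives 4)."""
    fraction = downtime_fraction(reboots_per_year, minutes_per_reboot)
    if fraction <= 0:
        raise ValueError("no downtime: the number of nines is unbounded")
    return math.floor(-math.log10(fraction))


def allowed_downtime_minutes(n):
    """Yearly downtime budget for n nines of uptime."""
    return MINUTES_PER_YEAR * 10.0 ** -n


if __name__ == "__main__":
    for reboots, minutes in [(10, 3), (11, 3), (11, 6)]:
        print(reboots, minutes,
              f"{uptime_percent(reboots, minutes):.3f}%",
              nines(reboots, minutes))
    print(f"five nines allows {allowed_downtime_minutes(5):.2f} minutes a year")
```

The budget for five nines works out to a little over 5 minutes a year, which is why a single slow reboot is enough to miss it.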

While it doesn’t seem likely to affect the number of “nines” you get, not using SMP has the potential to avoid future security issues. So it seems that when using a Xen (or other virtualisation technology) assigning only one CPU to the DomUs that don’t need any more could improve uptime for them.

For Xen Dom0’s which don’t have local users or daemons, don’t use DCCP, NAT for CIFS or SNMP, wireless, CIFS, JFFS2, PPPoE, bluetooth, H.323 or SCTP connection tracking, then only issue 11 applies. However for “five nines” you need to have 5 minutes of downtime a year or less. It seems unlikely that a busy Xen server can be rebooted in 5 minutes as all the DomUs need to have their memory saved to disk (writing out the data to disk and reading it back in after a reboot will probably take at least a couple of minutes) or they need to be shutdown and booted again after the Dom0 is rebooted (which is a good procedure if the security fix affects both Dom0 and DomU use), and such shutdowns and reboots of DomU’s will take a lot of time.

Based on the past year, it seems that a system running as a basic server might get “four nines” if configured for a fast boot (it’s surprising that no-one seems to be talking about recent improvements to the speed of booting as high-availability features) and if the boot is slower then you are looking at “three nines”. For a Xen server unless you have some sort of cluster it seems that “five nines” is unattainable due to reboot times if there is one issue a year, but “four nines” should be easy to get.

Now while the 14 issues over the last year for the kernel seem likely to be a pattern that will continue, the one issue which affects Xen may not be representative (small numbers are not statistically significant). I feel confident in predicting a need for between 5 and 20 kernel updates next year due to kernel security issues, but I would not be prepared to bet on whether the number of issues affecting Xen will be 0, 1, or 4 (it seems unlikely that there would be 5 or more).

I will write a future post about some strategies for mitigating these issues.

Here is my summary of the Debian kernel linux-image-2.6.18-6-686 (Etch kernel) security updates according to its changelog. They are not in chronological order; they follow the order of the changelog file:

Dell PowerEdge T105

Today I received a Dell PowerEdge T105 for use by a client. My client had some servers for development and testing hosted in a server room at significant expense. They also needed an offsite backup of critical data. So I suggested that they buy a cheap server-class machine, put it on a fast ADSL connection at their home, and use Xen DomU’s on that for development, testing, and backup. My client liked the concept but didn’t like the idea of having a server in his home.

So I’m going to run the server from my home. I selected a Dell PowerEdge tower system because it’s the cheapest server-class machine that can be purchased new. I have a slight preference for HP hardware but HP gear is probably more expensive and they are not a customer focussed company (they couldn’t even give me a price).

So exactly a week after placing my order I received my shiny new Dell system, and it didn’t work. I booted a CentOS CD and ran “memtest” and the machine performed a hard reset. When it booted again it informed me that the event log had a message, and the message was “Uncorrectable ECC Error” with extra data of “DIMM 2,2”. While it sucks quite badly to receive a new machine that doesn’t work, that’s about the best result you can hope for when you have a serious error on the motherboard or the RAM. A machine without ECC memory would probably just randomly crash every so often and maybe lose data (see my previous post on the relative merits of ECC RAM and RAID [1]).

So I phoned up Dell (it’s a pity that their “Packing Slip” was a low quality photocopy which didn’t allow me to read their phone number and that the shipping box also didn’t include the number so I had to look them up on the web) to get technical support. Once we had established that by removing the DIMMs and reinserting them I had proved that there was a hardware fault they agreed to send out a technician with a replacement motherboard and RAM.

I’m now glad that I bought the RAM from Dell. Dell’s business model seems to revolve around low base prices for hardware and then extremely high prices for extras, for example Dell sells 1TB SATA disks for $818.40 while MSY [1] has them for $215 or $233 depending on brand.

When I get the machine working I will buy two 1TB disks from MSY (or another company with similar prices). Not only does that save some money but it also means that I can get different brands of disk. I believe that having different brands of hard disk in a RAID-1 array will decrease the probability of having them both fail at the same time.

One interesting thing about the PowerEdge T105 is that Dell will only sell two disks for it, but it has four SATA connectors on the motherboard, one is used for a SATA DVD player so it would be easy to support three disks. Four disks could be installed if a PCIe SATA controller was used (one in the space for a FDD and another in the space for a second CD/DVD drive), and if you were prepared to go without a CD/DVD drive then five internal disks could probably work. But without any special hardware the space for a second CD/DVD drive is just begging to be used for a third hard disk, most servers only use the primary CD/DVD drive for installing the OS and I expect that the demand for two CD/DVD drives in a server is extremely low. Personally I would prefer it if servers shipped with external USB DVD drives for installing the OS. Then when I install a server room I could leave one or two drives there in case a system recovery is needed and use the rest for desktop machines.

One thing that they seem to have messed up is the lack of a filter for the air intake fan at the front of the case. The Opteron CPU has a fan that’s about 11cm wide which sucks in air from the front of the machine, in front of that fan there is a 4cm gap which would nicely fit a little sponge filter. Either they messed up the design or somehow my air filter got lost in transit.

Incidentally if you want to buy from Dell in Australia then you need to configure your OS to not use ECN (Explicit Congestion Notification [2]) as the Dell web servers used for sales reject all connections from hosts with ECN enabled. It’s interesting that the web servers used for providing promotional information work fine with ECN and it’s only if you want to buy that it bites you.
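On Linux the ECN setting is exposed as the net.ipv4.tcp_ecn sysctl, so one way to check whether a machine would be affected is to read /proc/sys/net/ipv4/tcp_ecn. The sketch below assumes a Linux host and uses the meanings documented for current kernels (older kernels may only use 0 and 1); the function names are made up for illustration. Writing 0 to the same file as root, or running `sysctl -w net.ipv4.tcp_ecn=0`, turns ECN off.

```python
from pathlib import Path

TCP_ECN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_ecn")

# Meanings as documented for current Linux kernels.
MEANINGS = {
    0: "disabled: ECN is neither requested nor accepted",
    1: "enabled: ECN is requested on outgoing connections and accepted on incoming ones",
    2: "passive: ECN is accepted on incoming connections but not requested on outgoing ones",
}


def describe_tcp_ecn(raw):
    """Turn the text read from the sysctl file into a description."""
    value = int(raw.strip())
    return MEANINGS.get(value, f"unrecognised value {value}")


def current_tcp_ecn(path=TCP_ECN_SYSCTL):
    """Return the raw sysctl text, or None if it cannot be read (e.g. not Linux)."""
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    raw = current_tcp_ecn()
    if raw is None:
        print("tcp_ecn sysctl not available on this system")
    else:
        print(describe_tcp_ecn(raw))
```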

But in spite of these issues, I am still happy with Dell overall. Their machine was DOA, that happens sometimes and the next day service is good (NB I didn’t pay extra for better service). I expect that they will fix it tomorrow and I’ll buy more of their gear in future.

Update: I forgot to mention that Dell shipped the machine with two power cables. While two power cables is a good thing for the more expensive servers that have redundant PSUs, for a machine with only one PSU it’s a bit of a waste. For some time I’ve been collecting computer power cables faster than I’ve been using them (due to machines dying and due to clients who want machines but already have spare power cables). So I’ve started giving them away at meetings of my local LUG. At the last meeting I gave away a bag of power cables and I plan to keep taking cables to the meetings until people stop accepting them.

Safety of Child Seats

I have just watched an interesting lecture by Steven Levitt about car safety for children in the 2-6 age range [1]. The evidence he presents shows that the benefits of child seats for children in that age range are at best insignificant and that in some corner cases (EG rear impacts) the child seat may give a worse result than an adult seat belt!

He advocates a 5-point harness [2] for children in the 2-6 age range that is based on a standard adult seat and seems to be advocating a child “booster seat” integrated into the adult seat (which approximates the booster seats offered by some recent cars such as the VW Passat). He has a picture of a child in a child-sized 5-point harness to illustrate his point. But one thing that should be considered is the benefit of a 5-point harness for adults. Race car drivers use 5-point harnesses, and I wonder how the probability of a race car driver dying during the course of their employment compares with the probability of an average adult dying while doing regular driving. I also wonder how a 5-point harness compares to a 3-point harness with a pre-tensioner; it seems quite possible that a 5-point harness would be cheaper and safer than the 3-point harness with pre-tensioner that is found in all the most expensive cars manufactured in the last few years.

He believes (based on tests with crash-test dummies) that part of the problem is that the child seat will move in an accident (it’s attached to a soft seat). It seems that one potential solution to this is to have child seats that firmly attach to some solid part of a vehicle. I had previously suggested that child seats which replace existing seats as an option from the manufacturer would be a good idea [3].

But there is a good option for making better child seats for existing vehicles. It is becoming common in the “people mover” market segment to design vehicles with removable seats. For example the Kia Carnival has three seats in the middle row which are removable and which attach to four steel bars in the floor. It should not be difficult to design a child seat which attaches to those bars and could therefore be plugged in to a Carnival in a matter of minutes. The Carnival is designed to have the mid row seats installed or removed easily and safely by someone who is untrained, while for comparison it is recommended that a regular child seat should only be installed by a trained professional (IE your regular mechanic can’t do it).

Car vs Public Transport to Save Money

I’ve just been considering when it’s best to drive and when it’s best to take public transport to save money. My old car (1999 VW Passat) uses 12.8L/100km which at $1.65 per liter means 21.1 cents per km on fuel. A new set of tires costs $900 and assuming that they last 20,000km will cost 4.5 cents per km. A routine service every 10,000km will cost about $300 so that’s another 3 cents per km. While it’s difficult to estimate the cost per kilometer of replacing parts that wear out, it seems reasonable to assume that over 100,000km of driving at least $20,000 will be spent on parts and the labor required to install them. This adds another 20 cents per km.
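The per-km figures above can be tallied with a short sketch. The inputs are the estimates from this post; the function name is made up for illustration.

```python
def cents_per_km(dollars, km):
    """Convert a cost spread over a distance into cents per km."""
    return dollars * 100.0 / km


# Estimates from the post.
costs = {
    "fuel": cents_per_km(12.8 * 1.65, 100),    # 12.8L per 100km at $1.65 per liter
    "tires": cents_per_km(900, 20_000),        # $900 set lasting 20,000km
    "service": cents_per_km(300, 10_000),      # $300 service every 10,000km
    "parts": cents_per_km(20_000, 100_000),    # $20,000 of parts and labor per 100,000km
}
total = sum(costs.values())

if __name__ == "__main__":
    for name, cents in costs.items():
        print(f"{name:8s} {cents:5.1f} c/km  ({100 * cents / total:4.1f}% of total)")
    print(f"{'total':8s} {total:5.1f} c/km")
```

On these estimates fuel is about 43% of the marginal cost, with parts and labor nearly as large.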

The total then would be 48.6 cents per km. The tax deduction for my car is 70 cents per km of business use, so if my estimates are correct then the tax deductions exceed the marginal costs of running a vehicle (the costs of registration, insurance, and asset depreciation however make the car significantly more expensive than that – see my previous post about the costs of owning a small car for more details [1]). So for business use the marginal cost after tax deductions are counted is probably about 14 cents per km.

Now a 2 hour ride on Melbourne’s public transport costs $2.76 (if you buy a 10 trip ticket). For business use that’s probably the equivalent cost to 20km of driving. The route I take when driving to the city center is about 8km, that gets me to the nearest edge of the CBD (Central Business District) and doesn’t count the amount of driving needed to find a place to park. This means the absolute minimum distance I would drive when going to the CBD would be 16km. The distance I would drive on a return trip to the furthest part of the CBD would be almost exactly 20km. So on a short visit to the central city area I might save money by using my car if it’s a business trip and I tax-deduct the distance driven. A daily ticket for the public transport is equivalent to two 2 hour tickets (if you have a 10 trip ticket then if you use it outside the two hour period it becomes a daily ticket and uses a second credit). If I could park my car for an out of pocket expense of less than $2.76 (while I can tax-deduct private parking it’s so horribly expensive that it would cost at least $5 after deductions are counted) then I could possibly save money by driving. There were some 4 hour public parking spots that cost $2.

So it seems that for a basic trip to the CBD it’s more expensive to use a car than to take a tram, even when car expenses are tax deductible. For personal use a 5.7km journey would cost as much as a 2 hour ticket for public transport and a 11.4km journey would cost as much as a daily ticket. The fact that public transport is the economical way to travel for such short distances is quite surprising. In the past I had thought of using a tram ticket as an immediate cost while considering a short car drive as costing almost nothing (probably because the expense comes days later for petrol and years later for servicing the car).
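The break-even distances can be reproduced with a short sketch. It uses the figures from this post (48.6 cents per km for personal use, roughly 14 cents per km for business use after deductions, $2.76 for a 2 hour ticket) and ignores parking; the function name is made up for illustration.

```python
TWO_HOUR_TICKET_DOLLARS = 2.76
DAILY_TICKET_DOLLARS = 2 * TWO_HOUR_TICKET_DOLLARS  # a daily ticket uses two credits
PERSONAL_CENTS_PER_KM = 48.6
BUSINESS_CENTS_PER_KM = 14.0  # rough figure after tax deductions


def break_even_km(ticket_dollars, cents_per_km):
    """Distance at which driving costs the same as the ticket."""
    return ticket_dollars * 100.0 / cents_per_km


if __name__ == "__main__":
    print(f"personal, 2 hour ticket: {break_even_km(TWO_HOUR_TICKET_DOLLARS, PERSONAL_CENTS_PER_KM):.1f} km")
    print(f"personal, daily ticket:  {break_even_km(DAILY_TICKET_DOLLARS, PERSONAL_CENTS_PER_KM):.1f} km")
    print(f"business, 2 hour ticket: {break_even_km(TWO_HOUR_TICKET_DOLLARS, BUSINESS_CENTS_PER_KM):.1f} km")
```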

Also while there is a lot of media attention recently about petrol prices, it seems that for me at least petrol is still less than half the marginal cost of running a car. Cars are being advertised on the basis of how little fuel they use to save money, but cars that require less service might actually save more money. There are many cars that use less fuel than a VW Passat, and also many cars that are less expensive to repair. It seems that perhaps the imported turbo-Diesel cars which are becoming popular due to their fuel use may actually be more expensive than locally manufactured small cars which have cheap parts.

Update: Changed “Km” to “km” as suggested by Lars Wirzenius.

Links June 2008

Paul Graham has recently published an essay titled How To Disagree [1]. One form that he didn’t mention is to claim that a disagreement is a matter of opinion. Describing a disagreement about an issue which can be proved as a matter of opinion is a commonly used method of avoiding the need to offer any facts or analysis.

Sam Varghese published an article about the Debian OpenSSL issue and quoted me [2].

The Basic AI Drives [3] is an interesting paper about what might motivate an AI and how AIs might modify themselves to better achieve their goals. It also has some insights into addiction and other vulnerabilities in human motivation.

It seems that BeOS [4] is not entirely dead. The Haiku OS project aims to develop an open source OS for desktop computing based on BeOS [5]. It’s not nearly usable for end-users yet, but they have VMware snapshots that can be used for development.

On my Document Blog I have described how to debug POP problems with the telnet command [6]. Some users might read this and help me fix their email problems faster. I know that most users won’t be able to read this, but the number of people who can use it will surely be a lot greater than the number of people who can read the RFCs…

Singularity tales is an amusing collection of short stories [7] about the Technological Singularity [8].

A summary of the banana situation [9]. Briefly describes how “banana republics” work and the fact that a new variety of the Panama disease is spreading through banana producing countries. Given the links between despotic regimes and banana production it’s surprising that no-one is trying to spread the disease faster. Maybe Panama disease could do for South America what the Boll weevil did for the south of the US [10].

Jeff Dean gives an interesting talk about the Google server architecture [11]. One thing I wonder about is whether they have experimented with increasing the chunk size over the years. It seems that the contiguous IO performance of disks has been steadily increasing while the seek performance has stayed much the same, and the dramatic increases in the amount of RAM you can get for any given amount of money over the last few years have been amazing. So it seems that now it’s possible to read larger chunks of data in the same amount of time and more easily store such large chunks in memory.
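To put rough numbers on the chunk size question, here is a minimal sketch in Python. It assumes a simple model where reading one chunk costs one seek plus the contiguous transfer time. The 64MB chunk, the 10ms seek, and the 25MB/s and 100MB/s disk speeds are made-up illustrative figures, not numbers from the talk, and the function names are my own.

```python
def read_time_s(chunk_mb, seek_ms, throughput_mb_s):
    """Time to read one chunk: one seek plus the contiguous transfer."""
    return seek_ms / 1000.0 + chunk_mb / throughput_mb_s


def largest_chunk_mb(time_budget_s, seek_ms, throughput_mb_s):
    """Largest chunk that can be read within the given time budget."""
    return max(0.0, (time_budget_s - seek_ms / 1000.0) * throughput_mb_s)


if __name__ == "__main__":
    # Hypothetical older disk: 10ms seek, 25MB/s contiguous transfer.
    old_time = read_time_s(64, seek_ms=10, throughput_mb_s=25)
    # Hypothetical newer disk: same seek time, 100MB/s contiguous transfer.
    new_chunk = largest_chunk_mb(old_time, seek_ms=10, throughput_mb_s=100)
    print(f"old disk reads a 64MB chunk in {old_time:.2f}s")
    print(f"new disk reads a {new_chunk:.0f}MB chunk in the same time")
```

Under those assumed figures the seek is a tiny fraction of the read time, so quadrupling the contiguous transfer rate roughly quadruples the chunk size that fits in the same time budget.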

Solving Rubik’s Cube and IO Bandwidth

Solving Rubik’s Cube by treating disk as RAM: Gene Cooperman gave an interesting talk at Google about how he proved that Rubik’s Cube can be solved in 26 moves and how treating disk as RAM was essential for this. The Google talk is on YouTube [1]. I recommend that you read the ACM paper he wrote with Daniel Kunkle before watching the talk. Incidentally, due to the resolution of YouTube it would have been good if the notes had fewer than 10 lines per screen.

Here is the main page for the Rubik’s Cube project with source and math [2]. Note that I haven’t been interested enough to read the source, but I’m including the link for reference.

The main concept is that modern disks can deliver up to 100MB/s for contiguous IO (I presume that’s from the outer tracks, I suspect that the inner tracks wouldn’t deliver that speed). Get 50 disks running at the same speed and you get 5GB/s for contiguous IO, which is a typical speed for RAM. Of course that RAM speed is for a single system, while getting 50 disks running at that speed will require either a well-tuned system from SGI (who apparently achieved such speeds on a single process on a single system many years ago, but I can’t find a reference) or 5+ machines from anyone else. The configuration that Gene describes apparently involves a network of machines with one disk each; he takes advantage of hardware purchased for other tasks (where the disks are mostly wasted).
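The arithmetic behind that claim can be sketched in a few lines of Python. This is only a back-of-the-envelope model that assumes every disk sustains its full contiguous rate at the same time and ignores network and CPU overhead. The function names are my own, and the 25TB data size is just 50 disks of 500GB each, used for illustration.

```python
def aggregate_mb_s(disks, per_disk_mb_s):
    """Combined contiguous IO rate if every disk streams at full speed."""
    return disks * per_disk_mb_s


def hours_to_read(data_tb, bandwidth_mb_s):
    """Hours needed for one sequential pass over the data (1TB = 10^6 MB)."""
    return data_tb * 1_000_000 / bandwidth_mb_s / 3600


if __name__ == "__main__":
    total = aggregate_mb_s(disks=50, per_disk_mb_s=100)
    print(f"aggregate rate: {total}MB/s ({total / 1000:g}GB/s)")
    # Assumed data size: 50 disks of 500GB each, i.e. 25TB in total.
    print(f"one pass over 25TB: {hours_to_read(25, total):.1f} hours")
```

The point of the sketch is that the aggregate rate scales linearly with the number of disks only while access stays contiguous; once seeks dominate, the per-disk figure drops far below 100MB/s and the comparison with RAM no longer holds.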

I believe that SGI sells Altix machines which can have enough RAM to store all that data. It is NUMA RAM, but even the “slow” access to RAM on another NUMA node should in most cases be a lot faster than disk for sequential access, and when there are seeks the benefits of NUMA RAM over disk will be dramatic. Of course the cost of a large NUMA installation is also significant, while a set of 50 quad-core machines with 500GB disks is affordable for some home users.

TED – Defining Words

I recently joined the community based around the TED conference [1]. The TED conference is expensive ($6000US) and has a long waiting list (the 2009 conference is sold out) so it seems quite unlikely that I will ever attend one. But signing up to the web site is easy and might offer some benefit.

optional words to define yourself as a TED member

One thing that interested me was that part of the sign-up process requests that you select up to 10 words from the list above to describe yourself. Some of the words seem almost mandatory for anyone who is interested in what TED has to offer (I find it difficult to imagine someone declaring that they are not an “activist” or a “change agent” while wanting to be involved with TED in any way). The range of words also seems quite strange: there are some professions mixed with educational status, marital status, and religion. The way it is laid out would tend to encourage people to decide which aspects of their life are more important: is career, marital status, or religion more important?

Given the nature of TED I’m wondering whether they intentionally did a bad job of that part of the site design to encourage people to think about these issues.

It seems to me that a better way of doing this would be to provide a few suggestions and allow people to fill in text fields with their own values. Even defining marital status can require many choices and there is no limit to the number of religions and careers. If you try to make a comprehensive list then you will end up doing what British Airways did with their frequent flyer membership application page [2]. Even disregarding the choices of spelling (EG Admiral vs Admiraal and Brig Gen vs Brig General vs Brigadier General) the British Airways list is unreasonably long, and I doubt that anyone who deserves the title “Her Magesty” or “His Holyness” is going to be interested in frequent flyer points.

Also I wonder which of the entries in the TED list would be most commonly accepted by the free software community. It seems that activist and technologist would be quite popular.

Here is the list in text form for those who can’t get the picture above:
Continue reading TED – Defining Words