The Latest Dick Smith Catalogue

I was just reading the latest catalogue from Dick Smith Electronics (a chain of computer stores in Australia).

The first interesting thing that I noticed is that laptops are cheaper than desktops in all categories. For any combination of CPU power and RAM in a desktop system I can see a laptop advertised with similar specs at a lower price. Of course you won’t get such a big display in a laptop, but big displays don’t always work well. I just read an interesting review of LCD display technology [1] which states (among other things) that TN panels (which provide poor colors and a limited viewing angle) are used in all current 22 inch monitors! They state that the Dell 2007WFP (which I own) comes in two versions; I was fortunate to get the one that doesn’t suck. Based on that review I think I’ll refrain from further monitor purchases until the technology gets sorted out and it becomes possible to reliably buy the better monitors at a decent price. The most expensive desktop system that Dick Smith advertised in their catalogue has a 22 inch monitor.

It seems that with desktop systems being more expensive an increasing number of home users will use laptops instead, which will of course change the economics of manufacture. Maybe the desktop computer is about to die out and be replaced by laptops, PDAs, and mobile phone type devices (blackberries etc).

Another interesting thing is an advert for a LASER pointer (it seems that they haven’t been banned as “terrorist weapons” yet). Being on special for a mere $27 is not the interesting part; what is interesting is that the advert claims it “projects up to 500m indoors”. I’m sure that will be handy if I ever have to give a presentation at the Airbus factory, but otherwise it seems quite unlikely that I will ever encounter a 500m indoor space.

The prices on digital cameras have been dropping consistently for some time. Now they are selling a Samsung S860 (8.1MP with 3* optical zoom) for $98. This is (according to the specs at least) a very powerful camera for a price that most people won’t think twice about. I expect that an increasing number of people will buy new digital cameras every year the way white-box enthusiasts buy new motherboards! Hopefully people will use services such as Freecycle [2] to dispose of all their old cameras, to both avoid pollution and get cameras into the hands of more people.

Very few monitors are being sold with resolutions greater than 2MP (1680*1050 is the highest you can get for a reasonable price). So an 8MP camera allows significant scope for cropping and resizing an image before publishing it on the web. Even the 4MP cameras that were on sale a few years ago (and which are probably being discarded now) are more than adequate for such use.

Combat Wasps

One of the many interesting ideas in Peter F. Hamilton’s Night’s Dawn series [1] is that of Combat Wasps. These are robots used in space combat which may be armed with some combination of projectile weapons, MASERs, thermo-nuclear and anti-matter weapons.

In a lot of science fiction the space combat is limited to capital ships. A large source of this problem is technological: the Star Trek process of making models of ships made it too expensive and time consuming to show lots of small craft. Shows such as Babylon 5 [2] have fighters, which make more sense. Sustaining life in space is difficult at the best of times and battles in space seem likely to have few if any survivors, so sending out fighters gives the capital ships a chance to survive. I suspect that a major motivating factor in the space battles in Babylon 5 was making them fit on a TV screen. Dramatic TV portrayal of small groups of fighters engaging in battle is an art that has been perfected over the course of 80+ years; it’s about showing individuals, and whether they are riders on horseback, pilots of biplanes, or space pilots it’s much the same.

But a reasonable analysis of the facts suggests that without some strange religious motive adopted by all parties in a war (as used in Dune [3]) the trend in warfare is to ever greater mechanisation.

So while a medium sized starship might only be able to carry dozens or even hundreds of crewed fighter craft, it could carry thousands of small robotic craft.

So the issue is how to use such robots effectively. It seems likely that an effective strategy would involve large numbers of robots performing different tasks: some would detonate thermo-nuclear weapons to clear enemies from an area while others would prepare to advance into the breach. The result would be a battle lasting seconds that involves too many robots to focus on as a group, while each individual robot matters too little for there to be any interest in following one. Therefore it just wouldn’t work on TV, and in a book a couple of sentences describe what would have been an epic battle if humans had done anything other than press the launch buttons.

One of the many things I would do if I had a lot more spare time would be to write a Combat Wasp simulator. There are already quite a number of computer games based on the idea of writing a program to control a robot and then having the robots do battle. This would be another variation on the theme but based in space.

In a comment on my previous post about programming and games for children [4], Don Marti suggests that an RTS game could allow programming the units. It seems to me that the current common settings for controlling units in RTS games (attack particular enemies, attack whichever enemies get in range, patrol, move to a location, retreat, and defend other units or strategic positions) are about as complex as you can get without getting to the full programming language stage. Then of course if you have any real programming language for a unit then changing it takes more time than an RTS game allows, and if the programming is good then there won’t be much for a human to do during the game anyway. So I can’t imagine much potential for anything between RTS and fully programmed games.
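
As a rough sketch of the difference (in Python, with an invented unit API – nothing from any real game), a fully programmed unit would have its behaviour decided by code like this rather than by picking one of the canned settings:

    # Hypothetical sketch of a "fully programmed" unit.  The Unit class and the
    # order vocabulary are invented for illustration; a real game would define
    # its own API for unit programs.
    import math

    class Unit:
        def __init__(self, x, y, health, ammo):
            self.x, self.y, self.health, self.ammo = x, y, health, ammo

    def distance(a, b):
        return math.hypot(a.x - b.x, a.y - b.y)

    def decide_action(me, enemies, friends):
        """Called once per game tick; returns this unit's order for the tick."""
        if me.health < 20 and friends:            # badly damaged: fall back to a friend
            return ("retreat_to", friends[0].x, friends[0].y)
        targets = sorted(enemies, key=lambda e: distance(me, e))
        if targets and me.ammo > 0 and distance(me, targets[0]) < 50:
            return ("attack", targets[0])         # engage the nearest enemy in range
        if targets:
            return ("move_towards", targets[0].x, targets[0].y)
        return ("patrol",)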

There is some interesting research being conducted by the US military in simulating large numbers of people in combat situations. I think that the techniques in question could be more productively used in determining which of the various science fiction ideas for space combat could be most effectively implemented.

Some RAID Issues

I just read an interesting paper titled An Analysis of Data Corruption in the Storage Stack [1]. It contains an analysis of the data from 1,530,000 disks running at NetApp customer sites. The amount of corruption is worrying, as is the amount of effort that is needed to detect it.

NetApp devices have regular “RAID scrubbing” which involves reading all data on all disks at some quiet time and making sure that the checksums match. They also store checksums of all written data. For “Enterprise” disks each sector stores 520 bytes, which means that a 4K data block is comprised of 8 sectors and has 64 bytes of storage for a checksum. For “Nearline” disks 9 sectors of 512 bytes are used to store a 4K data block and its checksum. The 64 byte checksum includes the identity of the block in question. The NetApp WAFL filesystem writes a block to a different location every time, which allows the storage of snapshots of old versions and also means that if the location that is read contains data from a different file (or a different version of the same file) then it is known to be corrupt (sometimes writes don’t make it to disk). Page 3 of the document describes this.
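
The important point is that the checksum covers the identity of the block as well as its contents, so a write that lands in the wrong place (or never reaches the disk) is detected when the block is read. A toy sketch of the idea in Python (the field layout is invented, it is not NetApp’s actual format):

    import hashlib, struct

    # Toy illustration of a checksum region that covers both the data and the
    # identity of a 4K block.  The layout is invented; NetApp's real on-disk
    # format is different.
    def make_checksum(data_4k, file_id, block_number):
        assert len(data_4k) == 4096
        identity = struct.pack('<QQ', file_id, block_number)
        # 16 bytes of identity + 32 bytes of SHA-256 = 48 bytes, within the 64 available
        return identity + hashlib.sha256(identity + data_4k).digest()

    def check_block(data_4k, checksum, expected_file_id, expected_block_number):
        identity = checksum[:16]
        if struct.unpack('<QQ', identity) != (expected_file_id, expected_block_number):
            return "lost or misdirected write"    # the data belongs to some other block
        if hashlib.sha256(identity + data_4k).digest() != checksum[16:]:
            return "corrupt data"
        return "ok"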

Page 13 has an analysis of error location and the fact that some disks are more likely to have errors at certain locations. They suggest configuring RAID stripes to be staggered so that you don’t have an entire stripe covering the bad spots on all disks in the array.

One thing that was not directly stated in the article is the connection between the different layers. On a Unix system with software RAID you have a RAID device and a filesystem layer on top of that, and (in Linux at least) there is no way for a filesystem driver to say “you gave me a bad version of that block, please give me a different one”. Block checksum errors at the filesystem level will often be caused by corruption that leaves the rest of the RAID stripe intact, which means that the stripe will have mismatching parity, but the RAID driver won’t know which disk has the error. If a filesystem did checksums on metadata (or data) blocks and the chunk size of the RAID was greater than the filesystem block size then when the filesystem detected an error a different version of the block could be generated from the parity.
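
To sketch what such co-operation could look like: with single parity RAID the “different version” of a suspect chunk can be rebuilt by XORing the other chunks in the stripe, so a filesystem that knows (from its own checksum) which block is bad has exactly the information that the RAID layer lacks. A minimal Python illustration:

    # Minimal sketch of rebuilding one suspect chunk from the rest of a RAID-5
    # stripe.  'stripe' is a list of equal sized chunks (one per disk), the last
    # of which is the parity chunk.
    def rebuild_chunk(stripe, suspect):
        rebuilt = bytearray(len(stripe[0]))
        for i, chunk in enumerate(stripe):
            if i == suspect:
                continue                  # skip the chunk we don't trust
            for j, byte in enumerate(chunk):
                rebuilt[j] ^= byte        # XOR of all the other chunks (data + parity)
        return bytes(rebuilt)

    # Example: three data chunks plus parity; silently corrupt chunk 1, then rebuild it.
    data = [b'\x11' * 4, b'\x22' * 4, b'\x33' * 4]
    parity = bytes(a ^ b ^ c for a, b, c in zip(*data))
    stripe = data + [parity]
    stripe[1] = b'\x00' * 4
    assert rebuild_chunk(stripe, 1) == b'\x22' * 4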

NetApp produced an interesting guest-post on the StorageMojo blog [2]. One point that they make is that Nearline disks try harder to re-read corrupt data from the disk. This means that a bad sector error will result in longer timeouts, but hopefully the data will be returned eventually. This is good if you only have a single disk, but if you have a RAID array it’s often better to just return an error and allow the data to be retrieved quickly from another disk. NetApp also claim that “Given the realities of today’s drives (plus all the trends indicating what we can expect from electro-mechanical storage devices in the near future) – protecting online data only via RAID 5 today verges on professional malpractice”. It’s a strong claim, but they provide evidence to support it.

Another relevant issue is the size of the RAID device. Here is a post that describes the Unrecoverable Error Rate (UER) and how it can impact large RAID-5 arrays [3]. The implication is that the larger the array (in GB/TB) the greater the need for RAID-6. It has long been accepted that a larger number of disks in an array drives a greater need for RAID-6, but the idea that larger disks give a greater need for RAID-6 is a new one (to me at least).
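
The arithmetic behind that implication is worth seeing. Rebuilding a RAID-5 array after one disk fails requires reading every remaining disk in full, and (assuming independent errors and the commonly quoted UER of one error per 10^14 bits for consumer SATA disks – real drives vary) the chance of getting through a rebuild without hitting an unrecoverable error drops quickly as the disks get bigger:

    # Rough estimate of the chance that a RAID-5 rebuild hits an unrecoverable
    # read error.  Assumes independent errors and a UER of one error per 1e14
    # bits read (a commonly quoted figure for consumer SATA disks); real drives
    # and real failure modes vary.
    def rebuild_failure_probability(disks, disk_tb, uer_bits=1e14):
        bits_to_read = (disks - 1) * disk_tb * 1e12 * 8   # every surviving disk is read in full
        return 1 - (1 - 1 / uer_bits) ** bits_to_read

    for disks, tb in ((5, 0.25), (5, 1.0), (8, 1.0)):
        p = rebuild_failure_probability(disks, tb)
        print("%d disks of %.2fTB: ~%.0f%% chance of an error during rebuild" % (disks, tb, p * 100))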

Now I am strongly advising all my clients to use RAID-6. Currently the only servers that I run which don’t have RAID-6 are legacy servers (some of which can be upgraded to RAID-6 – HP hardware RAID is really good in this regard) and small servers with two disks in a RAID-1 array.

Future Video Games

I just watched an interesting TED.com talk about video games [1]. The talk focussed to a large degree on emotional involvement in games, so it seems likely that there will be many more virtual girlfriend services [2] (I’m not sure that “game” is the correct term for such things) in the future. The only reference I could find to a virtual boyfriend was a deleted Wikipedia page for V-Boy, but I expect that they will be developed soon enough. I wonder if such a service could be used by astronauts on long missions. An advantage of a virtual SO would be that there is no need to have a partner who is qualified, and if a human couple got divorced on the way to Mars then it could be a really long journey for everyone on the mission.

VR training has been used for a long time in the airline industry (if you have never visited an airline company and sat in a VR trainer for a heavy passenger jet then I strongly recommend that you do it). It seems that there are many other possible uses for this. The current prison system is widely regarded as a training ground for criminals: people who are sent to prison for minor crimes come out as hardened criminals. I wonder if a virtual environment for prisoners could do some good. Instead of prisoners having to deal with other prisoners they could deal with virtual characters who encourage normal social relationships, and prisoners who didn’t want to meet other prisoners could be given the option of spending their entire sentence in “solitary confinement” with virtual characters, multi-player games, and Internet access if they behave well. Game systems such as the Nintendo Wii [3] would result in prisoners getting adequate exercise, so after being released from a VR prison it seems likely that the ex-con would be fitter, healthier, and better able to fit into normal society than most parolees. Finally it seems likely that someone who gets used to spending most of their spare time playing computer games will be less likely to commit crimes.

It seems to me that the potential for the use of virtual environments in schools is very similar to that of prisons, for similar reasons.

Update: Currently Google Adsense is showing picture adverts on this page for the “Shaiya” game, the pictures are of a female character wearing a bikini with the caption “your goddess awaits”. This might be evidence to support my point about virtual girlfriends.

The Problem is Too Many Remote Controls

I am often asked for advice about purchasing TVs and consumer electronics. Not that I am any great expert in those areas, but my general experience in specifying and purchasing electronic goods related to my computer work does translate to other areas (and I know where to find advice on the net).

As part of this I get to closely observe what happens when people install new home entertainment systems. I have not observed anyone who uses anywhere near the full function of their system, even using the bare minimum functionality in all areas is very rare.

I believe that the first problem with this is the input devices. A remote control is designed around the idea of one button for each high-level command; this can be compared to languages such as Chinese and Japanese which have one character per word. In terms of language evolution it seems that the benefits of building words from a small set of characters became apparent and widely accepted thousands of years ago.

Now for a simple input device having one button for each high-level operation makes sense. For the basic functions of a VCR you have PLAY, STOP, Fast-Forward, Rewind, Record, and Eject – 6 buttons is quite reasonable. But then you want slightly advanced features such as “record from 8PM to 9PM on channel A and then record from 10PM to 11PM on channel B” and things become unreasonably difficult. More than 10 years ago I was recommending that people just buy 5 hour tapes and press record before leaving home – getting the full functionality out of a VCR was just too hard. Jokes are often made about people who leave their VCR flashing 12:00 because it’s too difficult to set the time; I only set the time on a VCR to stop the flashing (flashing lights annoy me).

Since programmable VCRs became popular things have only continued to develop. Now it’s not uncommon to have separate remote controls for the TV, VCR, DVD player, and the Cable TV box – a total of four remote controls! This is obviously a problem, and the solution that some people are adopting is to have a single more complex remote control – this is an example of problems not being solved by the same type of thinking that caused them.

One of the attempts to solve this is to have everything controlled by a PVR [1]. This means that you have one device recording content (maybe from multiple sources or multiple channels), playing recorded content or live content, and maybe playing DVDs and CDs. Of course then you have a complex remote control for that device which just shifts the problem.

To solve these problems we need to consider the entire problem space from a high-level. We start with the signal that is displayed by the TV, it can come from cable TV, digital TV, analogue TV, VCR, or DVD – these input sources have names such as “Composite”, “RCA1”, “RCA2”, “RCA3”, “DTV”, and “ATV”. Often people have written instructions near their TV to map these names to what they do.

Obviously the TV needs to be able to be programmed with human friendly names, and as these names are not of much use on their own it should be possible to use compound names and abbreviations. If I want to watch ABC (the Australian Government sponsored TV channel) then I would rather type “ABC” on a keyboard and then have the entertainment system determine whether that maps to “cable:abc”, “dtv:abc”, or “atv:abc” depending on what options are available. The current process of choosing an input source (such as RCA1 mapping to cable TV) and then choosing a channel (102 mapping to ABC) means among other things that it is essentially impossible for a guest to control the TV.
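
A sketch of the sort of mapping I mean (the channel table and the list of available sources are made-up examples):

    # Sketch of resolving a human friendly channel name to an input source and
    # channel number.  The channel table and the available sources are examples.
    CHANNELS = {
        "abc": [("dtv", 2), ("atv", 2), ("cable", 102)],
        "sbs": [("dtv", 3), ("cable", 104)],
    }
    AVAILABLE_SOURCES = ("cable", "dtv")     # whatever is actually connected

    def resolve(name):
        for source, channel in CHANNELS.get(name.lower(), []):
            if source in AVAILABLE_SOURCES:
                return source, channel
        raise KeyError("no available source carries %r" % name)

    print(resolve("ABC"))                    # ('dtv', 2) with the table above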

A further problem is the lack of available output. While it might seem that a large-screen HDTV has adequate output available, it’s often the case that you don’t want to stop watching one show while trying to find another. When something of wide interest is happening (another war or an election) it’s common for several people in one room to want to watch the news. Having everyone stop watching while someone goes through menus to find a different news channel is not well accepted. It seems to me that we need a separate output mechanism for controlling the entire system from that which is used for the main display.

This of course requires integration between all parts of the entertainment system, which shouldn’t be that difficult given the complexity of all the components (every one of which has more compute power than a typical server of 20 years ago). It is currently quite common for PVRs and DVD players to support playing videos from USB and SD devices, so the next logical step is to get an Ethernet port in each device (and maybe have Ethernet switches built in to some of the high-end entertainment hardware). Then XML transferred over HTTP could be used for controlling the different components from a single web server which provides a web-based GUI. While a random guest would not get much functionality out of my TV configuration (or that of most homes where I have assisted in the configuration of new hardware), they should be able to use a web-based GUI with ease.
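
As a sketch of what that could look like (the URL, the XML format, and the command vocabulary are all invented – a real system would need an agreed protocol, or something like UPnP):

    # Hypothetical sketch of sending a control command to a networked component.
    # The URL, the XML format, and the command names are made up for illustration.
    import urllib.request

    def send_command(device_url, device, action, **params):
        attrs = "".join(' %s="%s"' % (k, v) for k, v in params.items())
        xml = '<command device="%s" action="%s"%s/>' % (device, action, attrs)
        req = urllib.request.Request(device_url, data=xml.encode("utf-8"),
                                     headers={"Content-Type": "text/xml"})
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    # e.g. send_command("http://192.168.1.20/control", "dvd", "play", title="1")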

For controlling the entire system a NetBook [2] computer such as the EeePC should do well. As high-end TVs cost over $5,000 an EeePC which costs $500 (list price – surely less in bulk) could easily be bundled without much impact on the overall price, in the same way that companies selling computers in the $1,000,000 price range used to bundle $10,000 workstations to manage them. An EeePC has about the same size and mass as four remote controls (so it wouldn’t be inconvenient to use). Also the built-in wifi could be used for the remote control (wires are inconvenient and Infra-Red has limited bandwidth so probably wouldn’t provide a good web based GUI). Also someone who wanted to save some money could instead choose to use an old laptop on a coffee table (any web browser would do). I have deployed Linux desktop machines for some people who had no trouble using them, but who then had trouble using TVs that I configured for them – so I conclude that a modern Linux distribution is significantly more user friendly than the equipment that you find in a modern lounge room.

Cable TV companies all seem to be ISPs and often provide compelling value as an ISP service if you want to use the TV service. So it seems that the “Cable Modem” could be built in to the Cable TV box for added savings in hardware purchase and fewer boxes to manage in the home. This of course would increase the value of a NetBook [2] as a remote control as it could also be used for general Internet access at the same time. TV shows often promote their web sites to their customers and TV advertising also often includes a URL. If the URLs were distributed by VideoText then that would provide more information for the viewers, a better reach for advertisers and the people who create the TV shows, and when it became popular it would save us from those stupid scrolling news tickers that appear on the bottom of most cable news shows!

Some of these benefits are currently achieved by people running MythTV. The first problem with Myth is that it is still largely stuck to the bad old UI paradigm (which of course can be fixed as it’s free software). The next problem is that getting a result of comparable quality to the dedicated hardware is expensive, difficult, or both. A regular PC makes more noise than you desire in your lounge room and configuring a Myth machine for your choice of DTV hardware and the frequencies used in your location is a pain. You can buy a preconfigured machine which solves these problems, but it will be more expensive. For most lounge rooms the cheap Chinese hardware is what will be used.

Ultimately I believe that TV will be killed by the Internet. The range of content on the net is increasing, and the rate of increase is itself accelerating. TV seems to have made no significant changes since the introduction of cable and satellite TV (both of which happened a long time ago and were not that significant anyway). But I don’t expect it to happen soon; I predict another 10 years of the current TV business model. I believe that better integration of home entertainment hardware, so that it can obey simpler commands from the user in a more friendly manner while not getting in the way of the main purpose (displaying the TV picture), has the potential to extend the life of the current TV business model.

Not that I care whether the TV industry dies sooner or later. I just want to escape from having to provide phone-support to people who can’t get their TV to work correctly!

Laptop Computer Features

It’s not easy to choose a laptop, and part of the problem is that most people don’t seem to start from the use of the laptop. I believe that the following four categories cover the vast majority of the modern use of mobile computers.

  1. PDA [1] – can be held in one hand and generally uses a touch-screen. They generally do not resemble a laptop in shape or design.
  2. Subnotebook [2] AKA Netbook [3] – a very small laptop that is designed to be very portable. Typically weighing 1KG or less and having multiple design trade-offs to give light weight (such as a small screen and a slow CPU) while still being more powerful than a PDA. The EeePC [4] is a well known example.
  3. Laptop [5] – now apparently defined to mean a medium size portable computer, light enough to be carried around but with a big enough screen and keyboard that many people will be happy to use them for 8 hours a day. The word is also used to mean all portable computers that could possibly fit on someone’s lap.
  4. Desktop Replacement [6] – a big heavy laptop that is not carried much.

There is some disagreement about the exact number of categories and which category is most appropriate for each machine. There is a range of machines between the Subnotebook and Laptop categories. There is some amount of personal preference involved in determining which category a machine might fall in. For example I find a Thinkpad T series to fit into the “Laptop” category (and I expect that most people would agree with me). But when comparing the weight and height of an average 10yo child to an adult it seems that a 10yo would find an EeePC to be as much effort to carry as a T series Thinkpad is for an adult.

It seems to me that the first thing you need to do when choosing a laptop is to decide which of the above categories is most appropriate. While the boundaries between the categories are blurry and depend to some extent on personal preference, once you have made a firm decision on the category it’s an easy second step to determine which machines (in your opinion) fit it. It’s also possible to choose a half-way point; for example if you wanted something on the border of the “Laptop” and NetBook categories then a Thinkpad X series might do the job.

The next step of course is to determine which OSs and applications you want to run. There are some situations where the choice of OS and/or applications may force you to choose a category that has more powerful hardware (a CPU with more speed or features, more RAM, or more storage). For example a PDA generally won’t run a regular OS well (if at all) due to the limited options available for input devices and the very limited screen resolution. Even a NetBook has limitations as to what software runs well (for example many applications require a minimum resolution of 800×600 and don’t work well on an EeePC 701). Also Xen cannot be used on the low-end CPUs (which lack PAE) used in some NetBooks.
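
Checking for PAE is easy on a running Linux system, the flag is listed in /proc/cpuinfo; a minimal sketch:

    # Check /proc/cpuinfo for the "pae" flag (Linux only).
    def has_pae():
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return "pae" in line.split(":", 1)[1].split()
        return False

    print("PAE present" if has_pae() else "no PAE - Xen won't run on this CPU")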

Once you have chosen a category you have to look for features which make sense for that category. A major criterion for a PDA is how fast you can turn it on; it should be possible to go from standby to full use in less than one second. Another major criterion is how long the battery lasts, which should compare to a mobile phone (several days on standby and 8 hours of active use). A criterion that is important to some people is the ability to use both portrait and landscape views for different actions (I use portrait for editing and landscape for reading).

A NetBook is likely to be used in many places and needs to have a screen that will work well in adverse lighting conditions (a shiny reflective screen is a really bad thing), it also needs to be reasonably resilient as it is going to get bumped if it is transported a lot (a solid state disk is a good feature). It should also be as light as possible while having enough hardware to run a regular OS (an EeePC 701 with 512M of RAM and 4G of storage is about the minimum hardware for running a regular distribution of Linux).

A desktop replacement needs to have all the features: lots of RAM, a fast CPU and video hardware, and a big screen – it also needs a good keyboard (test by typing your name several times). The “Laptop” category is much the same as the desktop replacement, but a bit smaller, a lot lighter, and with better battery life.

It seems very difficult to give any specific advice as to which laptop to buy when the person who wants the advice has not chosen a category (which is often the case).

Noise in Computer Rooms

Some people think that you can recognise a good restaurant by the presence of obscure dishes on the menu or having high prices. The reality is that there are two ways of quickly identifying a good restaurant, one is the Michelin Guide [1] (or a comparable guide – if such a thing exists), the other is how quiet the restaurant is.

By a quiet restaurant I certainly don’t mean a restaurant with no customers (which may become very noisy once customers arrive). I mean a restaurant which when full will still be reasonably quiet. Making a restaurant quiet is not in itself a sufficient criterion for being a good restaurant – but it’s something that is usually done after the other criteria (such as hiring good staff and preparing a good menu) are met.

The first thing to do to make a room quiet is to have good carpet. Floor boards are easy to clean and the ratio of investment to lifetime is very good (particularly for hard wood), but they reflect sound and the movement of chairs and feet makes noise. A thick carpet with a good underlay is necessary to absorb sound. Booths are also good for containing sound if the walls extend above head height. Decorations on the walls such as curtains and thick wallpaper also absorb sound. A quiet environment allows people to talk at a normal volume which improves the dining experience.

It seems to me that the same benefits apply to server rooms and offices, with the benefit being more efficient work. I found it exciting when I first had my desk in a server room (surrounded by tens of millions of pounds worth of computer gear). But as I got older I found it less interesting to work in that type of environment just as I found it less interesting to have dinner in a noisy bar – and for the same reasons.

For a server room there is no escaping the fact that it will be noisy. But if the noise can be minimised then it will allow better communication between the people who are there and less distraction, which should result in higher quality of work – which matters if you want good uptime! One thing I have observed is that physically larger servers tend to make less noise relative to their volume and compute power. For example a 2RU server with four CPUs seems to always make less noise than two 1RU servers that each have two CPUs. I believe that this is because a fan with a larger diameter can operate at a lower rotational speed which results in less bearing noise, and the larger fans also give less turbulence. Using fewer servers via virtualisation obviously has the potential to reduce noise (both directly through fans and disks and indirectly through the cooling system for the server room [2]). A less obvious way of reducing noise is to swap two 1RU servers for one 2RU server – and my experience is that for machines in a similar price band, a 2RU server often has compute power (in terms of RAM and disk capacity) comparable to three or four 1RU servers.

To reduce noise both directly and indirectly it is a requirement to increase disk IO capacity (in terms of the number of random IOs per second) without increasing the number of spindles (disks). I just read an interesting Sun blog covering some concepts related to using Solid State Disks (SSDs) on ZFS for best performance [3]. It seems that using such techniques is one way of significantly increasing the IO capacity per server (and thus allowing more virtual servers on one physical machine) – it’s a pity that we currently don’t have access to ZFS or a similar filesystem for Linux servers (ZFS has license issues and the GPL alternatives are all in a beta state AFAIK). Another possibility that seems to have some potential is the use of NetApp Filers [4] for the main storage of virtual machines. A NetApp Filer gives a better ratio of IO requests per second to the number of spindles used than most storage array products due to the way they use NVRAM caching and their advanced filesystem features (which also incidentally gives some good options for backups and for detecting and correcting errors). So a set of 2RU servers that have the maximum amount of RAM installed and which use a NetApp Filer (or two if you want redundancy) for the storage with the greatest performance requirements should give the greatest density of virtual machines.

Blade servers also have potential to reduce noise in the server room. The most significant way that they do this is by reducing the number of power supplies, instead of having one PSU per server (or two if you want redundancy) you might have three or five PSUs for a blade enclosure that has 8 or more blades. HP blade enclosures support shutting down some PSUs when the blades are idling and don’t need much power (I don’t know whether blade enclosures from other vendors do this – I expect that some do).

A bigger problem however is the noise in offices where people work. It seems that the main cause of this is the cheap cubicles that are used in most offices (and almost all computer companies). More expensive cubicles that reach almost head-height (for someone who is standing) and which have a sound-absorbing cloth surface significantly improve the office environment, and separate offices are better still. One thing I would like to see is more use of shared desktop computers: it’s not difficult to set up a desktop machine with multiple video cards, so with appropriate software support (which is really difficult) you could have one desktop machine for two or even four users, which would save electricity and reduce noise.

Better quality carpet on the floors would also be a good thing. While office carpet wears out fast, adding some underlay would not increase the long-term cost (the underlay can remain as the top layer gets replaced).

Better windows in offices are necessary to provide a quiet working environment. The use of double-glazed windows with reflective plastic film significantly decreases the amount of heating and cooling that is required in the office. This would permit a lower speed of air flow for heating and cooling which means less noise. Also an office in a central city area will have a noise problem outside the building, again double (or even triple) glazed windows help a lot.

Some people seem to believe that an operations room should have no obstacles (one ops room where I once worked had all desks facing a set of large screens that displayed network statistics, and the desks were like school desks with no dividers), but I think that even for an ops room some effort should be made to reduce the ambient noise. If the room is generally reasonably quiet then it should be easy to shout the news of an outage so that everyone can hear it.

Let’s assume for the sake of discussion that a quieter working environment can increase productivity by 5% (I think this is a conservative assumption). For an office full of skilled people who are doing computer work the average salary may be about $70,000, and it’s widely regarded that to factor in the management costs etc you should double the salary – so the average cost of an employee would be about $140,000. If there are 50 people in the office then the work of those employees has a cost of $7,000,000 per annum. A 5% increase in that would be worth $350,000 per annum – you could buy a lot of windows for that!

Efficiency of Cooling Servers

One thing I had wondered was why home air-conditioning systems are more efficient than air-conditioning systems for server rooms. I received some advice on this matter from the manager of a small server room (which houses about 30 racks of very powerful and power hungry servers).

The first issue is terminology: the efficiency of a “chiller” is regarded as the number of Watts of heat energy removed divided by the number of Watts of electricity consumed by the chiller. For example with a 200% efficient cooling plant a 100W light bulb effectively costs 150W of electricity – 100W to run the bulb plus 50W for the cooling plant to remove the 100W of heat it produces.

For domestic cooling I believe that 300% is fairly common for modern “split systems” (it’s the specifications for the air-conditioning on my house and the other air-conditioners on display had similar ratings). For high-density server rooms with free air cooling I have been told that a typical efficiency range is between 80% and 110%! So it’s possible to use MORE electricity on cooling than on running the servers!
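
To put some numbers on that, using the efficiency figures above and a made-up 10kW server load:

    # Electricity needed to cool a given server load at various chiller
    # efficiencies, where efficiency = Watts of heat removed per Watt of
    # electricity used by the chiller.  The 10kW load is an example figure.
    def cooling_power(server_watts, efficiency):
        return server_watts / efficiency

    server_load = 10000.0
    for eff in (0.8, 1.1, 3.0):
        cool = cooling_power(server_load, eff)
        print("%.0f%% efficient chiller: %.1fkW for cooling, %.1fkW in total"
              % (eff * 100, cool / 1000, (server_load + cool) / 1000))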

One difficulty in cooling a server room is that the air often can’t flow freely (unlike a big open space such as the lounge room of your house). Another is the range of temperatures and the density of heat production in some parts (a 2RU server can dissipate 1000W of heat in a small space). These factors can be minimised by extracting hot air at the top and/or rear of racks and forcing cold air in the bottom and/or the front and by being very careful when planning where to place equipment. HP offers some services related to designing a server room to increase cooling efficiency, one of the services is using computational fluid dynamics to simulate the air-flow in the server-room [1]! CFD is difficult and expensive (the complete package from HP for a small server room costs more than some new cars), I believe that the fact that it is necessary for correct operation of some server rooms is an indication of the difficulty of the problem.

The most effective ways of cooling servers involve tight coupling of chillers and servers. This often means using chilled water or another liquid to extract the heat. Chilled water refrigeration systems largely remove the problem of being unable to extract the heat from the right places, but instead you have some inefficiency in pumping the water and the servers are fixed in place. I have not seen or heard of chilled water being used for 2RU servers (I’m not saying that it doesn’t get used or that it wouldn’t make sense – merely that I haven’t seen it). When installing smaller servers (2RU and below) there is often a desire to move them and attaching a chilled-water cooling system would make such a move more difficult and expensive. When a server weighs a ton or more then you aren’t going to move it in a hurry (big servers have to be mostly disassembled before the shell can be moved, and the shell might require the efforts of four men to move it). Another issue related to water cooling is the weight. Managing a moderate amount of water involves a lot of heavy pipes (a leak would be really bad) and the water itself can weigh a lot. A server room that is based around 20Kg servers might have some issues with the extra weight of water cooling (particularly the older rooms), but a server room designed for a single rack that weighs a ton can probably cope.

I have been told that the cooling systems for low density server rooms are typically as efficient as those used for houses, and may even be more efficient. I expect that when designing an air-conditioner the engineering trade-offs when designing for home use favor low purchase price. But someone who approves the purchase of an industrial cooling system will be more concerned about the overall cost of operations and will be prepared to spend some extra money up-front and recover it over the course of a few years. The fact that server rooms run 24*7 also gives more opportunity to recover the money spent on the purchase (my home A-C system runs for about 3 months a year for considerably less than 24 hours a day).

So it seems that the way to cool servers efficiently is to have low density server rooms (to the largest extent possible). One step towards this goal would be to have servers nearer the end users. For example having workgroup servers near the workgroup (instead of in the server room). Of course physical security of those servers would be more challenging – but if all the users have regular desktop PCs that can be easily 0wned then having the server for them in the same room probably doesn’t make things any worse. Modern tower servers are more powerful than rack mounted servers that were available a few years ago while also being very quiet. A typical rack-mounted server is not something you would want near your desk, but one of the quiet tower servers works quite well.

Variable Names

For a long time I have opposed single letter variable names. Often I see code which has a variable for a fixed purpose with a single letter name, EG “FILE *f;”. The problem with this is that unless you choose a letter such as ‘z’ which has a high scrabble score (and probably no relation to what your program is doing) the letter will occur in other variable names and in reserved words for the language in question. A significant part of the time spent coding is spent reading code, so even for programmers working on a project a useful amount of time can be saved by using variable names that can easily be found by a search program. Often it’s also necessary to read source code just to understand what a system does – code reading without any writing.

With most editors and file viewing tools, searching for a variable with a single character name in a function (or in a source file for a global variable) is going to be difficult. Automated searching is mostly useless; probably the best option is to have your editor highlight every instance and then visually scan for the ones that are surrounded by brackets, braces, parentheses, spaces, commas, or whatever other characters can’t be part of a variable name in the language in question – those are the real uses of the variable.

Of course if you have a syntax highlighting editor then it might parse enough of the language to avoid this. But the heavier editors are not always available. Often I edit code on the system where the crash occurs (it makes it easier to run a debugger). Installing one of the heavier editors is often not an option for such a task (the vim-full Debian/Lenny package for AMD64 has dependencies that involve 27M of package files to download and would take 100M of disk space to install – quite a lot to ask if you just want to edit a single source file). Incidentally I am interested in suggestions for the best combination of features and space in a vi clone (color syntax highlighting is a feature I desire).

But even if you have a fancy editor, there is still the issue of using tools such as less and grep to find uses of variables. Of course for some uses (such as a loop counter) there is little benefit in using grep.

Another issue to consider is the language. If you write in Perl then a search for \$i should work reasonably well.
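
For other languages the best you can usually do is a whole-word search (grep -w, or a \b regex); that finds the real uses of a single letter name, but you still have to wade through every loop in the file. A small illustration using Python’s re module on a C-like fragment:

    # A plain substring search for 'i' matches inside other identifiers and
    # keywords; a word-boundary search only matches the real uses of the variable.
    import re

    code = """
    for (i = 0; i < nitems; i++)
        if (items[i].id == id)
            insert(items[i]);
    """
    print(len(re.findall(r"i", code)))       # many false matches: if, id, items, insert, nitems
    print(len(re.findall(r"\bi\b", code)))   # only the five uses of the loop counter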

One of the greatest uses of single letter variable names is the ‘i’ and ‘j’ names for loop counters. In the early days of computing FORTRAN was the only compiled language suitable for scientific tasks, and it had no explicit way of declaring variables: if a variable name started with i, j, k, l, m, or n then it was known to be an integer. So i (the first of the implicit integer names) became the commonly used name for a loop counter. That habit has been passed on through the years, so now many people who have never heard of FORTRAN use i as the name for a loop counter and j as the name for the counter of the inner loop in nested loops. [I couldn’t find a good reference for FORTRAN history – I’ll update this post if someone can find one.]

But it seems to me that using idx, index, or even names such as item_count which reflect what is actually being counted might be more efficient overall. Searching for instances of i in a program is going to be difficult at the best of times, even without having multiple loops (in separate functions or in the same function) that use the same variable name.

So if there is to be a policy for variable names for counters, I think that it makes most sense to have multiple letters in all variable names to allow for easy grepping, and to have counter names which apply to what is being counted. Some effort to give different index names to different for/while loops would make sense too. Having two different for loops with a counter named index is going to make things more difficult for someone who reads the code. Of course there are situations where two loops should have the same variable, for example if one loop searches through an array to find a particular item and then the next loop goes backward through the array to perform some operation on all preceding items then it makes sense to use the same variable.
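
For example (a trivial Python sketch of the case described above):

    # The case where reusing one index across two loops makes sense: find an
    # item, then walk backwards over everything that precedes it.
    def show_predecessors(items, wanted):
        for idx in range(len(items)):
            if items[idx] == wanted:
                break
        else:
            return                        # not found, nothing to do
        for idx in range(idx - 1, -1, -1):
            print("predecessor:", items[idx])

    show_predecessors(["a", "b", "c", "d"], "c")   # prints b then a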

Label vs UUID vs Device

Someone asked on a mailing list about the issues related to whether to use a label, UUID, or device name for /etc/fstab.
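
For reference, the three styles of /etc/fstab entry look like this (the device name, label, and UUID are example values only – all three lines describe the same hypothetical filesystem):

    /dev/sda1                                  /boot  ext3  defaults  0  2
    LABEL=boot                                 /boot  ext3  defaults  0  2
    UUID=3e6be9de-8139-11d1-9106-a43f08d823a6  /boot  ext3  defaults  0  2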

The first thing to consider is where the names come from. The UUID is assigned automatically by mkfs or mkswap, so you have to discover it after the filesystem or swap space has been made (or note it during the mkfs/mkswap process). For the ext2/3 filesystems the command “tune2fs -l DEVICE” will display the UUID and label (strangely mke2fs uses the term “label” while the output of tune2fs uses the term “volume name“). For a swap space I don’t know of any tool that can extract the UUID and name. On Debian (Etch and Unstable) the file command does not display the UUID for swap spaces or ext2/3 filesystems and does not display the label for ext2/3 filesystems. After I complete this blog post I will file a bug report.
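
The UUID and label can also be read straight off the device at fixed offsets, which is what file/magic entries do anyway. A minimal Python sketch for ext2/3 and version 1 swap spaces (the offsets are from the published on-disk formats; treat it as an illustration rather than a tested tool):

    # Sketch: read the UUID and label straight from the on-disk headers.  The
    # ext2/3 superblock starts at offset 1024 (magic 0xEF53 at +56, UUID at +104,
    # volume name at +120); the v1 swap header keeps its UUID at offset 1036 and
    # its label at offset 1052, with the "SWAPSPACE2" signature at the end of the
    # first page.  Run it against the block device, not a mount point.
    import struct, uuid

    def ext_ids(device):
        with open(device, 'rb') as f:
            f.seek(1024)
            sb = f.read(136)
        if struct.unpack('<H', sb[56:58])[0] != 0xEF53:
            raise ValueError('%s is not an ext2/3 filesystem' % device)
        return uuid.UUID(bytes=sb[104:120]), sb[120:136].rstrip(b'\0').decode('ascii', 'replace')

    def swap_ids(device, page_size=4096):
        with open(device, 'rb') as f:
            header = f.read(page_size)
        if header[page_size - 10:page_size] != b'SWAPSPACE2':
            raise ValueError('%s is not a v1 swap space' % device)
        return uuid.UUID(bytes=header[1036:1052]), header[1052:1068].rstrip(b'\0').decode('ascii', 'replace')

    # e.g. print(ext_ids('/dev/sda1')); print(swap_ids('/dev/sda2'))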

If you are using a version of Debian later than Lenny (or a version of Unstable with this bug fixed) then you will be able to easily determine the label and UUID of a filesystem or swap space. Otherwise the inconvenience of determining the UUID and label is a reason for not using them in /etc/fstab (keep in mind that sys-admin work sometimes needs to be done at 3AM).

One problem with mounting by UUID or label is that it doesn’t work well with snapshots and block device backups. If you have a live filesystem on /dev/sdc and an image from a backup on /dev/sdd then there is a lot of potential for excitement when mounting by UUID or label. Snapshots can be made by a volume manager (such as LVM), a SAN, or an iSCSI server.

Another problem is that if a file-based backup is made (IE with tar or cpio) then you lose the UUID and label. tune2fs allows setting the UUID, but that seems like a potential recipe for disaster. So if mounting by UUID you would potentially need to change /etc/fstab after doing a full filesystem restore from a file-based backup; this is not impossible but might not be what you desire. Setting the label is not difficult, but it may be inconvenient.

When using old-style IDE disks the device names were of the form /dev/hda for the first disk on the first controller (cable) and /dev/hdd for the second disk on the second controller. This was quite unambiguous, adding an extra disk was never going to change the naming.

With SCSI disks the naming issue has always been more complex, and which device gets the name /dev/sda was determined by the order in which the SCSI HAs were discovered. So if a SCSI HA which had no disks attached suddenly had a disk installed then the naming of all the other disks would change on the next boot! To make things more exciting Fedora 9 is using the same naming scheme for IDE devices as for SCSI devices, I expect that other distributions will follow soon and then even with IDE disks permanent names will not be available.

In this situation the use of UUIDs or labels is required if you mount partitions directly. However a common trend is towards using LVM for all storage, and in that case LVM manages labels and UUIDs internally (with some excitement if you do a block device backup of an LVM PV). LV names such as /dev/vg0/root are then persistent and there is no need for mounting via UUID or label.

The most difficult problem then becomes the situation where a FC SAN has the ability to create snapshots and make them visible to the same machine. UUID or label based mounting won’t work unless you can change them when creating the snapshot (which is not impossible but is rather difficult when you use a Windows GUI to create snapshots on a FC SAN for use by Linux systems). I have had some interesting challenges with this in the past when using a FC based SAN with Linux blade servers, and I never devised a good solution.

When using iSCSI I expect that it would be possible to force an association between SCSI disk naming and names on the server, but I’ve never had time to test it out.

Update: I have submitted Debian bug #489865 with a suggested change to the magic database.

Below are /etc/magic entries for displaying the UUID and label on swap spaces and ext2/3 filesystems:
