Linux, politics, and other interesting things
It’s generally accepted that certain things need redundancy. RAID is generally regarded as essential for every server except for the corner case of compute clusters where a few nodes can go offline without affecting the results (e.g. the Google servers). Having redundant network cables with some sort of failover system between big switches is regarded as a good idea, and multiple links to the Internet are regarded as essential for every serious data-center and are gaining increasing acceptance in major corporate offices.
Determining whether you need redundancy for a particular part of the infrastructure is done on the basis of the cost of the redundant device (in terms of hardware and staff costs related to installing it), the cost of not having it available, and the extent to which the expected down-time will be reduced by having some redundancy.
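As a rough sketch of that reasoning, the comparison can be written as a few lines of Python. The function name, the parameters, and the figures in the example are all made up for illustration, and it assumes that the cost of down-time can be expressed as a flat hourly rate, which is a simplification.

```python
def redundancy_net_benefit(redundancy_cost, downtime_cost_per_hour,
                           expected_hours_without, expected_hours_with):
    """Return the expected saving from adding redundancy.

    redundancy_cost: hardware plus staff cost of installing the spare
    downtime_cost_per_hour: assumed flat cost of the service being down
    expected_hours_without / expected_hours_with: expected down-time over
        the period being considered, without and with the redundant device
    """
    hours_saved = expected_hours_without - expected_hours_with
    avoided_loss = hours_saved * downtime_cost_per_hour
    return avoided_loss - redundancy_cost


# Hypothetical figures: a 400 unit spare that cuts expected down-time
# from 8 hours to 1 hour, where each hour of down-time costs 150 units.
benefit = redundancy_net_benefit(400, 150, 8, 1)
print(benefit, "worth it" if benefit > 0 else "not worth it")
```

A positive result suggests the redundant device pays for itself over the period considered. The hard part in practice is estimating the expected down-time, not doing the arithmetic.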
It’s also regarded as a good idea to have more than one person with the knowledge of how to run the servers. Jokes are often made about what might happen if a critical person “fell under a bus”, but more mundane things such as the desire to take an occasional holiday or a broken mobile phone can require a backup person.
One thing that doesn’t seem to get any attention is redundancy in the machine used for system administration. I’ve been using an EeePC for supporting my clients, and it’s been working really well for me. Unfortunately I have misplaced the power supply, so I need to replace the machine (if only for the time taken to find the PSU). I have some old Toshiba Satellite laptops. They are quite light by laptop standards (but still heavier than the EeePC) and they only have 64M of RAM, but as a mobile SSH client they will do well. So my next task is to set up a Satellite as a backup machine for my network support work.
It seems that this problem is fairly widespread. I’ve worked in a few companies with reasonably large sysadmin teams. The best managed one had a support laptop that was assigned to the person who was on-call outside business hours. That laptop was not backed up (to the best of my knowledge it was never connected to the corporate LAN, so it seems that no-one had an opportunity to do so) and there was no second machine.
One thing I have been wondering is what happens to laptops with broken screens when the repair price exceeds the replacement cost. I wouldn’t mind buying an EeePC with a broken screen if it comes with a functional PSU, as I could use it as a portable server.