About 24 hours ago I rebooted the system that runs the secondary DNS for my zone and a few other zones. I’d upgraded a few things and the system had been running for almost 200 days without a reboot so it was time for it. Unfortunately it didn’t come back up.
Even more unfortunately the other DNS server for my zone is ns.sws.net.au, which is also the only other server for the sws.net.au zone. Normally this works because the servers for the net.au zone have a glue record containing the server’s IP address, so when asked for the NS records for the sws.net.au domain the reply includes the IP address of ns.sws.net.au. The unfortunate part was that the glue record still had the old IP address from before the sws.net.au servers changed to a new IP address range. I wonder whether this was due to the recovery process after the Distribute IT hack, as changing a glue record is not something that I or the other guy who runs that network would forget. But it is possible that we both stuffed up.
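A stale glue record like this can be caught by comparing what the parent zone hands out with what the zone itself publishes. Here is a minimal sketch of that comparison in Python. The function name `find_stale_glue` is hypothetical and the addresses are placeholders from the documentation ranges, not the real ones. It assumes the two sets of addresses have already been collected by hand, for example with dig against a net.au server and against the zone's own server.

```python
from ipaddress import ip_address


def _normalise(host):
    # DNS names are case-insensitive and may carry a trailing dot.
    return host.lower().rstrip(".")


def find_stale_glue(glue, authoritative):
    """Return {hostname: (glue_ips, zone_ips)} for each name server whose
    glue addresses differ from the addresses its own zone publishes.

    Both arguments map a name server hostname to a list of IP strings.
    """
    auth = {_normalise(h): ips for h, ips in authoritative.items()}
    stale = {}
    for host, glue_ips in glue.items():
        key = _normalise(host)
        # Parse the addresses so equivalent spellings compare equal.
        have = {ip_address(ip) for ip in glue_ips}
        want = {ip_address(ip) for ip in auth.get(key, ())}
        if have != want:
            stale[key] = (sorted(map(str, have)), sorted(map(str, want)))
    return stale


# Placeholder data: the parent still serves the old address.
glue = {"ns.sws.net.au.": ["192.0.2.53"]}
authoritative = {"ns.sws.net.au": ["198.51.100.53"]}
print(find_stale_glue(glue, authoritative))
```

An empty result means the glue matches. Anything else lists the name server along with the address the parent is giving out and the address the zone says it should be.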
The DNS secondary was an IBM P3-1GHz desktop system with two IDE disks in a RAID-1 array. It’s been quite reliable: it has been running in the same hardware configuration for about four years now with only one disk replacement. It turned out that the cooling fan in the front of the case had seized up due to a lot of dirt and the BIOS wouldn’t let the system boot in that state. Also one of the disks was reporting serious SMART problems and needed to be replaced – poor cooling tends to cause disk errors.
It seems that Compaq systems are good at informing the user of SMART problems. Two different Compaq desktop systems (one from before the HP buyout and one from after) made very forceful recommendations that I replace the disk. It’s a pity that the BIOS doesn’t allow a normal boot process after the warning, as following the recommendation to back up the data is difficult when the system won’t boot.
I have a temporary server running now, but my plan is to install a P3-866 system and use a 5400rpm disk to replace the 7200rpm disk that’s currently in the second position in the RAID array. I’ve done some tests on power use and an old P3 system uses a lot less power than most new systems. Power use directly maps to heat dissipation, and a full size desktop system with big fans that dissipates less than 50W is more likely to survive a poorly cooled room in summer. Laptops dissipate less heat but as their vents are smaller (thus less effective at the best of times and more likely to get blocked) this doesn’t provide a great benefit. Also my past experience of laptops as servers is that they don’t want to boot up when the lid is closed, and getting RAID-1 and multiple ethernet ports on a laptop is difficult.
Finally I am going to create a third DNS server for the sws.net.au domain. While it is more pain to run extra servers, for some zones it’s just worth it.