Forking Mon and DKIM with Mailing Lists

I have forked the “Mon” network/server monitoring system. Here is a link to the new project page [1]. There hasn’t been an upstream release since 2010 and I think we need more frequent releases than that. I plan to merge as many useful monitoring scripts as possible and support them well. All Perl scripts will use strict and use other best practices.

The first release of etbe-mon is essentially the same as the last release of the mon package in Debian. This is because I started work on the Debian package (almost all the systems I want to monitor run Debian) and as I had been accepted as a co-maintainer of the Debian package I put all my patches into Debian.

It’s probably not a common practice for someone to fork upstream of a package soon after becoming a co-maintainer of the Debian package. But I believe that this is in the best interests of the users. I presume that there are other collections of patches out there and I hope to merge them so that everyone can get the benefits of features and bug fixes that have been separate due to a lack of upstream releases.

Last time I checked mon wasn’t in Fedora. I believe that mon has some unique features for simple monitoring that would be of benefit to Fedora users and would like to work with anyone who wants to maintain the package for Fedora. I am also interested in working with any other distributions of Linux and with non-Linux systems.

While setting up the mailing list for etbemon I wrote an article about DKIM and mailing lists (primarily Mailman) [2]. This explains how to set up Mailman for correct operation with DKIM and also why that seems to be the only viable option.

More KVM Modules Configuration

Last year I blogged about blacklisting a video driver so that KVM virtual machines didn’t go into graphics mode [1]. Now I’ve been working on some other things to make virtual machines run better.

I use the same initramfs for the physical hardware as for the virtual machines. So I need to remove modules that are needed for booting the physical hardware from the VMs as well as other modules that get dragged in by systemd and other things. One significant saving from this is that I use BTRFS for the physical machine and the BTRFS driver takes 1M of RAM!

The first thing I did to reduce the number of modules was to edit /etc/initramfs-tools/initramfs.conf and change “MODULES=most” to “MODULES=dep”. This significantly reduced the number of modules loaded and also stopped the initramfs from probing for a non-existent floppy drive, which added about 20 seconds to the boot. Note that this will result in your initramfs not supporting different hardware. So if you plan to take a hard drive out of your desktop PC and install it in another PC this could be bad for you, but for servers it’s OK as that sort of upgrade is uncommon for servers and only done with some planning (such as creating an initramfs just for the migration).

I put the following rmmod commands in /etc/rc.local to remove modules that are automatically loaded:
rmmod btrfs
rmmod evdev
rmmod lrw
rmmod glue_helper
rmmod ablk_helper
rmmod aes_x86_64
rmmod ecb
rmmod xor
rmmod raid6_pq
rmmod cryptd
rmmod gf128mul
rmmod ata_generic
rmmod ata_piix
rmmod i2c_piix4
rmmod libata
rmmod scsi_mod
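
Not every module in that list will be loaded on every VM, and rmmod reports an error for a module that isn’t there. The following is a minimal Python sketch (not the script from this post) that works out which of the listed modules are currently loaded. It assumes the usual /proc/modules layout where the module name is the first field on each line; the function names are made up for illustration.

```python
# Hypothetical helper: which modules from the rmmod list above are loaded?
# Assumes /proc/modules has one module per line with the name as first field.

WANTED = [
    "btrfs", "evdev", "lrw", "glue_helper", "ablk_helper", "aes_x86_64",
    "ecb", "xor", "raid6_pq", "cryptd", "gf128mul", "ata_generic",
    "ata_piix", "i2c_piix4", "libata", "scsi_mod",
]


def loaded_names(proc_modules_text):
    """Return the set of module names found in /proc/modules content."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def still_loaded(proc_modules_text, wanted=WANTED):
    """Return the wanted modules that are loaded, keeping the list order."""
    loaded = loaded_names(proc_modules_text)
    return [name for name in wanted if name in loaded]
```

Passing the contents of /proc/modules to still_loaded() gives the names that are worth passing to rmmod, in the same order as the list above.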

In /etc/modprobe.d/blacklist.conf I have the following lines to stop drivers being loaded. The first line is to stop the video mode being set and the rest are just to save space. One thing that inspired me to do this is that the parallel port driver gave a kernel error when it loaded and tried to access non-existent hardware.
blacklist bochs_drm
blacklist joydev
blacklist ppdev
blacklist sg
blacklist psmouse
blacklist pcspkr
blacklist sr_mod
blacklist acpi_cpufreq
blacklist cdrom
blacklist tpm
blacklist tpm_tis
blacklist floppy
blacklist parport_pc
blacklist serio_raw
blacklist button

On the physical machine I have the following in /etc/modprobe.d/blacklist.conf. Most of this is to prevent loading of filesystem drivers when making an initramfs. I do this because I know there’s never going to be any need for CDs, parallel devices, graphics, or strange block devices in a server room. I wouldn’t do any of this for a desktop workstation or laptop.
blacklist ppdev
blacklist parport_pc
blacklist cdrom
blacklist sr_mod
blacklist nouveau

blacklist ufs
blacklist qnx4
blacklist hfsplus
blacklist hfs
blacklist minix
blacklist ntfs
blacklist jfs
blacklist xfs

SE Linux in Debian/Stretch

Debian/Stretch has been frozen. Before the freeze I got almost all the bugs in policy fixed, both bugs reported in the Debian BTS and bugs that I know about. This is going to be one of the best Debian releases for SE Linux ever.

Systemd with SE Linux is working nicely. The support isn’t as good as I would like, as there is still work to be done for systemd-nspawn. But it’s close enough that anyone who needs to use it can use audit2allow to generate the extra rules needed. Systemd-nspawn is not used by default and it’s not something that a new Linux user is going to use. I think that expert users who are capable of using such features are capable of doing the extra work to get them going.

In terms of systemd-nspawn and some other rough edges, the issue is the difference between writing policy for a single system vs writing policy that works for everyone. If you write policy for your own system you can allow access for a corner case without a lot of effort. But if I wrote policy to allow access for every corner case then they might add up to a combination that can be exploited. I don’t recommend blindly adding the output of audit2allow to your local policy (be particularly wary of access to shadow_t and write access to etc_t, lib_t, etc). But OTOH if you have a system that’s running in enforcing mode that happens to have one daemon with more access than is ideal then all the other daemons will still be restricted.

As for previous releases I plan to keep releasing updates to policy packages in my own apt repository. I’m also considering releasing policy source to updates that can be applied on existing Stretch systems. So if you want to run the official Debian packages but need updates that came after Stretch then you can get them. Suggestions on how to distribute such policy source are welcome.

Please enjoy SE Linux on Stretch. It’s too late for most bug reports regarding Stretch as most of them won’t be sufficiently important to justify a Stretch update. The vast majority of SE Linux policy bugs are issues of denying wanted access not permitting unwanted access (so not a security issue) and can be easily fixed by local configuration, so it’s really difficult to make a case for an update to Stable. But feel free to send bug reports for Buster (Stretch+1).

Video Mode and KVM

I recently changed my KVM servers to use the kernel command-line parameter nomodeset for the virtual machine kernels so that they don’t try to go into graphics mode. I do this because I don’t have X11 or VNC enabled and I want a text console to use with the -curses option of KVM. Without nomodeset, KVM just says that it’s in 1024*768 graphics mode and doesn’t display the text.

Now my KVM server running Debian/Unstable has had its virtual machines start going into graphics mode in spite of the nomodeset parameter. It seems that an update to QEMU has added a new virtual display driver which recent kernels from Debian/Unstable support with the bochs_drm driver, and that driver apparently doesn’t respect nomodeset.

The solution is to create a file named /etc/modprobe.d/blacklist.conf with the contents “blacklist bochs_drm” and now my virtual machines have a usable plain-text console again! This blacklist method works for all video drivers, so you can blacklist similar modules for the other virtual display hardware. But it would be nice if the one kernel option would cover them all.

Is a Thinkpad Still Like a Rolls-Royce

For a long time the Thinkpad has been widely regarded as the “Rolls-Royce of laptops”. Since 2003 one could argue that Rolls-Royce is no longer the Rolls-Royce of cars [1]. The way that IBM sold the Think business unit to Lenovo and the way that Lenovo is producing both Thinkpads and cheaper Ideapads is somewhat similar to the way the Rolls-Royce trademark and car company were separately sold to companies that are known for making cheaper cars.

Sam Varghese has written about his experience with Thinkpads and how he thinks it’s no longer the Rolls-Royce of laptops [2]. Sam makes some reasonable points to support this claim (one of which only applies to touchpad users – not people like me who prefer the Trackpoint), but I think that the real issue is whether it’s desirable to have a laptop that could be compared to a Rolls-Royce nowadays.

Support

The Rolls-Royce car company is known for great reliability and support as well as features that other cars lack (mostly luxury features). The Thinkpad marque (both before and after it was sold to Lenovo) was also known for great support. You could take a Thinkpad to any service center anywhere in the world and if the serial number indicated that it was within the warranty period it would be repaired without any need for paperwork. The Thinkpad service centers never had any issue with repairing a Thinkpad that lacked a hard drive just as long as the problem could be demonstrated. It was also possible to purchase an extended support contract at any time which covered all repairs including motherboard replacement. I know that not everyone had as good an experience as I had with Thinkpad support, but I’ve been using them since 1998 without problems – which is more than I can say for most hardware.

Do we really need great reliability from laptops nowadays? When I first got a laptop hardly anyone I knew owned one. Nowadays laptops are common. Having a copy of important documents on a USB stick is often a good substitute for a reliable laptop: when you are in an environment where most people own laptops it’s usually not difficult to find someone who will let you use theirs for a while. I think that there is a place for a laptop with RAID-1 and ECC RAM. It’s a little known fact that Thinkpads have a long history of supporting the replacement of a CD/DVD drive with a second hard drive (I don’t know if this is still supported), but AFAIK they have never supported ECC RAM.

My first Thinkpad cost $3,800. In modern money that would be something like $7,000 or more. For that price you really want something that’s well supported to protect the valuable asset. Sam complains about his new Thinkpad costing more than $1,000 and needing to be replaced after 2.5 years. Mobile phones start at about $600 for the more desirable models (IE anything that runs Pokemon Go) and the new Google Pixel phones range from $1,079 to $1,419. Phones aren’t really expected to be used for more than 2.5 years. Phones are usually impractical to service in any way so for most of the people who read my blog (who tend to buy the more expensive hardware) they are pretty much a disposable item costing $600+. I previously wrote about a failed Nexus 5 and the financial calculations for self-insuring an expensive phone [3]. I think there’s no way that a company can provide extended support/warranty while making a profit and offering a deal that’s good value to customers who can afford to self-insure. The same applies for the $499 Lenovo Ideapad 310 and other cheaper Lenovo products. Thinkpads (the higher end of the Lenovo laptop range) are slightly more expensive than the most expensive phones but they also offer more potential for the user to service them.

Features

My first Thinkpad was quite underpowered when compared to desktop PCs: it had 32M of RAM and could only be expanded to 96M at a time when desktop PCs could be expanded to 128M easily and 256M with some expense. It had an 800*600 display when my desktop display was 1280*1024 (37% of the pixels). Nowadays laptops usually start at about 8G of RAM (with a small minority that have 4G) and laptop displays start at about 1366*768 resolution (51% of the pixels in a FullHD display). That compares well to desktop systems and also is capable of running most things well. My current Thinkpad is a T420 with 8G of RAM and a 1600*900 display (69% of FullHD). It would be nice to have higher resolution but this works well and it was going cheap when I needed a new laptop.
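
The percentages above are just ratios of pixel counts. As a quick check, here is a small Python sketch; the only figure not taken from the paragraph is the FullHD resolution of 1920*1080, and the function name is made up for illustration.

```python
# Pixel counts for the display resolutions mentioned above.

def pixel_ratio(width, height, ref_width, ref_height):
    """Return the pixel count of one display as a percentage of another."""
    return 100.0 * (width * height) / (ref_width * ref_height)


FULL_HD = (1920, 1080)

print(round(pixel_ratio(800, 600, 1280, 1024)))  # first Thinkpad vs desktop: 37
print(round(pixel_ratio(1366, 768, *FULL_HD)))   # entry-level laptop vs FullHD: 51
print(round(pixel_ratio(1600, 900, *FULL_HD)))   # T420 vs FullHD: 69
```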

Modern Thinkpads don’t have some of the significant features that older ones had. The legendary Butterfly Keyboard is long gone, killed by the wide displays that economies of scale and 16:9 movies have forced upon us. It’s been a long time since Thinkpads had some of the highest resolution displays and since anyone really cared about it (you only need pixels to be small enough that you can’t see them).

For me one of the noteworthy features of the Thinkpads has been the great keyboard. Mechanical keys that feel like a desktop keyboard. It seems that most Thinkpads are getting the rubbery keyboard design made popular by Apple. I guess this is due to engineering factors in designing thin laptops and the fact that most users don’t care.

Matthew Garrett has blogged about the issue of Thinkpad storage configured as “RAID mode” without any option to disable it [4]. This is an annoyance (which incidentally has been worked around) and there are probably other annoyances like it. Designing hardware and an OS are both complex tasks. The interaction between Windows and the hardware is difficult to get right from both sides and the people who design the hardware often don’t think much about Linux support. It has always been this way, the early Thinkpads had no Linux support for special IBM features (like fan control) and support for ISA-PnP was patchy. It is disappointing that Lenovo doesn’t put a little extra effort into making sure that Linux works well on their hardware and this might be a reason for considering another brand.

Service Life

I bought my current Thinkpad T420 in October 2013 [5]. It’s more than 3 years old and has no problems even though I bought it refurbished with a reduced warranty. This is probably the longest I’ve had a Thinkpad working well, which seems to be a data point against the case that modern Thinkpads aren’t as good.

I bought a T61 in February 2010 [6], it started working again (after mysteriously not working for a month in late 2013) and apart from the battery lasting 5 minutes and a CPU cooling problem it still works well. If that Thinkpad had cost $3,800 then I would have got it repaired, but as it cost $796 (plus the cost of a RAM upgrade) and a better one was available for $300 it wasn’t worth repairing.

In the period 1998 to 2010 I bought a 385XD, a 600E, a T21, a T43, and a T61 [6]. During that time I upgraded laptops 4 times in 12 years (I don’t have good records of when I bought each one). So my average Thinkpad has lasted 3 years. The first 2 were replaced to get better performance, the 3rd was replaced when an employer assigned me a Thinkpad (and sold it to me when I left), and 4 and 5 were replaced due to hardware problems that could not be fixed economically given the low cost of replacement.

Conclusion

Thinkpads possibly don’t have the benefits over other brands that they used to have. But in terms of providing value for the users it seems that they are much better than they used to be. Until I wrote this post I didn’t realise that I’ve broken a personal record for owning a laptop. It just keeps working and I hadn’t even bothered looking into the issue. For some devices I track how long I’ve owned them while thinking “can I justify replacing it yet”, but the T420 just does everything I want. The battery still lasts 2+ hours which is a new record too, with every other Thinkpad I’ve owned the battery life has dropped to well under an hour within a year of purchase.

If I replaced this Thinkpad T420 now it would have cost me less than $100 per year (or $140 per year including the new SSD I installed this year), which is about 3 times better than any previous laptop! I wouldn’t feel bad about replacing it as I’ve definitely got great value for money from it. But I won’t replace it as it’s doing everything I want.

I’ve just realised that by every measure (price, reliability, and ability to run all software I want to run) I’ve got the best Thinkpad I’ve ever had. Maybe it’s not like a Rolls-Royce, but I’d much rather drive a 2016 Tesla than a 1980 Rolls-Royce anyway.

Another Broken Nexus 5

In late 2013 I bought a Nexus 5 for my wife [1]. It’s a good phone and I generally have no complaints about the way it works. In the middle of 2016 I had to make a warranty claim when the original Nexus 5 stopped working [2]. Google’s warranty support was ok, the call-back was good but unfortunately there was some confusion which delayed replacement.

Once the confusion about the IMEI was resolved the warranty replacement method was to bill my credit card for a replacement phone and reverse the charge if/when they got the original phone back and found it to have a defect covered by warranty. This policy meant that I got a new phone sooner as they didn’t need to get the old phone first. This is a huge benefit for defects that don’t make the phone unusable as you will never be without a phone. Also if the user determines that the breakage was their fault they can just refrain from sending in the old phone.

Today my wife’s latest Nexus 5 developed a problem. It turned itself off and went into a reboot loop when connected to the charger. Also one of the clips on the rear case had popped out and other clips popped out when I pushed it back in. It appears (without opening the phone) that the battery may have grown larger (which is a common symptom of battery related problems). The phone is slightly less than 3 years old, so if I had got the extended warranty then I would have got a replacement.

Now I’m about to buy a Nexus 6P (because the Pixel is ridiculously expensive) which is $700 including postage. Kogan offers me a 3 year warranty for an extra $108. Obviously in retrospect spending an extra $100 would have been a benefit for the Nexus 5. But the first question is whether the new phone is going to have a probability greater than 1/7 of failing due to something other than user error in years 2 and 3? For an extended warranty to provide any benefit the phone has to have a problem that doesn’t occur in the first year (or a problem in a replacement phone after the first phone was replaced). The phone also has to not be lost, stolen, or dropped in a pool by its owner. While my wife and I have a good record of not losing or breaking phones the probability of it happening isn’t zero.

The Nexus 5 that just died can be replaced for 2/3 of the original price. The value of the old Nexus 5 to me is less than 2/3 of the original price as buying a newer better phone is the option I want. The value of an old phone to me decreases faster than the replacement cost because I don’t want to buy an old phone.

For an extended warranty to be a good deal for me I think it would have to cost significantly less than 1/10 of the purchase price due to the low probability of failure in that time period and the decreasing value of a replacement outdated phone. So even though my last choice to skip an extended warranty ended up not paying out I expect that overall I will be financially ahead if I keep self-insuring, and I’m sure that I have already saved money by self-insuring all my previous devices.
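
The break-even reasoning can be written out as a short Python sketch. The $108 and $700 figures come from the post; the function names are made up, and valuing a replacement at 2/3 of the purchase price is only an illustrative assumption based on the Nexus 5 replacement cost mentioned above.

```python
from fractions import Fraction


def break_even_probability(warranty_cost, replacement_value):
    """Failure probability at which the warranty cost equals the expected payout."""
    return Fraction(warranty_cost) / Fraction(replacement_value)


def expected_gain(warranty_cost, replacement_value, failure_probability):
    """Expected payout minus the warranty price (positive favours buying)."""
    return failure_probability * replacement_value - warranty_cost


# $108 warranty on a $700 phone: about 0.154, close to the 1/7 in the post.
print(float(break_even_probability(108, 700)))

# Assumption: the replacement is only worth 2/3 of the purchase price to the
# owner, which pushes the break-even probability up to about 0.23.
print(float(break_even_probability(108, 700 * Fraction(2, 3))))
```

With a failure probability of 1/10 in years 2 and 3, expected_gain(108, 700, Fraction(1, 10)) comes to -38, so under that assumption self-insuring is ahead by about $38 per phone.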

Improving Memory

I’ve just attended a lecture about improving memory, mostly about mnemonic techniques. I’m not against learning techniques to improve memory and I think it’s good to teach kids a variety of things many of which won’t be needed when they are younger as you never know which kids will need various skills. But I disagree with the assertion that we are losing valuable skills due to “digital amnesia”.

Nowadays we have programs to check spelling so we can avoid the effort of remembering to spell difficult words like mnemonic, calendar apps on our phones that link to addresses and phone numbers, and the ability to Google the world’s knowledge from the bathroom. So the question is, what do we need to remember?

For remembering phone numbers it seems that all we need is to remember numbers that we might call in the event of a mobile phone being lost or running out of battery charge. That would be a close friend or relative and maybe a taxi company (and 13CABS isn’t difficult to remember).

Remembering addresses (street numbers etc) doesn’t seem very useful in any situation. Remembering the way to get to a place is useful and it seems to me that the way the navigation programs operate works against this. To remember a route you would want to travel the same way on multiple occasions and use a relatively simple route. The way that Google maps tends to give the more confusing routes (IE routes varying by the day and routes which take all shortcuts) works against this.

I think that spending time improving memory skills is useful, but it will either take time away from learning other skills that are more useful to most people nowadays or take time away from leisure activities. If improving memory skills is fun for you then it’s probably better than most hobbies (it’s cheap and provides some minor benefits in life).

When I was in primary school it was considered important to make kids memorise their “times tables”. I’m sure that memorising the multiplication of all numbers less than 13 is useful to some people, but I never felt a need to do it. When I was young I could multiply any pair of 2 digit numbers as quickly as most kids could remember the result. The big difference was that most kids needed a calculator to multiply any number by 13 which is a significant disadvantage.

What We Must Memorise

Nowadays the biggest memory issue is with passwords (the Correct Horse Battery Staple XKCD comic is worth reading [1]). Teaching mnemonic techniques for the purpose of memorising passwords would probably be a good idea – and would probably get more interest from the audience.

One interesting corner-case of passwords is ATM PIN numbers. The Wikipedia page about PIN numbers states that 4-12 digits can be used for PINs [2]. The 4 digit PIN was initially chosen because John Adrian Shepherd-Barron (who is credited with inventing the ATM) was convinced by his wife that 6 digits would be too difficult to memorise. The fact that hardly any banks outside Switzerland use more than 4 digits suggests that Mrs Shepherd-Barron had a point. The fact that this was decided in the 60’s proves that it’s not “digital amnesia”.

We also have to memorise how to use various supposedly user-friendly programs. If you observe an iPhone or Mac being used by someone who hasn’t used one before it becomes obvious that they really aren’t so user friendly and users need to memorise many operations. This is not a criticism of Apple, some tasks are inherently complex and require some complexity of the user interface. The limitations of the basic UI facilities become more obvious when there are operations like palm-swiping the screen for a screen-shot and a double-tap plus drag for a 1 finger zoom on Android.

What else do we need to memorise?

10 Years of Glasses

10 years ago I first blogged about getting glasses [1]. I’ve just ordered my 4th pair of glasses. When you buy new glasses the first step is to scan your old glasses to use that as a base point for assessing your eyes, instead of going in cold and trying lots of different lenses they can just try small variations on your current glasses. Any good optometrist will give you a print-out of the specs of your old glasses and your new prescription after you buy glasses, they may be hesitant to do so if you don’t buy because some people get a prescription at an optometrist and then buy cheap glasses online. Here are the specs of my new glasses, the ones I’m wearing now that are about 4 years old, and the ones before that which are probably about 8 years old:

Field    New      4 Years Old    Really Old
R-SPH    0.00     0.00           -0.25
R-CYL    -1.50    -1.50          -1.50
R-AXS    180      179            180
L-SPH    0.00     -0.25          -0.25
L-CYL    -1.00    -1.00          -1.00
L-AXS    5        10             179

The Specsavers website has a good description of what this means [2]. In summary SPH is whether you are long-sighted (positive) or short-sighted (negative). CYL is for astigmatism which is where the focal lengths for horizontal and vertical aren’t equal. AXS is the angle for astigmatism. There are other fields which you can read about on the Specsavers page, but they aren’t relevant for me.

The first thing I learned when I looked at these numbers is that until recently I was apparently slightly short-sighted. In a way this isn’t a great surprise given that I spend so much time doing computer work and very little time focusing on things further away. What is a surprise is that I don’t recall optometrists mentioning it to me. Apparently it’s common to become more long-sighted as you get older so being slightly short-sighted when you are young is probably a good thing.

Astigmatism is the reason why I wear glasses (the Wikipedia page has a very good explanation of this [3]). For the configuration of my web browser and GUI (which I believe to be default in terms of fonts for Debian/Unstable running KDE and Google-Chrome on a Thinkpad T420 with 1600×900 screen) I can read my blog posts very clearly while wearing glasses. Without glasses I can read it with my left eye but it is fuzzy, and with my right eye reading it is like reading the last line of an eye test, something I can do if I concentrate a lot for test purposes but would never do by choice. If I turn my glasses 90 degrees (so that they make my vision worse not better) then my ability to read the text with my left eye is worse than my right eye without glasses. This is as expected as the 1.00 level of astigmatism in my left eye is doubled when I use the lens in my glasses at 90 degrees to its intended angle.

The AXS numbers are for the angle of astigmatism. I don’t know why some of them are listed as 180 degrees or why that would be different from 0 degrees (if I turn my glasses so that one lens is rotated 180 degrees it works in exactly the same way). The numbers from 179 degrees to 5 degrees may be just a measurement error.
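
One way to look at the axis numbers is that an axis is an orientation rather than a direction, so it repeats every 180 degrees (which matches the observation about rotating a lens by 180 degrees). As far as I know the convention is to write the horizontal axis as 180 rather than 0. A small Python sketch of the comparison, with a made-up function name:

```python
def axis_difference(a, b):
    """Smallest angle in degrees between two cylinder axes (period 180)."""
    d = abs(a - b) % 180
    return min(d, 180 - d)


print(axis_difference(180, 0))   # 0: the same orientation
print(axis_difference(179, 5))   # 6: 179 and 5 are close together
print(axis_difference(10, 5))    # 5
```

So the change in L-AXS from 179 to 10 to 5 is a matter of a few degrees in each step, not a swing across the whole range.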

Hostile Web Sites

I was asked whether it would be safe to open a link in a spam message with wget. So here are some thoughts about wget security and web browser security in general.

Wget Overview

Some spam messages are designed to attack the recipient’s computer. They can exploit bugs in the MUA, applications that may be launched to process attachments (EG MS Office), or a web browser. Wget is a very simple command-line program to download web pages, it doesn’t attempt to interpret or display them.

As with any network facing software there is a possibility of exploitable bugs in wget. It is theoretically possible for an attacker to have a web server that detects the client and has attacks for multiple HTTP clients including wget.

In practice wget is a very simple program and simplicity makes security easier. A large portion of security flaws in web browsers are related to plugins such as flash, rendering the page for display on a GUI system, and javascript – features that wget lacks.

The Profit Motive

An attacker that aims to compromise online banking accounts probably isn’t going to bother developing or buying an exploit against wget. The number of potential victims is extremely low and the potential revenue benefit from improving attacks against other web browsers is going to be a lot larger than developing an attack on the small number of people who use wget. In fact the potential revenue increase of targeting the most common Linux web browsers (Iceweasel and Chromium) might still be lower than that of targeting Mac users.

However if the attacker doesn’t have a profit motive then this may not apply. There are people and organisations who have deliberately attacked sysadmins to gain access to servers (here is an article by Bruce Schneier about the attack on Hacking Team [1]). It is plausible that someone who is targeting a sysadmin could discover that they use wget and then launch a targeted attack against them. But such an attack won’t look like regular spam. For more information about targeted attacks Brian Krebs’ article about CEO scams is worth reading [2].

Privilege Separation

If you run wget in a regular Xterm in the same session you use for reading email etc, then an exploitable bug in wget can be used to access all of your secret data. But it is very easy to run wget from another account. You can run “ssh otheraccount@localhost” and then run the wget command so that it can’t attack you. Don’t run “su - otheraccount” as it is possible for a compromised program to escape from that.
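
If you do this often it can be wrapped in a script. The following is a minimal Python sketch, assuming an account named otheraccount exists and accepts ssh logins from your own account; the function names are made up. The one detail worth noting is that ssh passes its arguments to a shell on the other side, so a URL from a spam message needs to be quoted before it gets there.

```python
import shlex
import subprocess


def build_fetch_command(url, account="otheraccount", host="localhost"):
    """Build an ssh command line that runs wget as another local account.

    ssh joins its arguments into one string for the remote shell, so the
    URL is quoted to stop shell metacharacters in it from being interpreted.
    The "--" tells wget that everything after it is a URL, not an option.
    """
    remote = "wget -- " + shlex.quote(url)
    return ["ssh", f"{account}@{host}", remote]


def fetch_as_other_account(url):
    """Run the download; the file is saved by the other account, not yours."""
    return subprocess.run(build_fetch_command(url), check=False).returncode
```

The downloaded file is then owned by the other account, which is the point: anything that goes wrong happens away from your own home directory.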

I think that most Linux distributions have supported a “switch user” functionality in the X login system for a number of years. So you should be able to lock your session and then change to a session for another user to run potentially dangerous programs.

It is also possible to use a separate PC for online banking and other high value operations. A 10yo PC is more than adequate for such tasks so you could just use an old PC that has been replaced for regular use for online banking etc. You could boot it from a CD or DVD if you are particularly paranoid about attack.

Browser Features

Google Chrome has a feature to not run plugins unless specifically permitted. This requires a couple of extra mouse actions when watching a TV program on the Internet but prevents random web sites from using Flash and Java which are two of the most common vectors of attack. Chrome also has a feature to check a web site against a Google black list before connecting. When I was running a medium size mail server I often had to determine whether URLs being sent out by customers were legitimate or spam, if a user sent out a URL that’s on Google’s blacklist I would lock their account without doing any further checks.

Conclusion

I think that even among Linux users (who tend to be more careful about security than users of other OSs) using a separate PC and booting from a CD/DVD will generally be regarded as too much effort. Running a full featured web browser like Google Chrome and updating it whenever a new version is released will avoid most problems.

Using wget when you have reason to be concerned is a possibility, but not only is it slightly inconvenient, it also often won’t download the content that you want (EG in the case of HTML frames).

Monitoring of Monitoring

I was recently asked to get data from a computer that controlled security cameras after a crime had been committed. Due to the potential issues I refused to collect the computer and insisted on performing the work at the office of the company in question. Hard drives are vulnerable to damage from vibration and there is always a risk involved in moving hard drives or systems containing them. A hard drive with evidence of a crime provides additional potential complications. So I wanted to stay within view of the man who commissioned the work just so there could be no misunderstanding.

The system had a single IDE disk. The fact that it had an IDE disk is an indication of the age of the system. One of the benefits of SATA over IDE is that swapping disks is much easier, SATA is designed for hot-swap and even systems that don’t support hot-swap will have less risk of mechanical damage when changing disks if SATA is used instead of IDE. For an appliance type system where a disk might be expected to be changed by someone who’s not a sysadmin SATA provides more benefits over IDE than for some other use cases.

I connected the IDE disk to a USB-IDE device so I could read it from my laptop. But the disk just made repeated buzzing sounds while failing to spin up. This is an indication that the drive was probably experiencing “stiction” which is where the heads stick to the platters and the drive motor isn’t strong enough to pull them off. In some cases hitting a drive will get it working again, but I’m certainly not going to hit a drive that might be subject to legal action! I recommended referring the drive to a data recovery company.

The probability of getting useful data from the disk in question seems very low. It could be that the drive had stiction for months or years. If the drive is recovered it might turn out to have data from years ago and not the recent data that is desired. It is possible that the drive only got stiction after being turned off, but I’ll probably never know.

Doing it Properly

Ever since RAID was introduced there has been no excuse for having a single disk on its own with important data. Linux Software RAID didn’t support online rebuild when 10G was a large disk. But since the late 90’s it has worked well and there’s no reason not to use it. The probability of a single IDE disk surviving long enough on its own to capture useful security data is not particularly good.

Even with 2 disks in a RAID-1 configuration there is a chance of data loss. Many years ago I ran a server at my parents’ house with 2 disks in a RAID-1 and both disks had errors in one hot summer. I wrote a program that’s like ddrescue but which would read from the second disk if the first gave a read error and ended up not losing any important data AFAIK. BTRFS has some potential benefits for recovering from such situations but I don’t recommend deploying BTRFS in embedded systems any time soon.
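
The original program isn’t shown here, but the idea of falling back to the second disk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names and the 4096 byte block size are made up, both mirrors are assumed to hold the same data at the same offsets, and a real tool would retry with smaller reads and keep a log.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


def read_block(dev, offset, size):
    """Read one block from a file-like object; return None on a read error."""
    try:
        dev.seek(offset)
        return dev.read(size)
    except OSError:
        return None


def rescue_mirror(primary, secondary, out, total_size, block_size=BLOCK_SIZE):
    """Copy total_size bytes to out, preferring primary, else secondary.

    Returns the offsets of blocks that neither disk could supply. Those
    blocks are written as zeros so later data stays at the right offset.
    """
    lost = []
    for offset in range(0, total_size, block_size):
        size = min(block_size, total_size - offset)
        data = read_block(primary, offset, size)
        if data is None or len(data) != size:
            data = read_block(secondary, offset, size)
        if data is None or len(data) != size:
            lost.append(offset)
            data = b"\0" * size
        out.write(data)
    return lost
```

The returned list of offsets tells you which parts of the image need closer attention, which is the information you want before deciding whether anything important was lost.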

Monitoring is a requirement for reliable operation. For desktop systems you can get by without specific monitoring, but that is because you are effectively relying on the user monitoring it themselves. Since I started using mon (which is very easy to set up) I’ve had it notify me of some problems with my laptop that I wouldn’t have otherwise noticed. I think that ideally for desktop systems you should have monitoring of disk space, temperature, and certain critical daemons that need to be running but which the user wouldn’t immediately notice if they crashed (such as cron and syslogd).

There are some companies that provide 3G SIMs for embedded/IoT applications with rates that are significantly cheaper than any of the usual phone/tablet plans if you use small amounts of data or SMS. For a reliable CCTV system the best thing to do would be to have a monitoring contract and have the monitoring system trigger an event if there’s a problem with the hard drive etc and also if the system fails to send an “I’m OK” message for a certain period of time.
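
The “I’m OK” check is the important half, as a system with a dead disk or no power can’t report its own failure. A minimal Python sketch of the receiving side, with made-up names and times given in seconds; how the messages arrive (SMS, HTTP, etc) is left out and would depend on the monitoring service.

```python
import time


def is_overdue(last_ok, now, max_silence):
    """True if no OK message arrived within max_silence seconds."""
    return (now - last_ok) > max_silence


class HeartbeatWatcher:
    """Track the time of the last OK message from each monitored system."""

    def __init__(self, max_silence):
        self.max_silence = max_silence
        self.last_ok = {}

    def record_ok(self, system, when=None):
        """Note that a system reported in (defaults to the current time)."""
        self.last_ok[system] = time.time() if when is None else when

    def overdue(self, now=None):
        """Return the names of systems that have been silent for too long."""
        now = time.time() if now is None else now
        return sorted(name for name, seen in self.last_ok.items()
                      if is_overdue(seen, now, self.max_silence))
```

Calling overdue() on a schedule and raising an event for every name it returns covers the case where the CCTV system stops sending messages at all.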

I don’t know if people are selling CCTV systems without monitoring to compete on price or if companies are cancelling monitoring contracts to save money. But whichever is happening it’s significantly reducing the value derived from monitoring.