I’ve just bought a new Thinkpad that has hardware virtualisation support and I’ve got KVM running.
HugePages
The Linux-KVM site has some information on using hugetlbfs to allow the use of 2MB pages for KVM [1]. I put “vm.nr_hugepages = 1024” in /etc/sysctl.conf to reserve 2G of RAM for KVM use. The web page notes that it may be impossible to allocate enough pages if you set it some time after boot (the kernel can allocate memory that can’t be paged, and RAM can become too fragmented to allow allocation). As a test I reduced my allocation to 296 pages and then increased it again to 1024, and I was surprised to note that my system ran extremely slowly while reserving the pages – it seems that allocating such pages is efficient when done at boot time but not when done later.
hugetlbfs /hugepages hugetlbfs mode=1770,gid=121 0 0
I put the above line in /etc/fstab to mount the hugetlbfs filesystem. The mode of 1770 allows anyone in the group to create files but not unlink or rename each other’s files. The gid of 121 is for the kvm group.
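A quick sanity check after setting vm.nr_hugepages is to look at the hugepage counters in /proc/meminfo (the expected values below assume my settings of 1024 pages of 2MB each):

```shell
# HugePages_Total should read 1024 once the sysctl setting has taken
# effect, and Hugepagesize should be 2048 kB on this hardware.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo

# 1024 pages * 2MB per page = 2048MB reserved:
pages=1024
echo "$(( pages * 2 ))MB reserved"
```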
I’m not sure how hugepages are used; they aren’t used in the most obvious way. I expected that allocating 1024 huge pages would allow allocating 2G of RAM to the virtual machine, but that’s not the case – “-m 2048” caused kvm to fail. I also expected that the number of free HugePages according to /proc/meminfo would reliably drop by an amount that approximately matches the size of the virtual machine, which doesn’t seem to be the case either.
I have no idea why KVM with Hugepages would be significantly slower for user and system CPU time but still slightly faster for the overall build time (see the performance section below). I’ve been unable to find any documents explaining in which situations huge pages provide advantages and disadvantages or how they work with KVM virtualisation – the virtual machine allocates memory in 4K pages so how does that work with 2M pages provided to it by the OS?
But hugepages do provide a slight benefit in performance, and if you have plenty of RAM (I have 5G and can afford to buy more if I need it) you should just enable them as soon as you start.
I have filed Debian bug report #574073 about KVM displaying an error you normally can’t see when it can’t access the hugepages filesystem [6].
Permissions
open /dev/kvm: Permission denied
Could not initialize KVM, will disable KVM support
One thing that annoyed me about KVM is that the Debian/Lenny version will run QEMU instead if it can’t run KVM. I discovered this when a routine rebuild of the SE Linux Policy packages in a Debian/Unstable virtual machine took an unreasonable amount of time. When I halted the virtual machine I noticed that it had displayed the above message on stderr before switching into curses mode (I’m not sure of the correct term for this), so the message was obscured until the xterm was returned to non-curses mode at program exit. I had to add the user in question to the kvm group. I’ve filed Debian bug report #574063 about this [2].
Performance
Below is a table showing the time taken for building the SE Linux reference policy on Debian/Unstable. It compares running QEMU emulation (using the kvm command but without permission to access /dev/kvm), KVM with and without hugepages, Xen, and a chroot. Xen is run on an Opteron 1212 Dell server system with 2*1TB SATA disks in a RAID-1 while the KVM/QEMU tests are on an Intel T7500 CPU in a Thinkpad T61 with a 100G SATA disk [4]. All virtual machines had 512M of RAM and 2 CPU cores. The Opteron 1212 system is running Debian/Lenny and the Thinkpad is running Debian/Lenny with a 2.6.32 kernel from Testing.
                                                    Elapsed   User    System
QEMU on Opteron 1212 with Xen installed             126m54    39m36   8m1
QEMU on T7500                                        95m42    42m57   8m29
KVM on Opteron 1212                                   7m54     4m47   2m26
Xen on Opteron 1212                                   6m54     3m5    1m5
KVM on T7500                                          6m3      2m3    1m9
KVM Hugepages on T7500 with NCurses console           5m58     3m32   2m16
KVM Hugepages on T7500                                5m50     3m31   1m54
KVM Hugepages on T7500 with 1800M of RAM              5m39     3m30   1m48
KVM Hugepages on T7500 with 1800M and file output     5m7      3m28   1m38
Chroot on T7500                                       3m43     3m11   0m29
I was surprised to see how inefficient it is when compared with a chroot on the same hardware; it seems that the system time is the issue. Most of the tests were done with 512M of RAM for the virtual machine. I tried 1800M, which improved performance slightly (less IO means fewer context switches to access the real block device), and redirecting the output of dpkg-buildpackage to /tmp/out and /tmp/err reduced the build time by a further 32 seconds – it seems that the context switches for networking and console output really hurt performance. But for the default build it seems that it will take about 50% longer in a virtual machine than in a chroot. This is bearable for the things I do (of which building the SE Linux policy is the most time consuming), but if I was to start compiling KDE then I would be compelled to use a chroot.
I was also surprised to see how slow it was when compared to Xen, for the tests on the Opteron 1212 system I used a later version of KVM (qemu-kvm 0.11.0+dfsg-1~bpo50+1 from Debian/Unstable) but could only use 2.6.26 as the virtualised kernel (the Debian 2.6.32 kernels gave a kernel Oops on boot). I doubt that the lower kernel version is responsible for any significant portion of the extra minute of build time.
Storage
One way of managing storage for a virtual machine is to use files on a large filesystem for its block devices; this can work OK if you use a filesystem that is well designed for large files (such as XFS). I prefer to use LVM. One thing I have not yet discovered is how to make udev assign the kvm group to all devices that match /dev/V0/kvm-*.
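For what it’s worth, I suspect something like the following udev rule might do it – note that this is an untested sketch, the rule file name is arbitrary, and the DM_VG_NAME/DM_LV_NAME properties depend on the lvm2 udev rules being installed:

```
# /etc/udev/rules.d/92-kvm-lvm.rules (hypothetical file name)
# Give group kvm access to LVM logical volumes named kvm-* in VG V0.
ENV{DM_VG_NAME}=="V0", ENV{DM_LV_NAME}=="kvm-*", GROUP="kvm", MODE="0660"
```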
Startup
KVM seems to be basically designed to run from a session, unlike Xen which can be started with “xm create” and then run in the background until you feel like running “xm console” to gain access to the console. One way of dealing with this is to use screen. The command “screen -S kvm-foo -d -m kvm WHATEVER” will start a screen session named kvm-foo that will be detached and will start by running kvm with “WHATEVER” as the command-line options. When screen is used for managing virtual machines you can use the command “screen -ls” to list the running sessions and then commands such as “screen -r kvm-unstable” to reattach to screen sessions. To detach from a running screen session you type ^A^D.
The problem with this is that screen will exit when the process ends and that loses the shutdown messages from the virtual machine. To solve this you can put “exec bash” or “sleep 200” at the end of the script that runs kvm.
start-stop-daemon -S -c USERNAME --exec /usr/bin/screen -- -S kvm-unstable -d -m /usr/local/sbin/kvm-unstable
On a Debian system the above command in a system boot script (maybe /etc/rc.local) could be used to start a KVM virtual machine on boot. In this example USERNAME would be replaced by the name of the account used to run kvm, and /usr/local/sbin/kvm-unstable is a shell script to run kvm with the correct parameters. Then as user USERNAME you can attach to the session later with the command “screen -x kvm-unstable“. Thanks to Jason White for the tip on using screen.
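For reference, the /usr/local/sbin/kvm-unstable script can be as simple as the following sketch (the device names and options are from my setup, trimmed for the example; the trailing “exec bash” keeps the screen session alive so the shutdown messages remain visible):

```
#!/bin/sh
# /usr/local/sbin/kvm-unstable - run the Debian/Unstable virtual machine
kvm -hda /dev/V0/unstable -hdb /dev/V0/unstable-swap -m 512 \
  -mem-path /hugepages -smp 2 -curses -redir tcp:2022::22
# keep the screen session open so the shutdown messages can be read
exec bash
```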
I’ve filed Debian bug report #574069 [3] requesting that kvm change its argv[0] so that top(1) and similar programs can be used to distinguish different virtual machines. Currently when you have a few entries named kvm in top’s output it is annoying to match the CPU hogging process to the virtual machine it’s running.
It is possible to use KVM with X or VNC for a graphical display by the virtual machine. I don’t like these options, I believe that Xephyr provides better isolation, I’ve previously documented how to use Xephyr [5].
kvm -kernel /boot/vmlinuz-2.6.32-2-amd64 -initrd /boot/initrd.img-2.6.32-2-amd64 -hda /dev/V0/unstable -hdb /dev/V0/unstable-swap -m 512 -mem-path /hugepages -append "selinux=1 audit=1 root=/dev/hda ro rootfstype=ext4" -smp 2 -curses -redir tcp:2022::22
The above is the current kvm command-line that I’m using for my Debian/Unstable test environment.
Networking
I’m using KVM options such as “-redir tcp:2022::22” to redirect unprivileged ports (in this case 2022) to the ssh port. This works for a basic test virtual machine but is not suitable for production use. I want to run virtual machines with minimal access to the environment, this means not starting them as root.
One thing I haven’t yet investigated is the vde2 networking system which allows a private virtual network over multiple physical hosts and which should allow kvm to be run without root privs. It seems that all the other networking options for kvm which have appealing feature sets require that the kvm process be started with root privs.
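From what I’ve read, a vde2 setup would look something like the following – this is untested, and the socket path and group are assumptions on my part:

```
# as root, once: create a virtual switch on a tap interface, with a
# control socket usable by members of the kvm group
vde_switch -tap tap0 -daemon -mod 660 -group kvm -sock /var/run/vde.ctl

# then as an unprivileged user in the kvm group:
kvm -net nic -net vde,sock=/var/run/vde.ctl -hda /dev/V0/unstable -m 512
```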
Is KVM worth using?
It seems that KVM is significantly slower than a chroot, so for a basic build environment a secure chroot environment would probably be a better option. I had hoped that KVM would be more reliable than Xen, which would offset the performance loss – however as KVM and Debian kernel 2.6.32 don’t work together on my Opteron system it seems that I will have reliability issues with KVM comparable to the Xen issues. There are currently no Xen kernels in Debian/Testing, so KVM is usable now with the latest bleeding edge stuff (on my Thinkpad at least) while Xen isn’t.
Qemu is really slow, so Xen is the only option for 32bit hardware. Therefore all my 32bit Xen servers need to keep running Xen.
I don’t plan to switch my 64bit production servers to KVM any time soon. When Debian/Squeeze is released I will consider whether to use KVM or Xen after upgrading my 64bit Debian server. I probably won’t upgrade my 64bit RHEL-5 server any time soon – maybe when RHEL-7 is released. My 64bit Debian test and development server will probably end up running KVM very soon, I need to upgrade the kernel for Ext4 support and that makes KVM more desirable.
So it seems that for me KVM is only going to be seriously used on my laptop for a while.
Generally I am disappointed with KVM. I had hoped that it would give almost the performance of Xen (admittedly it was only 14.5% slower). I had also hoped that it would be really reliable and work with the latest kernels (unlike Xen) but it is giving me problems with 2.6.32 on Opteron. Also it has some new issues such as deciding to quietly do something I don’t want when it’s unable to do what I want it to do.

I’ve now had my new Thinkpad T61 [1] for almost a month. The letters on the keyboard are not even starting to wear off which is unusual, either this Thinkpad is built with harder plastic than the older ones or I’m typing more softly.
Memory
The first thing I did after receiving it was to arrange a RAM upgrade. It shipped with two 1GB DDR2 667MHz PC2-5300 SODIMM modules, and as I want to run KVM I obviously need a lot more than that. The Intel Chipset page on Wikipedia is one of the resources that documents the Intel GM965 chipset as supporting up to 8G of RAM. Getting 4G in two 2G modules seemed like a bad idea as that would limit future expansion options and also result in two spare modules, so I decided to get a 4G module for a total of 5G of RAM. I’ve updated my RAM speed page with the test results of this system [2]: I get 2,823MB/s with a matched pair of DIMMs and 2,023MB/s with a single DIMM. Strangely, with a pair of unmatched DIMMs Memtest86+ reported 2,823MB/s – I wonder whether the first 2G of address space is interleaved for best performance while the last 3G runs at 2,023MB/s. In any case I think that losing 29% of the maximum RAM speed is an acceptable trade-off for saving some money, and I can always buy another 4G DIMM later. I ordered a DDR2 800MHz PC2-6400 module because they are cheaper than the PC2-5300 modules and my Thinkpad works equally well with either speed. I have used the spare 1G SODIMM in my EeePC 701 which takes the same RAM – presumably the EeePC designers found PC2-5300 modules to be cheaper than slower modules (I think that the 701 was, at the time of its release, the slowest PC compatible system selling in quantity). The EeePC gets only 798MB/s out of the same memory. My document about Memtest86+ results has these results and more [2].
I noticed that if I run Memtest86+ booted from a USB flash device then inserting or removing a USB device can cause memory errors, but if I boot Memtest86+ from a CD it seems to work correctly. So it seems that Memtest86+ doesn’t disable some aspect of the USB hardware; this might be considered a bug – or it might just be a “don’t do that” issue.
Misc
To get the hardware virtualisation working (needed to load the kvm_intel kernel module) I had to enable it in the BIOS and then do a hard reset (power off). Telling the BIOS to save and reboot was not adequate. This would be a BIOS bug, it knew that I had changed the virtualisation setting so it should have either triggered a hard reset or instructed me to do so.
The default configuration of Debian/Lenny results in sound not working, I had to run alsaconf as suggested on the Debian Etch on Thinkpad T61 howto [3] which solved it.
Generally I’m happy with this system, the screen resolution is 1680*1050 which has 20% more pixels than the 1400*1050 screen on my Thinkpad T41p, it’s a lot faster for CPU operations and should be a lot faster for video when I get the drivers sorted out (currently it’s a lot slower), and I have virtualisation working again. But when you buy a system that’s much like the last one but 6 years newer you expect it to be better.
Generally the amount of effort involved in the process of buying a new system, upgrading the RAM to the desired specs, installing Linux and tweaking all the options is enough to make me want to wait at least another 6 years before buying another. Part of the reason for this difficulty is that I want to get so much functionality from the machine, a machine with more modest goals (such as a Netbook) takes a lot less time to configure.
Problems
There is Bluetooth hardware which is apparently enabled by default. But a quick search didn’t turn up any information on how to do the basic functions, I would like to just transfer files from my mobile phone in the same way that I transfer files between phones.
The video card is an nVidia Quadro NVS 140M (rev a1). 3D games seem slow, but glxgears reports 300fps. It doesn’t have Xvideo support, which appears to be the reason why mplayer won’t allow resizing its display area unless run with the -zoom option. It also has performance problems such that switching between virtual desktops will interrupt the sound of a movie that mplayer is playing – although when alsaplayer is playing music the sound isn’t interrupted. Also, playing a Youtube video at twice the horizontal and vertical resolution takes half of one CPU core. It’s a pity that I didn’t get an Intel video controller.
It seems that Debian is soon going to get the Nouveau NVidia drivers so hopefully video performance will improve significantly when I get them [4].
The next thing I have to do is to get the sound controls working. The older Thinkpads that I used had hardware controls: the T41p that was my previous system had buttons for increasing and decreasing the volume and a mute button that interacted directly with the hardware. The down-side of this was that there was no way for the standard software to know what the hardware was going to do; the up-side was that I could press the mute button and know that it would be silent regardless of what the software wants. Now I have the same buttons on my T61 but they don’t do anything directly, they just provide key-press events. According to showkey the mute key gives “0x71 0xf1”, the volume down button gives “0x72 0xf2”, and the volume up button gives “0x73 0xf3”. Daniel Pittman has made some suggestions to help me get the keyboard events mapped to actions that can change the sound via software [5] – which I haven’t yet had time to investigate. I wonder if it will ever be possible to change the volume of the system beep.
The system has an SD card slot, but that doesn’t seem to work. I’m not really worried at the moment but in the future I will probably try and get it going. It has a 100G disk which isn’t that big, adding a 32G SD card at some future time might be the easiest way to upgrade the storage – copying 100G of data is going to be painful and usually a small increment in storage capacity can keep a system viable for a while.
Any advice on getting sound, the SD card, and Bluetooth working would be appreciated. I’ll probably upgrade to Debian/Testing in the near future so suggestions that require testing features won’t be ruled out.

Some time ago Yubico were kind enough to send me an evaluation copy of their Yubikey device. I’ve finally got around to reviewing it and making deployment plans for buying some more. Above is a picture of my Yubikey on the keyboard of my Thinkpad T61 for scale. The newer keys apparently have a different color in the center of the circular press area and also can be purchased in white plastic.
The Yubikey is a USB security token from Yubico [1]. It is a use-based token that connects via the USB keyboard interface (see my previous post for a description of the various types of token [2]). The Yubikey is the only device I know of which uses the USB keyboard interface – it seems to be their own innovation. You can see in the above picture that the Yubikey skips the metal shell that surrounds most USB devices; this probably fails to meet some part of the USB specification but allows them to make the key less than half as thick as it would otherwise be. Mechanically it seems quite solid.
The Yubikey is affordable, unlike the offerings of some token vendors who don’t even advertise prices (if you need to ask then you can’t afford it), and they have an online sales site. It costs $US25 for a single key, and discounts start when you buy 10. It seems quite likely that someone who wants such a token will want at least two – for different authentication domains, for different users in one home, or as a backup in case one is lost or broken (although my experiments have shown that Yubikeys are very hardy and will not break easily). The discount rate of $20 will apply if you can find four friends who want to use them (assuming two each), or if you support several relatives (as I do). The next discount rate of $15 applies when you order 100 units, and they advise that customers contact their sales department directly if purchasing more than 500 units – so it seems likely that a further discount could be arranged for orders of that size. They accept payment via Paypal as well as credit cards. It seems to me that any Linux Users Group could easily arrange an order for 100 units (that would be 10 people with similar needs to me) and a larger LUG could possibly arrange an order of more than 500 units for a better discount. If an order of 500 can’t be arranged then an order of 200 would be a good way to get half black keys and half white ones – you can only buy a pack of 100 in a single color.
There is a WordPress plugin to use Yubikey authentication [3]. It works, but I would be happier if it had an option to accept a Yubikey OR a password (currently it demands both a Yubikey AND a password). I know that this is less secure, but I believe that it’s adequate for an account that doesn’t have administrative rights.
To operate the Yubikey you just insert it into a USB slot and press the button to have it enter the pass code via the USB keyboard interface. The pass code has a prefix that can be used to identify the user so it can replace both the user-name and password fields – of course it is technically possible to use one Yubikey for authentication with multiple accounts in which case a user-name would be required. Pressing the Yubikey button causes the pass code to be inserted along with the ENTER key, this can take a little getting used to as a slow web site combined with a habit of pressing ENTER can result in a failed login (at least this has happened to me with Konqueror).
As the Yubikey is use-based, it needs a server to track the usage count of each key. Yubico provides source to the server software as well as having their own server available on the net – obviously it might be a bad idea to use the Yubico server for remote root access to a server, but for blog posting that is a viable option and saves some effort.
If you have multiple sites that may be disconnected then you will either need multiple Yubikeys (at a cost of $20 or $15 each) or you will need to have one Yubikey work with multiple servers. Supporting a single key with multiple authentication servers means that MITM attacks become possible.
The full source to the Yubikey utilities is available under the new BSD license. In Debian the base functionality of talking to the Yubikey is packaged as libyubikey0 and libyubikey-dev, the server (for validating Yubi requests via HTTP) is packaged as yubikey-server-c, and the utility for changing the AES key to use your own authentication server is packaged as yubikey-personalization – thanks Tollef Fog Heen for packaging all this!
The YubiPAM project (a PAM module for Yubikey) is licensed under the GPL [4]. It would be good if this could be packaged for Debian (unfortunately I don’t have time to adopt more packages at the moment).
There is a new model of Yubikey that has RFID support. They suggest using it for public transport systems where RFID could be used for boarding and the core Yubikey OTP functionality could be used for purchasing tickets. I don’t think it’s very interesting for typical hobbyist and sysadmin work, but RFID experts such as Jonathan Oxer might disagree with me on this issue.
The Security Token Wikipedia page doesn’t seem to clearly describe the types of token.
Categories of Security Token
It seems to me that the following categories encompass all security tokens:
- Biometric tokens – which seems rather pointless to me. Having a device I control verify my biometric data doesn’t seem to provide a benefit. The only possible benefit seems to be if the biometric token verifies the identity of the person holding it before acting as one of the other types of token.
- Challenge-response devices. The server will send a challenge (usually a random number) and will expect a response (usually some sort of cryptographically secure hash of the number and a shared secret). A challenge-response device may take a password from the user and combine it with the challenge from the server and the shared secret when calculating the response.
- Time-based tokens (one-time passwords). They will provide a new pseudo-random number that changes periodically, often a 30 second time interval is used and the number is presumably a cryptographically secure hash of the time and a shared secret. This requires a battery in the token and the token will become useless when the battery runs out. It also requires that the server have an accurate clock.
- Use-based tokens. They will give a new pseudo-random number every time a button is pressed (or some other event happens to indicate that a number has been used). These do not work well if you have multiple independent servers and an untrusted network.
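To illustrate the challenge-response category, here is a minimal sketch of how a response might be computed with HMAC-SHA256 – the secret and challenge are made-up values, and real tokens use their own algorithms:

```shell
SECRET="shared-secret"          # provisioned in both token and server
CHALLENGE="0001020304050607"    # random value chosen by the server

# The token computes an HMAC of the challenge with the shared secret;
# the server performs the same calculation and compares the results.
RESPONSE=$(printf '%s' "$CHALLENGE" |
  openssl dgst -sha256 -hmac "$SECRET" -r | cut -d' ' -f1)
echo "$RESPONSE"
```

An attacker who sees one exchange learns nothing useful for the next login, because the server issues a fresh challenge each time.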
Here is my analysis of the theory of token use, note that I am not sure how the implementations of the various token systems deal with these issues.
- Biometric security seems like a bad idea for most non-government use. I have seen a retina scanner in use at a government office – that made some sense as the people being scanned were in a secure area (they had passed some prior checks) and they were observed (to prevent coercion and the use of fake eyes). Biometric authentication for logging in over the net just seems like a bad idea as you will never know if you can trust the scanner.
- It seems to me that challenge-response devices are by far the most secure option. CR is resistant to replay attacks provided that it is not possible to have re-used challenges. If the calculation of the response includes a password (which is performed on some tokens that resemble pocket calculators) then a CR token will meet the “something you have and something you know” criteria.
One potential problem with CR systems is that of not including the server or account ID in the calculation. So if I was to use a terminal in an insecure location to login to a server or account with data that is not particularly important then it would be possible for an attacker who had compromised the terminal to perform a Man In The Middle (MITM) attack against other servers. Of course you are supposed to use a different password for each account, if you do this then a CR token that includes a password will be resistant to this attack – but I expect that people who use tokens are more likely to use one password for multiple accounts.
- Time-based tokens have a weakness in that an attacker who can immediately discover the number used for one connection could then immediately login to other servers. One example of a potential attack using this avenue would be to compromise a terminal in an Internet cafe, steal a hash used for logging in to server A and then immediately login to server B. This means that it may not be safe to use the same token for logging in to servers (or accounts) that have different sensitivity levels unless a strong password was used as well – I expect that people who have hardware tokens tend to use weaker passwords.
Also one factor that will make some MITM attacks a lot easier is the fact that the combination of the hash from the token and the password are valid for a period of time so an attacker could establish a second connection within the 30 second interval. It seems that only allowing one login with a particular time-coded password is the correct thing to do, but this may be impossible if multiple independent servers use the same token. Time based tokens expire when the battery runs out. The measures taken to make them tamper-proof may make it difficult or impossible to replace the battery so a new token must be purchased every few years.
- Use-based tokens are very similar to time-based tokens, it’s just a different number that is used for generating the hash. The difference in the token is that a time-based token needs a battery so that it can track the time while a use-based token needs a small amount of flash memory to store the usage count. The difference at the server end is that for a use-based token the server needs a database of the use count of each token, which is usually not overly difficult for a single server.
One problem is the case of a restore from backup on the server which maintains the use count database. The only secure way of managing this is to either inspect every token (to discover its use count) or to issue a new password (for password plus token authentication). Either option would be really painful if you have many users at remote sites. Also the database transaction must be committed to disk before an authentication attempt is acknowledged so that a server crash can’t lose the count – this should be obvious but many people get these things wrong. An additional complication for use-based tokens comes with the case of a token that is used for multiple servers. One server needs to maintain the database of the usage counts and the other servers need to query it over secure links. If a login attempt with use count 100 has been made to server A then server B must not accept a login with a hash that has a use count less than or equal to 100. This is firstly to cover the case where a MITM attack is used to login to server B with credentials that were previously used for server A. The second aim is to cover the case where a token that is temporarily unguarded is used to generate dozens of hashes – while the hashes could be immediately used, it is desirable to have them expire as soon as possible, and having the next login update the use count and invalidate such hashes is a good thing. The requirement that all servers know the current use count requires that they all trust a single server. In some situations this may not be possible, so it seems that this only works for servers within a single authentication domain or for access to less important data.
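The server-side counter check described above amounts to something like the following sketch – the file path and counter values are hypothetical, and a real server would decode the counter from the OTP and use a proper database:

```shell
# Last accepted use count for this token (normally a database row).
COUNT_FILE=$(mktemp)
echo 100 > "$COUNT_FILE"

new=101    # use count decoded from the incoming OTP (hypothetical)
last=$(cat "$COUNT_FILE")

if [ "$new" -gt "$last" ]; then
    # commit the new count to disk BEFORE acknowledging the login
    echo "$new" > "$COUNT_FILE"
    result=accepted
else
    result=rejected    # replay, or a hash generated before the last login
fi
echo "$result"
```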
Methods of Accessing Tokens
It seems that the following are the main ways of accessing tokens.
- Manual entry – the user reads a number from an LCD display and types it in. This is the most portable and often the cheapest – but does require a battery.
- USB keyboard – the token is connected to a PC via the USB port and reports itself as a keyboard. It can then enter a long password when a button is pressed. This is done by the Yubikey [1]; I am not aware of anyone else doing it. It would be possible to have a token connect as a USB keyboard and also have its own keypad for entry of a password and a challenge used for CR authentication.
- USB (non-keyboard) or PCCard/Cardbus (AKA PCMCIA) – the token has its own interface and needs a special driver for the OS in question. This isn’t going to work if you are using an Internet cafe or an OS that the token vendor doesn’t support.
- Bluetooth/RFID – has similar problems to the above but also can potentially be accessed by hostile machines without the owner knowing. I wouldn’t want to use this.
- SmartCard – the card reader connects to the PC via Cardbus or USB and it has all the same driver issues. Some SmartCard devices are built in to a USB device that looks to the OS like a card reader plus a card, so it’s a USB interface with SmartCard protocols.
To avoid driver issues and allow use on random machines it seems that the USB keyboard and manual entry types of token are best. For corporate Intranet use a SmartCard seems best as it can be used for opening doors as well; you could use a USB keyboard token (such as a Yubikey) to open doors, but it would be slower and there is no off the shelf hardware for it.
For low cost and ease of implementation it seems that use-based tokens that connect via the USB keyboard interface are best. For best security it seems that a smart-card or USB interface to a device with a keypad for entering a password is ideal.

Brendan Scott linked to a couple of articles about CAL (the Copyright Agency Limited) [1]. I have previously written about CAL and the way that they charge organisations for the work of others without their consent [2]. My personal dispute with CAL is that they may be charging people to use my work, I have not given them permission to act on my behalf and will never do so. If they ever bill anyone for my work then it will be an act of piracy. The fact that the government through some bad legislation permitted them to do such things doesn’t prevent it from being piracy – you can’t disagree with this claim without supporting the past actions of China and other countries that have refrained from preventing factories from mass-producing unauthorised copies of software.
The first article concerns the fact that last year CAL paid more than $9,400,000 in salary to its employees (including $350,000 to its CEO) while it only paid $9,100,000 directly to the authors [3]. It also spent another $300,000 to send its executives to a junket in Barbados. It did give $76,000,000 to publishers “on the assumption that a proportion of this money will be returned to authors” – of course said publishers could have used the money for holidays in Barbados. CAL doesn’t bother to check who ends up with shares of the $76,000,000 so it’s anyone’s guess where it ends up.
The second article is by James Bradley who is an author and director of CAL [4]. He claims that “much” of the $76,000,000 was distributed to authors, although I’m not sure how he would have any idea of how much it was – which is presumably why he used the word “much” instead of some other word with a clearer meaning such as “most“. He also notes that CAL invested $1,000,000 in “projects specifically designed to promote the development and dissemination of Australian writing“, which sounds nice until you consider the fact that none of the authors (apart from presumably the few who sit on the CAL board) had any say in the matter. Can I take a chunk of the $9,400,000 that is paid to CAL employees and invest it in something? If not then why not? If they can “invest” money that was owed to other people then why can’t I invest their salaries?
James also says “The issue of how well CAL serves rights-holders – and authors and artists in particular – is a vital one” which is remarkably silly. He is entirely ignoring the fact that some rights holders don’t want to be “served” by CAL at all. The fact that CAL can arbitrarily take money for other people’s work is an infringement on their rights. He further demonstrates his ignorance by saying “Without CAL and the licences we administer, users – educational institutions, government agencies and corporate organisations, to name just a few – would be required to seek permission every time they reproduced copyright material or run the risk of legal action for copyright infringement” – of course any educational institution can use Creative Commons licensed work [5].
I’ve previously written about the CK12 project to develop CC licensed text books for free use [6]. There’s no reason why the same thing can’t be done for university text books. In the discussion following Claudine Chionh’s LUV talk titled “Humanities computing, Drupal and What I did on my holidays” [7] it was suggested that it should be possible to gain credit towards a post-graduate degree based on work done to share information – this could mean setting up a Drupal site and populating the database or it could mean contributing to CC licensed text books. Let’s face it, a good CC text book will be read by many more people than the typical conference proceedings!
James says that CAL is used “Instead of having to track down individual rights-holders every time they want to reproduce copyright material”. The correct solution to this problem would be to change the copyright law such that if a reasonable attempt to discover the rights-holder fails then the work is deemed to be in the public domain. The solution to the problem of tracking down rights-holders is not to deny them their rights entirely and grant CAL the right to sub-license their work!
He also makes the ridiculous claim “Whereas in the age of the physical book schools and universities could have bought fewer books and made up the difference by using photocopies, it is now possible for an organisation to buy a single set of digital materials and reproduce them ad infinitum” which implies that CAL is the only thing saving the profits of authors from unrestricted digital copying. Of course as CAL seems to have no active enforcement mechanisms and they apparently charge a per-student fee they really have no impact on the issue of a single licensed copy being potentially used a million times – extra use apparently won’t provide benefits to the author and use in excess of the licensing scheme won’t be penalised.
He asks the rhetorical question “After all, why go to the expense of creating a textbook (or some form of digital course materials) if you are going to sell only a half-dozen copies to state education departments”. The answer is obvious to anyone who has real-world experience with multiple licensing schemes – you can sell a single copy and make a profit if the price is high enough. The smart thing for the education departments to do would be to pool their resources and pay text book companies for writing CC licensed texts (or releasing previously published texts under the CC). The average author of a text book would probably be very happy to earn $100,000 for their work, and the editorial process probably involves a similar amount of work. So if the government was to offer $300,000 for the entire rights to a text book then I’m sure that there would be more than a few publishers tendering for the contract.
According to the CIA World Fact Book there are 2,871,482 people in Australia aged 0-14 [8], which means about 205,000 per year level. CAL charges $16 for each primary and secondary student, so the government is paying about $3,280,000 every year per year level. Even in year 12 the number of text books used is probably not more than 10, so it seems to me that if all the money paid to CAL by schools in a single year was instead used to fund Creative Commons licensed text books then the majority of the school system would be covered! The universities have a much wider range of text books but they also have higher CAL fees of $40 per student. After cutting off the waste of taxpayer money on CAL fees for schools that money could be invested in the production of CC licensed university text books.
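As a sanity check, the arithmetic above can be reproduced in a few lines. This is just a sketch using the figures quoted in this post; the variable names are my own:

```python
# Back-of-the-envelope check of the figures quoted above.
population_0_to_14 = 2_871_482  # CIA World Fact Book figure for Australia
year_levels = 14                # ages 0-14 cover 14 year levels
fee_per_student = 16            # CAL fee per primary/secondary student ($)

per_year_level = population_0_to_14 // year_levels
annual_fee_per_year_level = per_year_level * fee_per_student

print(per_year_level)             # about 205,000 students per year level
print(annual_fee_per_year_level)  # about $3,280,000 per year level per year
```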
The Threat
Bruce Schneier’s blog post about the Mariposa Botnet has an interesting discussion in the comments about how to make a secure system [1]. Note that the threat is considered to be remote attackers, that means viruses and trojan horses – which includes infected files run from USB devices (IE you aren’t safe just because you aren’t on the Internet). The threat we are considering is not people who can replace hardware in the computer (people who have physical access to it, which includes people who have access to where it is located or who are employed to repair it). This is the most common case: the risk involved in stealing a typical PC is far greater than whatever benefit might be obtained from the data on it – a typical computer user is at risk of theft only for the resale value of a second-hand computer.
So the question is, how can we most effectively use free software to protect against such threats?
The first restriction is that the hardware in common use is cheap and has little special functionality for security. Systems that have a TPM seem unlikely to provide a useful benefit due to the TPM being designed more for Digital Restrictions Management than for protecting the user – and due to TPM not being widely enough used.
The BIOS and the Bootloader
It seems that the first thing that is needed is a BIOS that is reliable. If an attacker manages to replace the BIOS then it could do exciting things like modifying the code of the kernel at boot time. It seems quite plausible for the real-mode boot loader code to be run in a VM86 session and to then have its memory modified before it switches to protected mode. Every BIOS update is a potential attack. Coreboot replaces the default PC BIOS, it initialises the basic hardware and then executes an OS kernel or boot loader [2] (the Coreboot Wikipedia page has a good summary). The hardest part of the system startup process is initialising the hardware, and Coreboot has that solved for 213 different motherboards.
If engineers were allowed to freely design hardware without interference then probably a significant portion of the computers in the market would have a little switch to disable the write line for the flash BIOS. I heard a rumor that in the days of 286 systems a vendor of a secure OS shipped a scalpel to disable the hardware ability to leave protected mode, cutting a track on the motherboard is probably still an option. Usually once a system is working you don’t want to upgrade the BIOS.
One of the payloads for Coreboot is GRUB. The Grub Feature Requests page has as its first entry “Option to check signatures of the bootchain up to the cryptsetup/luksOpen: MBR, grub partition, kernel, initramfs” [3]. Presumably this would allow a GPG signature to be checked so that a kernel and initrd would only be used if they came from a known good source. With this feature we could only boot a known good kernel.
How to run User Space
The next issue is how to run the user-space. There has been no shortage of Linux kernel exploits and I think it’s reasonable to assume that there will continue to be a large number of exploits. Some of the kernel flaws will be known by the bad guys for some time before there are patches, some of them will have patches which don’t get applied as quickly as desired. I think we have to assume that the Linux kernel will be compromised. Therefore the regular user applications can’t be run against a kernel that has direct hardware access.
It seems to me that the best way to go is to have the Linux kernel run in a virtual environment such as Xen or KVM. That means you have a hypervisor (Xen+Linux or Linux+KVM+QEMU) that controls the hardware and creates the environment for the OS image that the user interacts with. The hypervisor could create multiple virtual machines for different levels of data in a similar manner to the NSA NetTop project – not that this is really a required part of solving the general secure Internet terminal problem, but as it would only be a tiny bit of extra work you might as well do it.
One problem with using a hypervisor is that the video hardware tends to want to use features such as bus-mastering to give best performance. Apparently KVM has IOMMU support so it should be possible to grant a virtual machine enough hardware access to run 3D graphics at full speed without allowing it to break free.
Maintaining the Virtual Machine Image
Google has a good design for the ChromiumOS in terms of security [4]. They are using CGroups [5] to control access to device nodes in jails, RAM, CPU time, and other resources. They also have some intrusion detection which can prompt a user to perform a hardware reset. Some of the features would need to be implemented in a different manner for a full desktop system but most of the Google design features would work well.
When an intrusion is detected in an OS running in a virtual machine it would be best to have the hypervisor receive a message by some defined interface (maybe a line of text printed on the “console”) and then terminate and restart the virtual machine. Dumping the entire address space of the virtual machine would be a good idea too; with typical RAM sizes at around 4G for laptops and desktops and typical storage sizes at around 200G for laptops and 2T for new desktops it should be easy to store a few dumps in case they are needed.
The amount of data received by a typical ADSL link is not that great. Apart from the occasional big thing (like downloading a movie or listening to Internet radio for a long time) most data transfers are from casual web browsing which doesn’t involve that much data. A hypervisor could potentially store the last few gigabytes of data that were received which would then permit forensic analysis if the virtual machine was believed to be compromised. With cheap SATA disks in excess of 1TB it would be conceivable to store the last few years of data transfer (with downloaded movies excluded) – but such long-term storage would probably involve risks that would outweigh the rewards, probably storing no more than 24 hours of data would be best.
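A bounded capture store of this kind can be sketched in a few lines. The class name and sizes below are invented for illustration, not taken from any real hypervisor – the point is only that keeping “the last N bytes” is cheap to implement:

```python
from collections import deque

class RecentTrafficBuffer:
    """Keep only the most recently received chunks of data, discarding
    the oldest chunks once a byte budget is exceeded."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.chunks = deque()
        self.total = 0

    def record(self, chunk):
        self.chunks.append(chunk)
        self.total += len(chunk)
        # Drop the oldest chunks until we are back under the byte budget.
        while self.total > self.max_bytes:
            old = self.chunks.popleft()
            self.total -= len(old)

buf = RecentTrafficBuffer(max_bytes=1024)
for i in range(10):
    buf.record(b"x" * 200)  # 2000 bytes recorded in total
print(buf.total)            # prints 1000: only the 5 newest chunks remain
```

The same structure scales to a few gigabytes on disk by replacing the in-memory deque with a circular file.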
Finally in terms of applying updates and installing new software the only way to do this would be via the hypervisor as you don’t want any part of the virtual machine to be able to write to its data files or programs. So if the user selects to install a new application then the request “please install application X” would have to be passed to the hypervisor. After the application is installed a reboot of the virtual machine would be needed to apply the change. This is a common experience for mobile phones (where you even have to reboot if the telco changes some of their network settings) and it’s something that MS-Windows users have become used to – but it would get a negative reaction from the more skilled Linux users.
Would this be Accepted?
The question is, if we built this would people want to use it? The NetTop functionality of having two OSs interchangeable on the one desktop would attract some people. But most users don’t desire greater security and would find some reason to avoid this. They would claim that it lowered the performance (even for aspects of performance where benchmarks revealed no difference) and claim that they don’t need it.
At this time it seems that computer security isn’t regarded as a big enough problem for users. It seems that the same people who will avoid catching a train because one mugging made it to the TV news will happily keep using insecure computers in spite of the huge number of cases of fraud that are reported all the time.
In a comment on my post Shared Objects and Big Applications about memlockd [1] mic said that they use memlockd to lock the entire root filesystem in RAM. Here is a table showing my history of desktop computers with the amounts of RAM, disk capacity, and CPU power available. All systems better than a 386-33 are laptops – a laptop has been my primary desktop system for the last 12 years. The columns for the maximum RAM and disk are the amounts that I could reasonably afford if I used a desktop PC instead of a laptop and used the best available technology of the day – I’m basing disk capacity on having four hard drives (the maximum that can be installed in a typical PC without extra power cables and drive controller cards) and running RAID-5. For the machines before 2000 I base the maximum disk capacity on not using RAID as Linux software RAID used to not be that good (lack of online rebuild for starters) and hardware RAID options have always been too expensive or too lame for my use.
| Year | CPU                   | RAM   | Disk | Maximum RAM | Maximum Disk |
| 1988 | 286-12                | 4M    | 70M  | 4M          | 70M          |
| 1993 | 386-33                | 16M   | 200M | 16M         | 200M         |
| 1998 | Pentium-M 233         | 96M   | 3G   | 128M        | 6G           |
| 1999 | Pentium-2 400         | 256M  | 6G   | 512M        | 40G          |
| 2000 | Pentium-2 600         | 384M  | 10G  | 512M        | 150G         |
| 2003 | Pentium-M 1700        | 768M  | 60G  | 2048M       | 400G         |
| 2009 | Pentium-M 1700        | 1536M | 100G | 8192M       | 4500G        |
| 2010 | Core 2 Duo T7500 2200 | 5120M | 100G | 8192M       | 6000G        |

The above graph shows how the modern RAM capacities have overtaken older disk capacities. So it seems that a viable option on modern systems is to load everything that you need to run into RAM. Locking it there will save spinning up the hard drive on a laptop. With a modern laptop it should be possible to lock most of the hard drive contents that are regularly used (IE the applications) into RAM and run with /home on a SD flash storage device. Then the hard drive would only need to be used if something uncommon was accessed or if something large (like a movie) was needed. It also shows that there is potential to run diskless workstations that copy the entire contents of their root filesystem when they boot so that they can run independently of the server and only access the server for /home.
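Locking data into RAM so that the disk never has to spin up for it uses the mlock()/mlockall() system calls, which is the mechanism memlockd relies on. Here is a minimal sketch, assuming a Linux system where the default RLIMIT_MEMLOCK allows locking at least one page:

```python
import ctypes
import ctypes.util
import mmap

# Lock a single anonymous page into RAM via libc's mlock().
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

length = mmap.PAGESIZE
buf = mmap.mmap(-1, length)  # anonymous, page-aligned mapping
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))

ret = libc.mlock(ctypes.c_void_p(addr), ctypes.c_size_t(length))
print("locked" if ret == 0 else "mlock failed")
```

memlockd extends this idea by mapping and locking every file listed in its configuration, so the listed programs and libraries stay resident even under memory pressure.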
Note that the size of the RAM doesn’t need to be larger than the disk capacity of older machines (some of the disk was used for swap, /home, etc). But when it is larger it makes it clear that the disk doesn’t need to be accessed for routine storage needs.
I generated the graph with GnuPlot [2], the configuration files I used are in the directory that contains the images and the command used was “gnuplot command.txt”. I find the GnuPlot documentation to be difficult to use so I hope that this example will be useful for other people who need to produce basic graphs – I’m not using 1% of the GnuPlot functionality.
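For reference, a minimal command file of this kind looks roughly as follows – the data file name and labels here are invented for illustration, the real files are in the directory mentioned above:

```gnuplot
# Hypothetical command.txt: year in column 1, RAM in MB in column 2,
# disk in MB in column 3 of a whitespace-separated data file.
set terminal png size 640,480
set output "capacity.png"
set xlabel "Year"
set ylabel "Capacity (MB)"
set logscale y
plot "capacity.txt" using 1:2 title "RAM" with linespoints, \
     "capacity.txt" using 1:3 title "Disk" with linespoints
```

The logarithmic Y axis is what makes a graph spanning 4M to 6000G readable at all.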
The Opera-Mini Dispute
I have just read an interesting article about the Opera browser [1]. The article is very critical of Opera-Mini on the iPhone for many reasons – most of which don’t interest me greatly. There are lots of technical trade-offs that you can make when designing an application for a constrained environment (EG a phone with low resolution and low bandwidth).
What does interest me is the criticism of the Opera Mini browser for proxying all Internet access (including HTTPS) through their own servers, this has been getting some traction around the Internet. Now it is obvious that if you have one server sitting on the net that proxies connections to lots of banks then there will be potential for abuse. What apparently isn’t obvious to as many people is the fact that you have to trust the application.
Causes of Software Security Problems
When people think about computer security they usually think about worms and viruses that exploit existing bugs in software and about Trojan horse software that the user has to be tricked into running. These are both significant problems.
But another problem is that of malicious software releases. I think that this is significantly different from Trojan horses because instead of having an application which was written for the sole purpose of tricking people (as in the original Greek story) you have an application that was written by many people who genuinely want to make a good product but which is hijacked by a single person or small group.
Rumor has it that rates well in excess of $10,000 are sometimes paid for previously unknown security vulnerabilities in widely used software. It seems likely that a programmer who was in a desperate financial situation could bolster their salary by deliberately putting bugs in software and then selling the exploits, this would not be a trivial task (making such bugs appear to be genuine mistakes would take some skill) – but there are lots of people who could do it and plausibly deny any accusation other than carelessness. There have been many examples of gambling addicts who have done more foolish things to fund their habit.
I don’t think it’s plausible to believe that every security flaw which has been discovered in widely used software was there purely as the result of a mistake. Given the huge number of programmers who have the skill needed to deliberately introduce a security flaw into the source of a program and conceal it from their colleagues I think it’s quite likely that someone has done so and attempted to profit from it.
Note that even if it could be proven that it was impossible to profit from creating a security flaw in a program that would not be sufficient to prove that it never happened. There is plenty of evidence of people committing crimes in the mistaken belief that it would be profitable for them.
Should We Trust a Proprietary Application or an Internet Server?
I agree with the people who don’t like the Opera proxy idea, I would rather run a web browser on my phone that directly accesses the Internet. But I don’t think that the web browser that is built in to my current smart-phone is particularly secure. It seems usual for a PC to need a security update for the base OS or the web browser at least once a year while mobile phones have a standard service life of two years without any updates. I suspect that there is a lot of flawed code running on smart phones that never gets updated.
It seems to me that the risks with Opera are the single point of failure of the proxy server in addition to the issues of code quality, while the risks with the browser that is on my smart-phone are just the quality of the code. I suspect that Opera may do a better job of updating their software to fix security issues so this may mitigate the risk from using their proxy.
At the moment China is producing a significant portion of the world’s smart-phones. Some brands like LG are designed and manufactured in China, others are manufactured in China for marketing/engineering companies based in Europe and the US. A casual browse of information regarding Falun Gong makes the character of the Chinese leadership quite clear [2], I think that everything that comes out of China should be considered to be less trustworthy than equivalent products from Europe and the US. So I think that anyone who owns a Chinese mobile phone and rails against the Opera Mini hasn’t considered the issue enough.
I don’t think it’s possible to prove that an Opera Mini with its proxy is more or less of a risk than a Chinese smart-phone. I’m quite happy with my LG Viewty [3] – but I wouldn’t use it for Internet banking or checking my main email account.
Also we have to keep in mind that mobile phones are really owned by telephone companies. You might pay for your phone or even get it “unlocked” so you can run it on a different network, but you won’t get the custom menus of your telco removed. Most phones are designed to meet the needs of telcos not users and I doubt that secure Internet banking is a priority for a telco.
Update: You can buy unlocked mobile phones. But AFAIK the Android is the only phone which might be described as not being designed for the needs of the telcos over the needs of the users. So while you can get a phone without custom menus for a telco, you probably can’t get a phone that was specifically designed for what you want to do.
The Scope of the Problem
Mobile phones are not the extent of the problem, I think that anyone who buys a PC from a Chinese manufacturer and doesn’t immediately wipe the hard drive and do a fresh OS install is taking an unreasonable risk. The same thing goes for anyone who buys a PC from a store where it’s handled by low wage employees, I can imagine someone on a minimum income accepting a cash payment to run some special software on every PC before it goes out the door – that wouldn’t be any more difficult or risky than copying customer credit card numbers as some employees do (a reasonably common crime).
It’s also quite conceivable that any major commercial software company could have a rogue employee who is deliberately introducing bugs into its software. That includes Apple. If the iPhone OS was compromised before it shipped then the issue of browser security wouldn’t matter much.
I agree that having the minimum possible number of potential security weak points is a good idea. They should allow Opera Mini users to select that HTTPS traffic should not be proxied. But I don’t think that merely not using a proxy would create a safe platform for Internet banking. In terms of mobile phones most things are done in the wrong way to try and get more money out of the users. Choose whichever phone or browser you want and it will probably still be a huge security risk.
Harald Welte is doing some really good work on developing free software for running a GSM network [4]. But until that project gets to the stage of being widely usable I think that we just have to accept a certain level of security risk when using mobile phones.

Diagnosis
A few weeks ago I was referred to a specialist for the treatment of Carpal Tunnel Syndrome. I first noticed the symptoms in early January, it started happening at night with a partial numbness in the fingers of my left hand. I didn’t think much of it at first as it’s the expected symptom of sleeping in a position that reduces the blood flow. But when it kept happening with my left hand and never happening with my right and then started getting worse (including happening during the day) I sought medical advice.
The doctor asked me to bend my hand down (as if trying to touch my left elbow with the fingers of my left hand). Within about 10 seconds this caused numbness – numbness resulting from bending one’s wrist in this way (known as Phalen’s maneuver) is a major symptom of CTS.
Treatment
On Thursday I saw a specialist about this, she agreed with the GP’s diagnosis and made a wrist brace for me. She started by cutting off a length of a tube of elastic woven material (similar to a sock) and then cutting a thumb hole, that became the lining. Then to make the hard part she put a sheet of plastic in an electric saucepan (which had water simmering) until it started to melt and then used a spatula to fish it out. The melting temperature of the plastic wasn’t that high (it was soft at about 50C when she put it on my arm), it wasn’t at all sticky when it was partially melted, and it didn’t seem to conduct heat well.
After wearing the wrist brace non-stop for a few days I have noticed an improvement already. Hopefully I will make a full recovery in a month or so. I will probably have to wear a wrist brace when sleeping for the rest of my life, but that’s no big deal – it’s more comfortable to sleep with a wrist brace than a partially numb hand. I’ve also been prescribed a set of exercises to help remove scar tissue from the nerves. I haven’t done them much yet.
In terms of being annoying, the wrist-brace has 3mm diameter holes in a square grid pattern with a 25mm spacing. This doesn’t let much air through and in warm weather my arm gets sweaty and starts to itch. I’m thinking of drilling some extra holes to alleviate this – the part which makes my arm itch doesn’t need much mechanical strength. The only task which has been really impeded has been making peanut butter sandwiches, maybe it was making sandwiches not typing that caused CTS? ;) In any case I’m not giving up typing but I would consider giving up peanut butter sandwiches.
I really hope to avoid the surgical option, it doesn’t seem pleasant at all.
Other
One final thing to note is that Repetitive Strain Injury (RSI) is entirely different. RSI is a non-specific pain associated with repetitive tasks while CTS is a specific problem in one or two nerves where they go through the wrist. RSI apparently tends to reduce the effective strength of the afflicted limb, while milder cases of CTS (such as mine) cause no decrease in strength – of course a severe case of CTS results in muscle atrophy due to reduced nerve signals, but I shouldn’t ever get that. Many people think that RSI and CTS are essentially the same thing – I used to think that until a few weeks ago when I read the Wikipedia pages in preparation to seeing a doctor about it.
I want to obtain some of the plastic that was used to make my wrist brace, it could be really handy to have something that convenient for making boxes, containers, and supports for various things – among other things it doesn’t appear to generate static. The low melting temperature will prevent certain computer uses (the hot air that comes out of a cooling system for a high-end CPU would probably melt it), but it could probably be used to make the case for an Opteron system with careful design. I’m guessing that the cost of the plastic is a very small portion of the $150 I paid to the specialist so it shouldn’t be that expensive – and I’m sure it would be even cheaper if it wasn’t bought from a medical supply store. If I ever get time to do some work on an Arduino system or something similar then I will definitely try to get some of this plastic for the case.
Also the Wikipedia page has a picture of what appears to be a mass-produced wrist brace. I think that it might be improved by having the picture of the custom one that I wear added to the page. I unconditionally license the picture for free use by Wikipedia and others under the CC by attribution license. So if anyone thinks that a picture of my hand would improve Wikipedia then they can make the change.

The German supermarket chain Aldi recently had a special deal of a “wine-fridge” for $99. A wine fridge really isn’t that specialised for wine, it is merely a fridge that has a heater and is designed for temperatures in the 11C to 18C range. A good wine fridge will have special wood (or plastic if cheap) holders for wine bottles. A particularly cheap wine fridge (such as the one from Aldi) doesn’t even have special holders for wine bottles. However, this does make it more convenient for storing other things.
In the hotter days in summer outside temperatures of over 35C are common and it’s possible for an uncommonly hot day to be in excess of 45C. My home air-conditioning system is only able to keep the ambient temperature about 10C cooler than the outside temperature if there are a few hot days in a row.
According to the Wikipedia page the best chocolate is supposed to have type V crystals which melt at 34C [1]. So if the outside temperature is 45C then the temperature inside my home is almost guaranteed to be hot enough to melt chocolate. If I’m not at home (and therefore the air-conditioner is off) during a moderately hot day then it’s common to have a temperature of about 30C inside my house. The Wikipedia page also notes that moving chocolate between extremes of temperature can cause an oily texture and that storing chocolate in the fridge can cause a white discoloration. I’ve experienced these effects and find that they significantly decrease the enjoyment of chocolate.
So now I have a fridge in my computer room set to 16C because according to Wikipedia the ideal temperature range for storing chocolate is from 15C to 17C (the photo was taken shortly after turning it on and it hadn’t reached the correct temperature). Every computer room should have a fridge full of chocolate!
If my stockpile of chocolate reduces I may even put some wine in the fridge (I could probably fit some now if I organised the chocolate in a better way). But that depends on the supermarkets, if they have a special on Green and Black’s “Maya Gold” organic fair-trade chocolate then my fridge will become full again.