
Appropriate Talks about Porn

There is currently some discussion about a talk which used pornographic imagery and jokes to illustrate points about Ruby programming [1]. A similar event happened in 2006; here is the description of the event from the author, which includes an unreserved apology [2].

It seems to me that the current discussion, which focuses on what is inappropriate for a public lecture, approaches the problem from the wrong direction, as there is a vast range of inappropriate material. I suggest that a white-list of appropriate references to porn in lectures would instead be more effective – if nothing else it makes for a much smaller list. Here is a first draft of such a list:

  1. Pornographic web sites handle a lot of traffic. There are significant technical problems that need to be solved. A lecture from an employee of an Internet porn company which covers the solutions to those technical problems would be of interest to many system administrators. Of course such a lecture should not promote the Internet porn company or show any samples of their products.
  2. Digital processing of images is an interesting topic. Having a digital editor from a magazine such as Vogue describe in detail how they do their job would be really interesting. There is a lot of overlap between the range of pictures displayed in Vogue and those displayed on porn sites. Having an employee of a porn company demonstrate how they touch up the picture of a fully clothed model would be an interesting technical topic, but of course it would be totally inappropriate to make any specific mentions of how the parts of the picture which are not PG-13 rated are edited. Even showing a picture of a porn star might be controversial, but I’m sure that the same work could be reproduced with a photo of someone who has a more respectable career. Touching up a picture of RMS to make him look like a politician would make for a challenge for the presenter and an interesting lecture.
  3. The image known as Lenna is a photograph of a Playboy model named Lena that is widely used to test image compression [3]. While the image remains controversial, it seems to me that it would be impossible to give a complete and factual account of the history of image compression without mentioning it.
  4. The police have great discretionary powers to determine which crimes should be actively investigated. Senior police decide how many resources to assign to each case. I believe that in many jurisdictions the police will assign a much lower priority to a hacking case if the victim is running a porn service. Rumor has it that porn sites put a lot more effort into system security than most Internet services, partly due to not having as much protection from the police as other industries and partly because their customers don’t want to be identified. I would be very interested in attending a presentation about practical computer security by a system administrator from a porn site. As an aside I’m always interested in talking to people who do security work, so I would like to have a chat with someone who does such work for a porn site.
  5. A few years ago I attended a lecture about the security implications of porn surfing. It had some scary statistics on the number of porn sites that try to deploy malware on the computers of the people who view porn, and it made a good business case for banning porn at work without reference to HR issues (which is very relevant for the jurisdictions where viewing porn at work is not considered to be a social problem). I would like to see a new version of that talk with statistics based on more recent research; my theory is that modern porn sites are more toxic than the old ones due to the general increase in criminal activity on the net – but I have no evidence to support it.

Of course in all cases jokes about porn are not acceptable; mentions of porn need to be strictly on the basis of historical analysis or the description of technical and legal issues which are relevant to the audience. Delivering a talk about porn without inappropriate jokes would take a great deal of effort, but it can be done (and I’ve seen it done once). For these five cases (and the few others that will probably be suggested in comments) it would probably be best if the conference organisers viewed the talk first to ensure that there was no misunderstanding about what is appropriate.

I think that comparing a short list of specific cases where porn can reasonably be mentioned in a public lecture with the vast number of potential inappropriate references illustrates the probability of a random porn reference being acceptable. The probability of making a random porn reference that is appropriate is probably slightly less than that of winning the lottery.


Australian Democracy is “Microsoft Compatible”

Here is the Australian Electoral Commission documentation on how to register a political party [1]. It includes the requirement for "A Microsoft compatible electronic membership list (and paper copy) providing the following information".

So a prerequisite for registering a political party appears to be the ownership of a PC running Windows. While it may be the case that I could create a plain text file on a Linux machine and append a CR character to each line, or create a CSV format spread-sheet/database file, the most common interpretation of this is likely to be that MS-Office is required.
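For what it’s worth, producing such a file without any Microsoft software is trivial. Here is a minimal sketch in Python; the field names are my own invention for illustration, not the AEC’s actual requirements:

```python
import csv

# Write a membership list as CSV with DOS (CRLF) line endings, which any
# program claiming "Microsoft compatibility" should be able to read.
# The fields below are illustrative, not the AEC's required list.
members = [
    {"name": "Jane Citizen", "address": "1 Example St", "state": "VIC"},
    {"name": "John Voter", "address": "2 Sample Rd", "state": "NSW"},
]

with open("members.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "address", "state"],
                            lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(members)
```

Any spread-sheet program, free or proprietary, will open the result.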

Such blatant promotion of a software vendor in a government document is unacceptable. Anyone who wishes to use other software for their political activities should be permitted to do so without restriction.


Real-World Car Safety Tests

The car safety tests that are required for every new mass-market passenger vehicle are flawed in many ways. Here is a list of the most obvious flaws (please point out any that I’ve missed):

  1. There has been no research to make accurate crash-test dummies to represent women and children, and I believe that there has been no research to make crash-test dummies to accurately represent people of racial groups that are not common in the US. Basically the medical research used to make crash test dummies was performed on male cadavers that were readily available in the US.
  2. The standard tests involve a direct collision with a centrally targeted stationary object, a direct collision with an offset stationary object, and solid objects (representing cars) hitting the vehicle from the rear and the side. These simulate crashes where there is little or no attempt made to avoid the collision; they are probably really good for protecting drunk drivers. But any sane and sober driver is probably going to make some effort to avoid the collision and the resulting impact will not be at a multiple of 90 degrees. Note that when a car directly hits the side of a moving car it is quite different to hitting the side of a stationary car (which is what is tested).
  3. There are no standard tests for the probability of a vehicle rolling in the event of a crash or of what would happen to the occupants if it was to roll. Rollover crashes are among the most dangerous…
  4. The tests do not take into account the ability of the driver to avoid a crash or minimise the damage. The ability to avoid crashes is a major advantage for cars with a low center of gravity, AWD, and traction control. It’s a major problem for vehicles with a high center of gravity and with tires that are not designed for road use (i.e. 4WD/SUV vehicles).

But generally the crash-test results are of some use provided that you start by looking at the results from vehicles that have good safety features such as the Audi Quattro, the AWD version of the VW Passat, a Mercedes with 4MOTION, or any other vehicle with constant four wheel drive, road tires, four wheel traction control, and a low center of gravity.

The RACV (the main car owners advocacy organisation in Victoria and also a major car insurance company) [1] has published the used car safety ratings report [2]. This was produced by the Monash University Accident Research Centre and is based on the analysis of 3,000,000 crashes reported to police in Australia and New Zealand. Results are only available for cars which have been in common use on Australian and New Zealand roads for some time (so there aren’t many entries for vehicles that are less than 5 years old or for particularly expensive vehicles).

The report also includes estimates on the purchase prices of some of the safest vehicles. A vehicle that is significantly better than average can be purchased for as little as $5000!

Now if you want to buy a new vehicle then choosing the latest version of a model that has rated well on the used-car tests should be safe if the new car crash tests also report good results. It seems likely that the latest Mazda 6 or VW Passat will also rate well on the used-car tests in a few years time. It’s a pity that the report didn’t note which of the vehicles that rated well have models that have good features to avoid collisions such as EBA, ABS, AWD, and traction-control.

A friend who is active in the free software community recently had a very lucky escape from a significant crash. From his description I doubt that car safety features had much to do with him escaping without injury, I think that it was mostly luck. While his car did have a good range of safety features (and was rated well on the used-car tests), a high-speed collision that involves a truck can easily result in a car being squashed flat. I have already sent him the RACV link which he is using as part of the process to decide what new car to purchase. But I think that this information needs to be spread more widely.

I have not searched for information on such analysis of crashes being performed in other countries, please leave a comment if you know of any good research that will be useful for other people. One thing to note however is that given the global scope of car manufacturing, results from one country will have some validity in others. I expect that a VW Passat that is sold in Germany or the US will be almost identical to the Australian version.


Google Server Design

Cnet has an article on the design of the Google servers [1]. It seems that their main servers are 2RU systems with a custom Gigabyte motherboard that takes only 12V DC input. The PSUs provide 12V DC and each system has a 12V battery backup to keep things running before a generator starts in the event of a power failure. They claim that they get better efficiency with small batteries local to the servers than with a single large battery array.

From inspecting the pictures it seems that the parts most likely to fail are attached by velcro. The battery is at one end, the PSU is at the other, and the hard disks are at one side. It appears that it might be possible to replace the PSU or the battery while the server is operational and in the rack.

The hard disks are separated from the motherboard by what appears to be a small sheet of aluminium, which appears to give two paths for air to flow through the system. The thermal characteristics of the motherboard (CPUs) and the hard drives are quite different, so having separate air flows seems likely to allow warmer air to be used in cooling the system (thus saving power).

Google boast that their energy efficiency now matches what the rest of the industry aims to do by 2011!

The servers are described as taking up 2RU, which gives a density of one CPU per RU. This surprised me as some companies such as Servers Direct [2] sell 1RU servers that have four CPUs (16 cores). Rackable Systems [3] (which just bought the remains of SGI) sells 2RU half-depth systems (which can allow two systems in 2RU of rack space) that have four CPUs and 16 cores (again 4 CPUs per RU). Rackable Systems also has a hardware offering designed for Cloud Computing servers; those CloudRack [4] systems have a number of 1RU trays, and each CloudRack tray can have as many as two server boards, each with two CPUs (4 CPUs in 1RU), and 8 disks.

While I wouldn’t necessarily expect that Google would have the highest density of CPUs per rack, it did surprise me to see that they have 1/4 the CPU density of some commercial offerings and 1/8 the disk density! I wonder if this was a deliberate decision to use more server room space to allow slower movement of cooling air and thus save energy.
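The density figures above can be checked with simple arithmetic. Note that the two-disks-per-Google-server figure is my inference from the 1/8 ratio, not something stated in the Cnet article:

```python
# CPU and disk density per rack unit for the systems mentioned above.
google_cpus_per_ru = 2 / 2            # 2 CPUs in each 2RU Google server
servers_direct_cpus_per_ru = 4 / 1    # 4 CPUs in a 1RU server
rackable_cpus_per_ru = (2 * 4) / 2    # two half-depth 4-CPU systems in 2RU
cloudrack_cpus_per_ru = (2 * 2) / 1   # two 2-CPU boards per 1RU tray
cloudrack_disks_per_ru = 8 / 1
google_disks_per_ru = 2 / 2           # assumed: 2 disks per 2RU server

print("CPU density ratio:", cloudrack_cpus_per_ru / google_cpus_per_ru)
print("disk density ratio:", cloudrack_disks_per_ru / google_disks_per_ru)
```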

It’s interesting to note that Google have been awarded patents on some of their technology related to the batteries. Are there no journalists reading the new patents? Surely anyone who saw such patents awarded to Google could have published most of this news before Cnet got it.

Now, I wonder how long it will take for IBM, HP, and Dell to start copying some of these design features. Not that I expect them to start selling their systems by the shipping crate.


Why Cyrus Sucks

I’m in the middle of migrating a mail server away from the Cyrus mail store [1]. Cyrus provides a POP server, an IMAP server, and a local delivery agent (accepting mail via LMTP). It is widely believed that Cyrus will give better performance than other mail stores, but according to a review by linux-magazin.de Dovecot and Courier deliver comparable (and sometimes better) performance [2].

The biggest problem with Cyrus is that it is totally incompatible with the Unix way. This wouldn’t be a problem if it would just work and display reasonable error messages when it failed, but it doesn’t. It often refuses to work as desired, gives no good explanation, and its data structures can’t be easily manipulated. Dovecot [3] and Courier [4] use the Maildir++ format [5], as do many other programs. I have set up a system with Courier Maildrop and Dovecot for the IMAP server [6] and it works well – it’s good to have a choice! Maildir++ is also reasonably well documented and is an extension to the well known Maildir format. This means that it’s easy to manipulate things if necessary: I can use mv to rename folders and rm to remove them.

Cyrus starts with a database (Berkeley DB file) of all folders in all mailboxes. Therefore it is not possible to move a user from one back-end server to another by merely copying all the files across and changing the LDAP (or whatever else contains the primary authentication data) to point to the new one. It also makes it impossible to add or remove folders by using maildirmake or rm -rf. The defined way of creating, deleting, and modifying mailboxes is through IMAP. One of the problems with this is that copying a mailbox from one server to another requires writing a program to open IMAP connections to both servers at once (tar piped through netcat is much faster and easier). Also if you need to rename a mailbox that contains many gigabytes of mail then it will be a time consuming process (as opposed to a fraction of a second for mv).

Cyrus has a tendency to break, while Dovecot is documented as being self-healing and also seems to cope well in the face of a corrupted mail store. Even manually repairing problems with Cyrus is a painful exercise. The Cyrus mail store is also badly designed – and its design was worse for older filesystems (which were common when it was first released) than it is for modern ones. The top level of a Cyrus mail store contains all the messages in the INBOX stored one per file, as well as three files containing Cyrus indexes and sub-directories for each of the sub-folders. So if I want to discover what sub-folders a mailbox has then I can run ls and wait for it to stat every file in the directory, or I can use an IMAP client (which takes more configuration time). In a Maildir++ store, by contrast, every file that contains a message is stored in a folder subdirectory named "new", "cur", or "tmp", which means that I can run ls on the main directory of the mail store and get a short (and quick) result. Using tools such as ls to investigate the operation of a server is standard practice for a sysadmin, it should work well!

A final disadvantage is that Cyrus seems to have many small and annoying bugs (such as the reconstruct program not correctly recursing into sub-folders). I guess it’s because not many people use Cyrus that such things don’t get fixed.

One trivial advantage of Cyrus is that by default it splits users into different sub-directories based on the first letter of the account name. Dovecot supports using a hash of the user-name instead; this is better than splitting by first letter for performance (it gives a more even distribution) but will make it slightly more difficult to manipulate the mail store by script. Ext3 can give decent performance without a two level directory structure for as many as 31,998 sub-directories (the maximum that it will support) due to directory indexing and Linux caching of dentries. There may be some other advantages of Cyrus, but I can’t think of them at the moment.
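To illustrate the difference between the two schemes, here is a sketch; the particular hash used is illustrative only, it is not Dovecot’s actual algorithm:

```python
import hashlib

def first_letter_dir(user):
    # Cyrus-style: one sub-directory per initial letter, so "smith",
    # "stevens", and "sanderson" all land in the same directory.
    return user[0].lower()

def hashed_dir(user, buckets=64):
    # Dovecot-style: hash the user-name to pick a bucket, giving a much
    # more even distribution across sub-directories.
    h = int(hashlib.md5(user.encode()).hexdigest(), 16)
    return "%02x" % (h % buckets)

for u in ["smithj", "smithk", "sanderson", "jones", "taylor"]:
    print(u, first_letter_dir(u), hashed_dir(u))
```

The down-side for scripting is obvious: with the hash scheme you can’t guess a user’s directory without reimplementing the hash.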

Here is a script I wrote to convert Cyrus mail boxes to Maildir++. To make this usable for a different site would require substituting a different domain name for example.com (or writing extra code to handle multiple domains) and inserting commands to modify a database or directory with the new server name. There is no chance of directly using this script on another system, but it should give some ideas for people performing similar tasks.
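The core of such a conversion can be sketched as follows. This is an illustration of the idea only, not the script referred to above; the Maildir file naming and the decision to mark converted messages as seen are my own assumptions:

```python
import os
import shutil
import time

def cyrus_to_maildirpp(cyrus_dir, maildir):
    """Copy a Cyrus folder (numbered message files such as "1.", with
    sub-directories for sub-folders) into a Maildir++ directory."""
    for d in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(maildir, d), exist_ok=True)
    seq = 0
    for name in sorted(os.listdir(cyrus_dir)):
        path = os.path.join(cyrus_dir, name)
        if os.path.isdir(path):
            # Sub-folders become ".folder" directories in Maildir++.
            cyrus_to_maildirpp(path, os.path.join(maildir, "." + name))
        elif name.endswith("."):
            # Cyrus message files are named "1.", "2.", etc.  Give each a
            # unique Maildir-style name and mark it as seen (":2,S").
            seq += 1
            dest = os.path.join(
                maildir, "cur",
                "%d.%d_%d.converted:2,S" % (time.time(), os.getpid(), seq))
            shutil.copy2(path, dest)
        # cyrus.* index files are skipped - Dovecot rebuilds its own indexes.
```

The real work at any site is in the surrounding glue: mapping Cyrus mailbox names to accounts and updating whatever database points clients at the server.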


Maildrop, IMAP, and Postfixadmin

I have recently configured my mail server to use IMAP. I started doing this when I was attending Linux.conf.au so that I could read urgent mail using my EeePC while at the conference and then be able to deal with the more complex stuff using my laptop later on.

The next logical step is to have mail delivered to different folders in the IMAP account. While there are ways of doing this via the Subject and other header fields, my needs are not that great. All I need to do is to support user+extension@example.com going to a folder named extension in the user’s mail store. While changing my mail server I decided to install Postfixadmin at the same time.
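The address handling itself is trivial; here is a sketch of the splitting that Postfix and Maildrop perform between them, using "+" as the recipient delimiter:

```python
def split_recipient(address):
    """Split "user+extension@example.com" into (user, extension, domain).
    The extension is empty when no "+" delimiter is present."""
    local, _, domain = address.partition("@")
    user, _, extension = local.partition("+")
    return user, extension, domain

print(split_recipient("russell+sysadmin@example.com"))
# -> ('russell', 'sysadmin', 'example.com')
```

In the real setup Postfix does this split itself (controlled by recipient_delimiter) and passes the pieces to maildrop as ${user} and ${extension}.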

My first attempt to use Maildrop was to put the following in the /etc/postfix/main.cf file:
mailbox_command = /usr/bin/maildrop -d mail -f "$SENDER" "$DOMAIN" "$USER" "$EXTENSION"

That seems to only work when you have local accounts, so I ended up setting fallback_transport = maildrop and then putting the following in /etc/postfix/master.cf:

maildrop unix - n n - - pipe flags=DRhu user=vmail argv=/usr/bin/maildrop -d vmail ${nexthop} ${user} ${extension}

Where vmail is a Unix account I created for storing mail. Then I added the following to /etc/postfix/main.cf. Some of these are probably redundant (such as the virtual_mailbox_base). The recipient limit is set to 1 because there are no command-line parameters for maildrop to support two recipients for the same message.
virtual_alias_maps = mysql:/etc/postfix/mysql_virtual_alias_maps.cf
virtual_mailbox_domains = mysql:/etc/postfix/mysql_virtual_domains_maps.cf
virtual_mailbox_maps = mysql:/etc/postfix/mysql_virtual_mailbox_maps.cf
virtual_gid_maps = static:2000
virtual_uid_maps = static:2000
virtual_mailbox_base = /mail
vmaildir_destination_recipient_limit = 1
virtual_transport = maildrop
maildrop_destination_recipient_limit = 1

The files /etc/postfix/mysql* all have fields user=, password=, hosts=, and dbname=. The queries in each of the files are as follows:
mysql_virtual_alias_maps.cf:query = SELECT goto FROM alias WHERE address='%s' AND active = 1
mysql_virtual_domains_maps.cf:query = SELECT domain FROM domain WHERE domain='%s'
mysql_virtual_mailbox_maps.cf:query = SELECT maildir FROM mailbox WHERE username='%s' AND active = 1

The /etc/courier/maildroprc file has the following contents:

# log the maildrop actions
logfile "/var/log/maildrop.log"
#
# parameters 1, 2, and 3 are the domain, user, and extension
DOMAIN=tolower("$1")
USER=tolower("$2")
EXTENSION=tolower("$3")
DEFAULT="/mail/$DOMAIN/$USER"
#
# try making a backup (cc) copy of the mail but do not abort if it fails
exception {
  cc "$DEFAULT/.backup/"
}
#
# try delivering to the extension folder but do not abort if it fails
exception {
  if(length("$EXTENSION") != 0 && "$EXTENSION" ne "backup")
  {
    to "$DEFAULT/.$EXTENSION/"
  }
}
#
# deliver to the main inbox if there is no folder matching the extension or if no extension is specified
to "$DEFAULT/"

Installing Postfixadmin [1] was another challenge entirely. One of the complications of this is that there is no Debian package for Lenny (it seems that there will be one in Squeeze – Lenny+1).

I found David Goodwin’s tutorial on installing Postfixadmin and lots of other things on Debian/Etch [2] to be a very useful resource. I look forward to seeing a Lenny version of that document.

Please let me know if you can think of any way to improve this.

Employment Packages

Paul Wayper has said that he only wants to work for companies that will send him to LCA [1]. While that criterion is quite reasonable it seems overly specific. Among other things the varying location of LCA will result in the expense for the employer varying slightly year by year – which employers generally don’t like.

I believe that a better option is to have an employment package that specifies a certain amount of money (related to the gross income of the employee) should be set aside for training, hardware, or other expenses that help the employee (or their colleagues) do their job. Such an option would probably only be available to senior employees who are most able to determine the most effective way of spending the money.

For example an employee who earns $100,000 per annum might be permitted to assign 10% of their income ($10,000) to training or hardware that assists their job. Modern PCs are so cheap that any reasonable hardware requirements could fit within that budget with ease.

There are several benefits to such a scheme. On many occasions I have had colleagues who had inadequate hardware to do their work; slow PCs with small screens really impacted their productivity. In such situations buying a $400 PC and a $400 monitor for each person in the team could make a significant direct improvement to productivity, before the impact on morale even kicked in!

Some years ago Lenovo ran some adverts for Thinkpads which said “demand one at the interview”. That made sense when a Thinkpad was an expensive piece of hardware. While there are still some expensive Thinkpads, there is a good range of cheap models, two which cost less than $1200AU and another eight which cost between $1200 and $1800. Now it makes more sense to allow each employee to choose their own hardware (both desktop and portable) and not even bother about issues such as whether the IT department blesses them. As Tom Limoncelli suggested in his LCA keynote, users are going to take more control over their environment whether the IT department like it or not, so it’s best to work with them rather than fighting them.

For training a common problem is that management can’t correctly determine which conferences are worthy of the expense of sending their technical staff. Then when a conference is selected they send everyone. It seems to me that when there are a number of conferences in a region (e.g. AUUG, LCA, OSDC, and SAGE-AU in Australia) there is a benefit in having someone from your team attend each one. Planning at the start of the year which conferences will be attended by each team member is something that appears to be beyond the ability of most managers as it requires knowing the technical interests and skill areas of most staff. If each employee was granted one week of paid time per year to attend conferences and they could determine their own budget allocation then they would be able to work it out themselves in a more effective manner.

Per-process Namespaces – pam-namespace

Mike writes about his work in using namespaces on Linux [1]. In 2006 I presented a paper titled “Polyinstantiation of directories in an SE Linux system” about this at the SAGE-AU conference [2].

Newer versions of the code in question have been included in Debian/Lenny. So if you want to use namespaces for a login session on a Lenny system you can do the following:
mkdir /tmp-inst
chmod 0 /tmp-inst
echo "/tmp /tmp-inst/ user root" >> /etc/security/namespace.conf
echo "session required pam_namespace.so" >> /etc/pam.d/common-session

Then every user will have their own unique /tmp and be unable to mess with other users.

If you want to use the shared-subtrees facility to have mount commands (other than those affecting /tmp) propagated to other sessions then you need to have the following commands run at boot (maybe from /etc/rc.local):
mount --make-shared /
mount --bind /tmp /tmp
mount --make-private /tmp

The functionality in pam_namespace.so to use the SE Linux security context to instantiate the directory seems broken in Lenny. I’ll write a patch for this shortly.

While my paper is not particularly useful as documentation of pam_namespace.so (things changed after I wrote it), it does cover the threats that you face in terms of hostile use of /tmp and how namespaces may be used to solve them.


Gmail and Anti-Spam

I have just received an email with a question about SE Linux that was re-sent due to the first attempt being blocked by my anti-spam measures. I use the rfc-ignorant.org DNSBL services to stop some of the spam that is sent to me.

The purpose of rfc-ignorant.org is to list systems that are run by people who don’t know how to set up mail servers correctly. But the majority of mail that is blocked when using them comes from large servers owned by companies that are big enough to almost certainly employ people who know the RFCs (or who could hire such people for a trivial fraction of their budget). So it seems more about deliberately violating the standards than ignorance.

The person who sent me the email in question said "hopefully, Google knows how to make their MTA compliant with RFC 2142", but such hope is misplaced, as a search for gmail.com in the rfc-ignorant.org database shows that it is listed for not having a valid postmaster address [1]. A quick test revealed that two of the Gmail SMTP servers accept mail for the postmaster account (or at least they don’t give an error response to the RCPT TO command that is referenced in the complaint). However Gmail administrators have not responded to the auto-removal requests, which suggests that postmaster@gmail.com is a /dev/null address.
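That quick test is easy to reproduce: connect to the MX, give a null sender, and see whether RCPT TO:<postmaster@…> is accepted. A sketch using Python’s smtplib (running this against servers you don’t operate may be seen as probing, so the real connection is commented out):

```python
import smtplib

def postmaster_accepted(server, domain):
    """Return True if the given SMTP connection accepts mail for
    postmaster@domain (RFC 2142 requires that it does)."""
    server.helo("tester.example.com")
    server.mail("<>")  # null sender, as a bounce message would use
    code, _ = server.rcpt("<postmaster@%s>" % domain)
    server.quit()
    return 200 <= code < 300

# Usage (commented out because it opens a real network connection):
# conn = smtplib.SMTP("gmail-smtp-in.l.google.com", 25)
# print(postmaster_accepted(conn, "gmail.com"))
```

Of course a 250 response only proves the address is accepted, not that anyone reads it – which is exactly the complaint here.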

However that is not a reason to avoid using Gmail. Some time ago Gmail took over the role of “mail server of last resort” from Hotmail. If you have trouble sending email to someone then using a free Gmail account seems to be the standard second option. Because so many people use Gmail and such a quantity of important mail is sent through that service (in my case mail from clients and prospective clients) it is not feasible to block Gmail. I have whitelisted Gmail for the rfc-ignorant.org tests and if Gmail starts failing other tests then I will consider additional white-lists for them.

Gmail essentially has a monopoly of a segment of the market (that of free webmail systems). They don’t have 100%, but they have enough market share that it’s possible to ignore their competitors (in my experience). When configuring mail servers for clients I make sure that whatever anti-spam measures they request don’t block Gmail. As a rule of thumb, when running a corporate mail server you have to set up anti-spam measures to not block the main ISPs in the country (this means not blocking Optus or Telstra BigPond for Australian companies) and not block Gmail. Not blocking Yahoo (for “Yahoo Groups”) is also a good thing, but I have had a client specifically request that I block Yahoo Groups in the past – so obviously there is a range of opinions about the value of Yahoo.

Someone contacted Biella regarding an email that they couldn’t send to me [2]. I have sent an email to Biella’s Gmail account from my Gmail account – that should avoid all possibility of blocking. If the person who contacted Biella also has a Gmail account then they can use that to send me email to my Gmail account (in the event that my own mail server rejects it – I have not whitelisted Gmail for all my anti-spam measures and it is quite possible for SpamAssassin to block mail from Gmail).

It turns out that the person in question used an account on Verizon’s server, according to rfc-ignorant.org Verizon have an unusually broken mail server [3].

If your ISP is Optus, BigPond, Verizon, or something similarly broken and you want to send mail to people in other countries (where your ISP is just another annoyance on the net and not a significant entity that gets special treatment) then I suggest that you consider using Gmail. If nothing else then your Gmail account will still work even after your sub-standard ISP “teaches you a lesson” [4].


Flash Storage and Servers

In the comments on my post about the Dell PowerEdge T105 server [1] there is some discussion of the internal USB port (which allows the use of a USB flash device for booting which is connected inside the case).

This is a really nice feature of the Dell server and something that would be useful in more machines. However I believe that it would be better to have flash storage with a SATA interface on the motherboard. The cost of medium size flash storage (e.g. 4G) in the USB format is not overly great; if soldered to the motherboard or connected to a daughter-board the incremental price for the server would be very small. Dell servers shipped with a minimum of an 80G SATA disk last time I checked, so it seems quite likely to me that Dell could reduce the prices of their servers by providing flash storage on the motherboard and having no hard disk.

It seems likely to me that there is a significant number of people who don’t want the default hard drive that ships with a Dell server. The 80G disk that came with my PowerEdge is currently gathering dust on a shelf; it was far too small to be of any use in that machine, and Dell’s prices for bigger disks were outrageous, so I replaced the default disk with a pair of big disks as soon as the server had passed some basic burn-in tests. Most servers that I run fall into one of two categories: machines which primarily do computation tasks and need little storage space or IO capacity (in which case 4G of flash would do nicely) and machines which have databases, backups, virtual machine images, and other big things (in which case anything less than 160G is silly and less than 500G makes no economic sense in today’s market). Note that for a machine with small storage requirements I would rather have a 4G flash device than an 80G disk: I am inclined to trust flash not to die, but not a single disk, and two 80G disks would mean more noise, heat dissipation, and expense.

According to comments on my previous post VMWare ESX requires a USB boot device, so if VMWare could be booted from a motherboard based flash device then that would be an ideal configuration for VMWare. In some mailing list discussions I’ve seen concern raised about the reliability of permanently connected USB devices; while I’ve only encountered USB problems related to buggy hardware and drivers, other people have had problems with the electrical connection. So it seems that motherboard based flash could be expected to increase the reliability of VMWare servers.

The down-side to having flash permanently attached to the motherboard is of course the impossibility of moving the boot device to different hardware. In terms of recovering from failure restoring a few gig of flash storage from backup is easy enough. The common debugging option of connecting a hard drive to another machine to fix boot problems would be missed, but I think that the positive aspects of this idea outweigh the negative – and it would of course be an option to not boot from flash.

If anyone knows of a tower server that is reasonably quiet, has ECC RAM and a usable amount of flash storage on the motherboard (2G would be a bare minimum, 4G or 8G would be preferred) then please let me know.