Tom’s Hardware falls victim to a trojan

E-Week has an article about the popular computer hardware review site Tom’s Hardware (tomshardware.com) being hit by a trojan in a banner advert.

From the article it’s not clear whether a criminal paid for a banner advert under a legitimate business name or compromised the advertising server run by an innocent third-party who paid for advertising on Tom’s Hardware.

But really it doesn’t matter very much for users. What is clear is that Tom’s Hardware is a very reputable site (one that I personally visit regularly and recommend highly) that apparently did nothing wrong. Yet Windows users who visited the site without the latest patches applied had their systems compromised (and presumably used for other criminal activity). Apparently a patch for the bug in question had been released a month earlier.

One thing that has to be noted is that large corporations often don’t apply patches immediately. Spending a month testing a patch before deploying it widely is not uncommon in an enterprise environment. The general thinking in an enterprise is that employees are almost always prohibited from visiting porn sites, and often prohibited from using forums and webmail services, so the risk of attack is dramatically reduced. Now there is evidence that even the most reputable sites run by competent sys-admins can be vulnerable to such attacks.

One possible method of alleviating such attacks would be to have sites that are supported by advertising also allow ad-free subscriptions. So if an enterprise wanted to use a site such as Tom’s Hardware without the risk of advert-based attack then they could pay for an advert-free subscription. I’m sure that it would be easy for an enterprise to pay Tom’s Hardware more money than the site would ever be likely to get from showing adverts to the employees of that company, while still not having any impact on the IT training budget.

But the best solution is that a Windows machine that is used for main desktop work should not be used for web browsing (to any sites). A Linux or Mac OS X desktop machine could be used for such web browsing with less risk due to having fewer security holes in the OS. Another option is to use VMware, Xen, or another virtualisation technology to run a virtual machine for web browsing, which makes it a lot harder for an attacker to break out and compromise the main environment.

career risks

Paul Graham makes some interesting observations about taking risks to achieve career benefits.

One thing he doesn’t mention is that the risks have to match your life situation. If you are 21, living with your parents, and single (typical for a CS graduate) then you should take the riskiest options in terms of your career (apart from working in Iraq of course). If you don’t have much money then you don’t have much to lose. If you live with your parents then you still have accommodation and food even if you have no money. If you have no dependents (SO or children) then there’s nothing compelling you to earn a certain income.

When you get older you may get a mortgage, a SO, and/or children. Also you won’t live with your parents forever. Most career risks that you might want to take aren’t possible if you leave them too late.

Finally if you do something risky such as starting your own company and it doesn’t work out then it’s still going to look good on your CV. If you already have a lot of experience in the industry then the CV improvement may not be worth the time and effort invested in an unsuccessful company.

When I was 22 I (along with two business partners) started an Internet cafe. It went reasonably well (by the standards of small businesses) and lasted for a few years before cheap net access at home killed most of the business. At the time the cafe had to close, the ISP side of the business was doing reasonably well and one of my partners bought the operating ISP business. This buy-out meant that I approximately broke even on the entire business, which is a lot better than most small businesses do. When I was 26 I moved to London (I have dual nationality, UK and Australian). The experience I had gained from running my own business allowed me to immediately get contract work for large ISPs in Europe.

Most of the risks in my career were ones that I took while living with my parents. At the time I didn’t think through the issues of mortgages etc; my thinking was mostly along the lines of “it could work, I’m bored, so why not?”. ;)

Update: While in the process of writing this blog post I forwarded the URL of a dating service for scientists (sciconnect.com) to some friends. The main page has pictures of single people wearing lab coats and using laptops which I found amusing. I have no idea whether it’s a good service or not, but the pictures on the main page made it worth a look. It seems that I accidentally pasted the wrong URL into my blog post so people who were looking for the Paul Graham article ended up at the dating service instead. But I guess if you are the type of person who reads my blog and who is interested in a link to Paul Graham’s blog and you happen to be single then a dating service for scientists might be of some interest.

Thanks to MJ Ray for pointing out my error.

terrorist actions I want banned

The current trend in government seems to be to do whatever they want because to do otherwise invites (or fails to prevent) terrorism.

Here are some things that might be done by terrorists which governments should consider banning:

Graffiti – could be used by terrorists to mark locations for attacks or send messages to sleeper cells. It’s already illegal but that doesn’t seem to stop anyone. Send the graffiti “artists” to the same places that they send illegal immigrants.

Spitting in public – could be used for biological warfare (it’s effective at spreading disease).

Putting feet on seats of public transport. Shoes have been used for smuggling explosives on to commercial airline flights and could be used for bio-warfare.

Sticking gum underneath chairs. This is an obvious risk for bio-warfare.

Governments and corporations are banning photography, prayer in airports, and speaking in languages other than English. It’s about time that they banned something that is actually bad.

Five ways SE Linux may surprise you

Frank Mayer of Tresys has written a great article on the techtarget.com site about SE Linux.

It seems mostly aimed at managers and novice users. It explains that SE Linux isn’t really that difficult to use, and that it is a foundation technology needed for secure systems.

Check it out!

permalinks in wordpress, Apache redirection, and other blog stuff

When I first put my new blog online I didn’t think to set the custom permalinks option to avoid having /index.php in all URLs (which wastes a few bytes and looks nasty).

So I decided to change to better URLs but unfortunately many people have already bookmarked the bad URLs. I wanted to give an HTTP 301 redirection when someone uses the old index.php version (so that bookmarks get updated) and then redirect to the PHP file. Unfortunately having a redirection from ^/index.php to a version without it and then a local rewrite to include index.php again doesn’t seem to work (any advice would be appreciated). So I put the following in my /etc/wordpress/htaccess file (the location for such things in Debian) so that foo.php is used instead, where foo.php is a sym-link to index.php. I’m wondering whether I should file a bug report against the Debian package requesting that a sym-link be in the package to facilitate such things – if it’s not possible to do what I desire without the sym-link.

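RewriteEngine On
RewriteBase /
# Old settings: 301 redirect from /index.php/... to the short URL
#RewriteCond %{REQUEST_URI} ^/index.php/?(.*$) [NC]
#RewriteRule . /%1 [R=301,L]
# Serve robots.txt as-is and stop processing further rules
RewriteRule ^/?robots\.txt$ - [L]
# Pass anything that is not an existing file or directory to WordPress
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule . /foo.php%1 [L]
RewriteRule . /index.php%1 [L]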

Update: I am now using the permalink-redirect plugin (thanks for the tip Method) which solves the problem of the obsolete URLs as well as solving the problem of having two representations of the URL (with and without a trailing slash). I have updated the above htaccess file sample to reflect my new configuration (with the old settings commented out for the benefit of people who don’t want the permalink-redirect plugin).
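As a rough illustration of what the commented-out RewriteCond/RewriteRule pair in the sample is meant to do, here is a minimal Python sketch of the same URL mapping. The function name redirect_target is made up for this example, the dot in the pattern is escaped (the sample leaves it unescaped), and Apache's own processing has details that this does not model.

```python
import re

# Same pattern as the commented-out RewriteCond, with the dot escaped.
# re.IGNORECASE plays the role of the [NC] flag.
OLD_URL = re.compile(r"^/index\.php/?(.*$)", re.IGNORECASE)


def redirect_target(request_uri):
    """Return the 301 target for an old-style URL, or None if it is left alone.

    Hypothetical helper for illustration only; it is not part of WordPress
    or Apache.
    """
    match = OLD_URL.match(request_uri)
    if match is None:
        return None
    # Equivalent of "RewriteRule . /%1": keep whatever followed /index.php
    return "/" + match.group(1)


if __name__ == "__main__":
    for uri in ("/index.php/2007/05/some-post/", "/2007/05/some-post/"):
        print(uri, "->", redirect_target(uri))
```

A URL that already lacks the /index.php prefix gets None back, which corresponds to the rule not firing.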

The way WordPress allows the table prefix to be stored in the MySQL configuration section is very handy. Some time ago I asked for advice on a blog server for multiple users and WordPress-MU was recommended, but it seems that for most situations where you want multiple blogs the non-MU version of WordPress will do the job. It seems that the main benefit of WordPress-MU is that setting up multiple blogs doesn’t require running shell scripts, which for the cases I’m most interested in doesn’t compete with the benefit that the non-MU version has of being packaged in Debian.

On the topic of WordPress in Debian, it’s a pity that none of the plugins are packaged in Debian. I plan to create a repository for plugins and themes that I use if no-one else has started such a repository. I believe that a repository of Debian packages for such things will provide significant benefits to users, including updates for security reasons and having plugins that are known to work (some of the plugins appear to only work on Windows).

Also there are a few issues that I would like to improve in WordPress. One is that the Uncategorised category is selected by default so if I select another category and forget to de-select Uncategorised then it’s a little confusing. Another is that the categories are displayed in the side-bar without mentioning the number of matching posts. The way blogger lists the number of posts per category (and sorts the categories in order) is much more convenient. Also another advantage of blogger is the handling of archives where you can click on a month to see a list of the names of all posts in that month. I’m not about to go back, but it would be nice to have those features. Does anyone have any ideas how to solve these problems?

Update2:
I have added a rule to make robots.txt not redirect. Before adding this rule /robots.txt was redirected to /index.php/robots.txt which caused a WordPress page to load. This wasted a lot of bandwidth (robots.txt is hit often) and probably caused some spiders to ignore my site.

lemonup.com – pirates

The URL http://linuxresource.lemonup.com/ currently has a mirror of my blog. Disregarding the DMCA take-down notice I sent them a week ago (which is also mirrored on their own site) they have again copied the content from my site without permission (I only allow non-commercial use). But this time they go even further and claim copyright over my text!

This is going way too far. Now I’m going to ask their ISP to deal with them.

Update: Their site is now offline. Their ISP acted quite quickly and less than 3.5 hours after my complaint the entire site was offline (not only the section that had my posts). I suspect that it was the fact that they mirrored blog posts such as this one, which made it appear to be wilful infringement, that got such a fast response – but the only response I got from the ISP was to say that they would do what seemed right and not comment to me about it due to privacy reasons.

This is not an ideal outcome. I would much rather have had them respect my license terms without such measures. I only contacted their ISP because the first take-down request took four days to complete (after receiving a response on day 0 so it wasn’t four days of holiday for the operator) and because they then mirrored my site again under a different URL. I am still unsure of whether this was a genuine mistake (as claimed by the operator) due to lack of communication between multiple people involved in running the site, or whether they just didn’t think I would catch them.

I don’t have any malice towards the operators of lemonup. I have already offered some suggestions that may help them in future business ventures and would be happy to make some more suggestions if asked.

In response to a comment: the traditional meaning of the word piracy is violent acts at sea that don’t have state sponsorship. This usually involves armed robbery, but the main criterion is violence without state sponsorship. The slang use covers anything which goes against the wishes of a copyright holder.

school rating

The web site http://au.ratemyteachers.com/ allows Australian students to rate their teachers. Ratings are anonymous and give teachers a score out of 5 as well as allowing students to comment on teachers.

The Sydney Morning Herald has an article about the site that describes the actions that the NSW Department of Education and the NSW Teachers Federation are taking to block the site.

The solution to this however is really quite simple. There needs to be a formal method for students to rate their teachers which will be used when it comes time to give pay rises to good teachers and dismiss or transfer to non-teaching duties the teachers who can’t do their job.

I encourage students to submit essays and debate topics about the anonymous newspapers published in the Soviet Union and other repressive states, why they were necessary (because criticism of the government was prohibited) and why they were morally right (a system with no method of correction will inevitably do bad things). Then teachers will have a choice of supporting the actions of the Soviet Union or the use of ratemyteachers.com. It will be interesting to see which option they choose. I think that it’s most likely that they will take the hypocritical path and support anonymous newspapers in the Soviet Union while attacking such free speech in supposedly free countries.

It’s interesting that an article on the failures of Mentone Grammar has just been published. Maybe if Mentone had been listed on the ratemyteachers.com site the Taylors would not have made the mistake of sending their son there. Or maybe if the Mentone senior staff had been reading that site they would have been able to correct the problems before they became cause for a legal dispute.

DMCA etc

A few days ago I wrote my first DMCA take-down notice, following the instructions on the Wikipedia page. The reason for this was that someone was mirroring my blog and putting Google adverts on the copy. Before I started putting Google adverts on my web sites I wouldn’t have been bothered about this. But now that I’m making a small amount of money from Google advertising I don’t want someone else just mirroring my content and taking the money away from me.

The person who managed the site in question took a surprisingly large amount of time to comply with the request (a discussion of several messages plus a couple of reminders over the course of a few days).

The most recent news about DMCA abuse is the case of trying to prevent the distribution of a code used for decrypting HD-DVD. It is widely believed that copyright was used to prevent the distribution. Strangely many people who otherwise have a good understanding of technology have been saying “you can’t copyright a number”. What precisely is a program binary if not a long series of numbers (or a single large number depending on how you look at it)? For that matter a JPEG file or the ASCII representation of a book is also either a very large number or a series of small numbers. Also apparently it’s not protected under copyright but under the anti-circumvention clause of the DMCA.

If it was a matter of copyright it would not be an issue of whether a number can be copyrighted, but what defines such a number. One criterion for copyright is that it has to be on something non-trivial (EG I couldn’t copyright the use of “a few days ago” as an introduction) so length is a criterion. Another is that it has to be a creative expression (so an encryption key can’t be copyrighted). However in many jurisdictions there are separate laws regarding distributing passwords without permission. Such laws are designed for preventing people from granting unauthorised access to computers but I believe that they can be used more generally (I have been advised that such laws exist in the state of Pennsylvania in the US – I’m not sure what the law is in other regions but expect that something so useful would be copied).
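The point that any file is either a series of small numbers or one very large number is easy to demonstrate. The following is a minimal sketch in Python; the short ASCII string stands in for a program binary, JPEG, or book, and the variable names are just for illustration.

```python
# A short ASCII text stands in for any file (binary, JPEG, book...).
data = "a few days ago".encode("ascii")

# View 1: a series of small numbers, one per byte.
small_numbers = list(data)

# View 2: the same bytes read as a single big-endian integer.
as_number = int.from_bytes(data, "big")

# The big number converts back to exactly the same bytes.
length = (as_number.bit_length() + 7) // 8
round_trip = as_number.to_bytes(length, "big")

print(small_numbers)
print(as_number)
print(round_trip.decode("ascii"))
```

The round trip only loses information if the data starts with zero bytes, which is why real formats also record a length.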

Another breaking story is that the RIAA has created an organisation with a US government mandate to collect royalties on ALL music that is played over Internet radio. This includes music for which the copyright owner is not an RIAA member and does not consent to have the royalties applied. You can create your own music, grant free access to everyone out of philanthropy, and then have the RIAA tax the music!

It’s unfortunate that only the down-side of this dramatic change in copyright law has been discussed. Compulsory licenses have a lot of potential in other areas of copyright material. Recently people have been complaining that government sponsored scientific research is often only published in journals that cost large amounts of money. Why not have a compulsory license for journals at a fair price that everyone can afford? Software is often unreasonably expensive (Windows Vista with the latest version of MS Office can cost up to twice as much as a new PC), let’s have compulsory licenses for software at a reasonable fee! Software vendors often cease selling old versions of software to force customers to upgrade, a compulsory license scheme would permit us to buy MS-DOS 3.30 at a reasonable price regardless of whether MS wants to sell it.

Finally there is at least one evil cult that claims its “religious” texts are copyright as a way of preventing the public from seeing what a drug-addled second-rate sci-fi author produces. Let’s have a compulsory license for them so everyone can read them!

The only thing that’s wrong with the RIAA scheme is that there is no option for copyright owners to directly license their material to the users (including granting a free license if they so desire). The up-side of this is that it proves beyond all doubt that the RIAA is not representing copyright owners.

Update: I initially accepted the claims about the DMCA take-down notices being based on copyright rather than anti-circumvention. Since learning of my mistake I modified this post to reflect the fact that it was not a copyright issue.

LUG talks today

Today I gave three talks at my local LUG. The first was my latest SE Linux talk (I’ll put the notes online soon). The second was a talk about voting.

I asked for a show of hands from people who had already decided which party they will vote for at the next federal election (about 12 people put their hands up). I then asked people to put their hands down if they were not a member of the party that they intend to vote for. There were only two raised hands left in the room (including mine)!

With the way party politics works nowadays the major parties are not very interested in representing their core voters. Why try to please people who will vote for you anyway? Instead they try to appeal to swinging voters and pressure groups. If you have decided to vote for a party they have no reason to try and impress you. Therefore you should join the party and try and influence the policy decision making process from within.

The issues that I believe are most important to the Linux community are free software use in government, sane intellectual property laws, the right to a fair trial, and not pandering to the US (which is related to the previous two points).

If you have already decided who to vote for then you should join that party and make your vote count in the party room.

One member of the audience said that he had been a member of one of the major parties but that the internal politics turned him off. If that is your experience then I think you should ask yourself whether you want to vote for a group of people that you can’t work with.

The final talk I gave was about getting speakers for Linux Users’ Groups. There is always difficulty in finding speakers for clubs. Ideally we would have meetings planned a few months ahead of time so that they could be advertised in various ways. Newspapers often have columns dedicated to providing information about public meetings but the lead time is usually at least a week (and the meeting would have to be advertised at least two weeks in advance – so more than a month’s planning ahead is required).

Getting a larger number and variety of speakers will attract new members, encourage existing members to attend more meetings, and inspire members in their Linux work.

Talks can be given by almost anyone. There is a constant demand for speakers who have expert knowledge in the topic, but anyone who is a decent speaker and has the confidence to stand up at the podium can give a good talk. For expert speakers possibilities include academics, industry leaders, leaders of free software development projects, and journalists. But that’s not all, anyone who wants to spend the time researching a topic can give a talk on it. For example I’ve been learning about MySQL recently for my own servers and will probably offer a talk about MySQL aimed at sys-admins who don’t want to become DBAs but who just want to get a database running. I’m not a MySQL expert (and don’t plan to become one) but I believe that there are many people who want to do the things I do with MySQL and who could benefit from a talk that I might give.

The best place to find speakers is a conference or trade-show. If they give a talk that works well you can suggest that they give it again for your local LUG. You can also find speakers at conferences that you can’t attend. If someone visits your country for a trade-show in a different city you could send them an email saying “unfortunately I can’t attend your talk, but if you are interested in visiting my city in the same trip then there will be an audience of X people interested in seeing you”.

There’s no harm in asking; the worst that they can do is decline. Ask everyone who you think can do a good job. Also make sure that you don’t make any commitment (unless you are a member of the LUG committee).

more about Heartbeat

In a comment on my blog post “a Heartbeat developer comments on my blog post” Alan Robertson writes:
I got in a hurry on my math because of the emergency. So, there are even more assumptions (errors?) than I documented.
In particular, the probability model I gave was for a particular node to fail. So the probability of either of two failing would be double that, and either of three failing would be triple that.
Note that the probability of multiple simultaneous failures goes up as a power, but the probability of “either of” only goes up linearly.
I really need to sit down and do the math carefully – but the idea of the simultaneous failures going up as a power is true. And the “any of” probability goes up linearly. That’s also true. This is why people can actually use larger HA clusters ;-).
The 5 years figure is the industry standard quoted figure for an average Intel-based server to fail.
The four hours to repair is a common high-quality of service response time from a hardware vendor. I admit that’s not the same as actual repair time, but if some “repairs” are just reboots, then it’s not a horrible number to start with – if your vendor has cached some spares nearby. I suppose I should sit down and do the math right, and make a spreadsheet of it. (I wonder if I remember that much math?)
I assume disk failures are taken care of by hot swap disks, RAID, etc. and so in effect they “never fail” (at least not totally) so that these failures don’t have to be accounted for by the overall availability model.
Here’s an intuitive way of thinking about it “from your gut”…
If I took your whole data center and made a cluster out of it, what’s the chance that at least half of your servers would fail at once?
Pretty darn small, is the short answer ;-). If it’s not pretty darn small, you need to buy better servers, and IBM has just the servers for you ;-). Or maybe they need to hire a better SysAdmin ;-)
If you ask yourself “when is the last time at least half my machines in my data center couldn’t communicate with the other half”, then hopefully that’s also a “pretty darn small” chance too. If not, there are well-known methods for making networks highly reliable too.
[I’m still ignoring “catastrophes” that you haven’t accounted for in your HA architecture].
I’m not saying this is free, and it can be pricey. One of my other favorite sayings is “Paranoia is an expensive hobby”. How much do you want to spend?
You tell me how much you want to spend, and you can figure out how to spend it.
I’ll make a separate comment on quorum models later. It’s getting late here.

My only comment in response to that is to say that I still believe the calculations of probability to be correct in my original post and I am interested to see someone prove otherwise.
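To put rough numbers on the figures Alan quotes (five years between failures, four hours to repair), here is a minimal Python sketch. It assumes that node failures are independent, that the five-year figure can be treated as a mean time between failures, and that the four hours really is the repair time; the function names are made up for this example and none of this settles which calculation is right.

```python
HOURS_PER_YEAR = 365.25 * 24


def node_unavailability(mtbf_years=5.0, repair_hours=4.0):
    """Fraction of time a single node is down (assumed steady state)."""
    mtbf_hours = mtbf_years * HOURS_PER_YEAR
    return repair_hours / (mtbf_hours + repair_hours)


def any_of(n, p):
    """Probability that at least one of n independent nodes is down."""
    return 1 - (1 - p) ** n


def all_of(k, p):
    """Probability that k particular independent nodes are down together."""
    return p ** k


if __name__ == "__main__":
    p = node_unavailability()
    print("one node down:        ", p)
    for n in (2, 3):
        # "any of" grows almost linearly: close to n * p while p is small
        print(n, "nodes, any down:  ", any_of(n, p), "vs n*p =", n * p)
        # simultaneous failure shrinks as a power of p
        print(n, "nodes, all down:  ", all_of(n, p))
```

The "any of" figure is only approximately n times the single-node figure; the exact value is slightly lower because it does not count overlapping failures twice.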

Another comment by Alan:
Data corruption, no doubt, is almost always much worse than loss of availability. And some kinds of data corruption are worse than others. For example, mounting a non-clustered shared disk filesystem twice simultaneously is usually much worse than updating two replicas of the data simultaneously. In the first case, you have to restore to your previous backups and lose all data since then. In the second case, you only lose updates that were made to one of the sides, and you instantly have a working copy of the data which is nearly always much newer than your last backup (with the possibility of recovering them by significant effort). Typically you would only lose a few minutes of updates at worst – and depending on the kind of networking failure, you might not lose anything.
Heartbeats certainly aren’t enough. You need to monitor the health of your servers and the health of your applications. Heartbeat monitors applications and can easily be informed of and act on the health of your servers (with release 2 style Linux-HA Heartbeat configurations).

Followed by:
Since data corruption is so serious, this is why cluster designers worry so much about split-brain, which is managed using the ideas of quorum and its sibling fencing.
This is all about keeping bad things from happening.
This post is really about quorum, since Russ had expressed interest in it.
Quorum is the idea that you can uniquely choose a subcluster to represent the whole cluster in those cases where communication failure has caused the cluster to split into separate sub-clusters which cannot properly communicate with each other. In this way, only one of the subclusters continues on, and the others will sit on their hands and do nothing waiting for a person to fix things.
Some of the kinds of quorum mentioned below are better than others. But, most importantly, they can be used in combination as described later.
The most common kind of quorum is that Russ mentioned in his earlier post – the majority quorum. In this method, for a cluster of n nodes, you grant quorum to a sub-cluster which has more than INT(n/2) members. This means that if you have a 3-node cluster, you have to have two nodes to continue. If you have 4 nodes, you have to have 3 nodes to continue. For 5 nodes, you have to have 3 nodes, and so on.
Other basic methods include disk reserve, so that you have to reserve a disk to have quorum. In this case, if only one node survives and it can reserve the disk, it continues to run. However, the disk becomes a single point of failure. This may not be a problem if this single disk is required to run any of the cluster services, since they would fail without it anyway. [Heartbeat does not support this method].
An analogous method is to implement a software resource which grants quorum to one subcluster in a fashion analogous to the disk reserve method. This has the advantage of not requiring disk reserves, or a shared disk, but it has the same SPOF disadvantage as the disk reserve method. Heartbeat does support this method using the quorum daemon. It’s incredibly useful for those cases (like split-site clusters) where you cannot use fencing.
Another method is to grant quorum to any subcluster which can ping a certain set of nodes, and not grant it to any which can’t access those nodes. This isn’t a wonderful method, and has obvious disadvantages with respect to uniqueness, and single points of failure. (Heartbeat doesn’t yet implement this one).
Another method is to grant quorum to any node which is a member of a 2-node cluster. This is better than losing quorum and stopping when one node stops, but obviously completely ignores the uniqueness requirement of quorum.
Another method is to ask a human being if you have quorum. This is hardly an ideal circumstance, but useful in some contexts as described below. (Heartbeat doesn’t yet implement this one).
Perhaps you say, really the only one of these that’s really good is the first one – the majority vote method.
And, I would generally agree with you. But, Heartbeat has the ability to use these in combination which makes some of those methods that seem flaky to be much more reasonable.
Heartbeat has the ability to have multiple quorum modules declared, and they’re used in this way: Any module can return HAVEQUORUM, NOQUORUM, or TIE. If they return HAVEQUORUM or NOQUORUM, then no further quorum modules are consulted. However, if they return TIE, then the next quorum module is consulted for its opinion. If the last quorum module returns TIE, it is treated the same as NOQUORUM.
This enables you to use one quorum module to break the tie declared by a previous quorum module.
You could then use the quorumd to break the tie created by a voting module. Or you could use the quorumd instead of the “two-node” module. Or you could use the “pingable” module instead of the “two-node” module. Or you could at the end always tack on a “human” module, in case all else returns TIE.
This is kind of cool, actually. My favorites for next implementation are the pingable and consult human modules.
And, of course, if your cluster loses quorum due to real server failures, there are always ways to work around it, with a little human intervention. One method is to tell Heartbeat to ignore quorum. Another is to tell Heartbeat to remove certain nodes from the cluster, after you verify that they’re really dead. And, I’m sure that in a pinch, some new methods will be invented. And some of them might actually work ;-).
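The majority rule and the chaining of quorum modules described in the comment above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not Heartbeat's actual code or API: the names Quorum, majority and decide are made up, and returning TIE for an exact half split is an assumption about how a voting module would behave.

```python
from enum import Enum


class Quorum(Enum):
    HAVEQUORUM = "HAVEQUORUM"
    NOQUORUM = "NOQUORUM"
    TIE = "TIE"


def majority(total_nodes, visible_nodes):
    """Majority vote: a sub-cluster needs more than INT(n/2) members."""
    if visible_nodes > total_nodes // 2:
        return Quorum.HAVEQUORUM
    # Assumption: an exact half of an even-sized cluster counts as a tie.
    if visible_nodes * 2 == total_nodes:
        return Quorum.TIE
    return Quorum.NOQUORUM


def decide(modules):
    """Consult quorum modules in order until one gives a definite answer.

    Each module is a callable returning a Quorum value. If every module
    (including the last) returns TIE, the result is NOQUORUM.
    """
    for module in modules:
        result = module()
        if result is not Quorum.TIE:
            return result
    return Quorum.NOQUORUM


if __name__ == "__main__":
    # Two-node cluster that has split: the vote ties, so a second module
    # (standing in for something like the quorum daemon) breaks the tie.
    vote = lambda: majority(2, 1)
    tie_breaker = lambda: Quorum.HAVEQUORUM
    print(decide([vote]))               # no tie-breaker configured
    print(decide([vote, tie_breaker]))  # tie-breaker consulted second
```

With only the voting module, the split two-node cluster ends up with no quorum on either side; adding a second module is what lets exactly one side continue.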

Regarding the quorumd, it seems that this is an extra server that will generally run on another machine separate from the rest of the cluster. So if we had a two-node cluster with a quorumd then it would effectively be a three-node cluster where one node is not configured to run any resources. It seems that the simpler approach in many cases would be to merely have a three-node cluster with resources not configured to run on one of the nodes.

For example if I was running a mail server cluster for an ISP I might configure a three-node cluster of the two mail server back-end machines and one other machine that is lightly loaded (EG a DNS server) and have it configured not to run MTA resources on the DNS machine.