Bill Joy

Some time ago Bill Joy (who is famous among other things for being a co-founder of Sun) [1] wrote an article for Wired magazine titled “Why the future doesn’t need us” [2]. He wrote many sensible things but unfortunately focussed on the negative issues and didn’t receive a good response. On reading it today I thought more highly of the article than I did in 2000 when it was printed, largely due to having done some background research on the topic. I’ve recently been reading Accelerating Future [3] which has a more positive approach.

Now a talk by Bill Joy from 2006 has been published on the TED.com web site [4]. He starts by talking about “super empowered individuals” re-creating the 1918 flu. He also claims that “more technology super-empowers people more”. Such claims seem excessively hyped; I would be interested to see comments from someone who has a good knowledge of current bio-technology on their merits.

He talks briefly about politics and has some good points such as “the bargain that gives us civilisation is the bargain to not use power” and “we can’t give up the rule of law to fight an asymmetric threat – which is what we seem to be doing”.

He mentions Moore’s law and suggests that a computer costing $1000 then (2006) might cost $10 in 2020. He seems to be forgetting the cost of the keyboard and other mechanical parts. I can imagine a high-end CPU which cost about $800 a couple of years ago being replaced by a $2 CPU in 2020, but I don’t expect a decent keyboard to be that cheap any time before we get fully automated nano-factories (which is an entirely separate issue). Even the PSU (which is subject to a significant amount of government regulation for safety reasons) will have a floor cost that is a good portion of $10. Incidentally the keyboard on my EeePC 701 sucks badly; I guess I’m spoiled by the series of Thinkpad keyboards that I keep getting under warranty (which would cost me a moderate amount of money if I had to pay every time I wore one out). I will make a specific prediction: by 2015 one of the better keyboards will comprise a significant portion of the entire cost of a computer system (more than a low-end computer unit), such that in some reasonable configurations the keyboard will be the most expensive part.

It would be good if PCs could be designed to use external PSUs (such as the Compaq Evo models that took laptop PSUs). Then the PSU, keyboard, and monitor would be optional extras thus giving a small base price. Given that well built PSUs and keyboards tend not to wear out as fast as people want to replace computers, it seems that financial savings could be provided to most customers by allowing them to purchase the computer unit without the extra parts. People like me who type enough to regularly wear out keyboards and who keep using computers for more than 5 years because they still work are in a small minority and would of course be able to buy the same bundle of computer, PSU, and keyboard that new users would get.

In 2020 a device the size of an iPaQ H39xx with USB (for keyboard and mouse) and the new DisplayPort [5] digital display interface (that is used in recent Lenovo laptops [6]) would make a great PDA/desktop. You could dock a PDA the way some people dock laptops now and carry all your data around with you.

Bill cites running the Mac interface on an Apple ][ (does anyone know of a reference for this?) as an example of older hardware being effective with newer software. It’s a pity that new software often needs such powerful hardware, e.g. it’s only recently that hardware developments have overtaken the development of OpenOffice to make it deliver decent performance.

He has an interesting idea of using insurance companies to replace government regulation of food and drugs. The general concept is that if you can convince an insurance company that your new drug is not a great risk to them, so that they charge a premium you can afford, then you could sell it without any further testing. Normally I’m in favor of government regulation of such things, but given the abject failures of the US government this idea has some merit. Of course there’s nothing stopping insurance companies from just taking a chance and running up debts that they could never hope to repay (in a similar manner to so many banks).

Finally I think it’s interesting to note the camera work that the TED people used. My experience of being in the audience for many lectures (including one of Bill Joy’s lectures in the late 90’s) is that a speaker who consults their notes as often as Bill does gives a negative impression to the audience. Note that I’m not criticising Bill in this regard, often a great talk requires some significant notes – very few people can deliver a talk in a convincing manner entirely from memory. It seems to me that the choices of camera angle are designed to give a better impression than someone who was seated in the audience might receive – there’s no reason why a video of a talk should be spoiled by seeing the top of the speaker’s head while they consult their notes!

Flash Storage and Servers

In the comments on my post about the Dell PowerEdge T105 server [1] there is some discussion of the internal USB port (which allows a USB flash device connected inside the case to be used for booting).

This is a really nice feature of the Dell server and something that would be useful in more machines. However I believe that it would be better to have flash storage with a SATA interface on the motherboard. The cost of medium-size flash storage (e.g. 4G) in the USB format is not overly great; if soldered to the motherboard or connected to a daughter-board the incremental price for the server would be very small. Dell servers shipped with a minimum of an 80G SATA disk last time I checked, so it seems quite likely to me that Dell could reduce the prices of their servers by providing flash storage on the motherboard and having no hard disk.

It seems likely to me that a significant number of people don’t want the default hard drive that ships with a Dell server. The 80G disk that came with my PowerEdge is currently gathering dust on a shelf; it was far too small to be of any use in that machine and Dell’s prices for bigger disks were outrageous, so I replaced the default disk with a pair of big disks as soon as the server had passed some basic burn-in tests. Most servers that I run fall into one of two categories: machines which primarily do computation tasks and need little storage space or IO capacity (in which case 4G of flash would do nicely) and machines which have databases, backups, virtual machine images, and other big things (in which case anything less than 160G is silly and less than 500G makes no economic sense in today’s market). Note that for a machine with small storage requirements I would rather have a 4G flash device than an 80G disk – I am inclined to trust flash not to die but not to trust a single disk, and two 80G disks would mean more noise, heat dissipation, and expense.

According to comments on my previous post VMWare ESX requires a USB boot device, so if VMWare could be booted from a motherboard based flash device then that would be an ideal configuration for VMWare. In some mailing list discussions I’ve seen concern raised about the reliability of permanently connected USB devices; while I’ve only encountered USB problems related to buggy hardware and drivers, other people have had problems with the electrical connection. So it seems that motherboard based flash could be expected to increase the reliability of VMWare servers.

The down-side to having flash permanently attached to the motherboard is of course the impossibility of moving the boot device to different hardware. In terms of recovering from failure restoring a few gig of flash storage from backup is easy enough. The common debugging option of connecting a hard drive to another machine to fix boot problems would be missed, but I think that the positive aspects of this idea outweigh the negative – and it would of course be an option to not boot from flash.

If anyone knows of a tower server that is reasonably quiet, has ECC RAM and a usable amount of flash storage on the motherboard (2G would be a bare minimum, 4G or 8G would be preferred) then please let me know.

SE Linux and Decrypted Data

There is currently a discussion on the Debian-security mailing list about how to protect data which came from an encrypted file. I was going to skip that one until someone summoned me by mentioning SE Linux.

The issue which was raised is that data from an encrypted file can be read from /dev/mem (for all memory of the machine) or /proc/<pid>/mem (for the memory of the process). It was suggested that SE Linux can prevent such attacks, but it’s not that simple.

In most SE Linux configurations there will be a domain with ultimate privileges: examples are unconfined_t for a default configuration (also known as targeted) and sysadm_t for a “strict” configuration. SE Linux differs from many security models (such as the Unix permissions model that we are familiar with) in that there is no inherent requirement for a domain with ultimate privileges. I have in the past written policy for machines in which no domain could read /dev/mem or modify the running policy. This meant that booting from installation media might be required to modify the configuration of the system – some people desire systems like this and have good reasons for doing so!

As the vast majority of SE Linux users will have a “targeted” configuration, and the majority of the rest will have a “strict” configuration with minimal changes (i.e. sysadm_t will work as expected), there will always be a domain that can access /dev/mem (I’m not certain that sysadm_t can directly access /dev/mem on all variants of the policy – but given that it can put SE Linux in permissive mode it has ultimate access to work around that).
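
As a rough illustration (assuming the sesearch utility from the setools package is installed) you can query the loaded policy to see which domains are allowed to read the memory_device_t type that labels /dev/mem:

# list the domains that may read /dev/mem (labelled memory_device_t)
sesearch --allow -t memory_device_t -c chr_file -p read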

This however doesn’t mean that SE Linux provides no benefit. A typical Linux system will have many daemons running as root, while a typical SE Linux system will have few processes that run as root in a SE Linux context granting ultimate privileges. The root owned processes that SE Linux constrains are generally the network facing daemons and other processes which pose a risk, while the root owned processes that are not constrained by SE Linux policy are generally processes related to early stages of the system boot and programs run as part of the system administrator’s login session – processes that are inherently trusted.

Now regarding the memory of the process (and the ptrace API), in most cases a process that is likely to be accessing such a file will run in a domain that permits such access to itself. If the process in question is run from a login session (ssh login etc) then other processes in the same session will have access to the data. It is quite possible to write policy to create a new domain for the program which accesses the decrypted data and to deny the domain the ability to ptrace itself, but it will require at least 10 minutes of work (it’s not a default configuration).
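
As an example of checking the current situation (user_t is just a sample domain here), the following query shows whether a domain may ptrace processes in its own domain – the same permission the kernel checks for reads of /proc/<pid>/mem:

# does user_t have ptrace access to other processes in the user_t domain?
sesearch --allow -s user_t -t user_t -c process -p ptrace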

So while SE Linux will generally help you to improve your security no matter what your exact requirements may be, it’s not something that can be used to make such corner cases entirely disappear. In a more general sense the idea that only a single program can access certain data is almost unattainable in a typical system. Among other things to consider is the fact that the Linux kernel is one of the larger programs on a typical Linux system…

Keating College

Some time ago I spoke to Craig Keating about his plans for a new secondary school in the center of Melbourne. His plan was to focus on the core academic areas and cater to academically gifted students. He had some interesting ideas for his business, one of which was to pay teachers rates that are typical for private schools (higher rates than government schools) but not have any sport programs in the evenings or weekends (private schools typically require teachers to work every Saturday and one evening every week in coaching a sport). This would therefore give an hourly pay rate that was significantly higher than most private schools offered and would thus allow recruiting some of the most skilled teachers.

One of his ideas was to intentionally keep the school small so that every teacher could know every student. One of the problems with most schools is that they take no feedback from the students. It seems that this serious deficiency would be largely addressed if the teachers knew the students and talked to them.

He pointed out that in the history of our school system (which largely derived from the UK system) the private schools had a lot of sporting activities as a way of filling time for boarding students. Given that few schools accept boarders (and those that do have only a small portion of the students boarding), the sports are just a distraction from study. This is not to say that sports are inherently bad or should be avoided. He encouraged parents to take their children to sporting activities that suit the interests of the child and the beliefs of the parents instead of having the child be drafted into a school sport and the parents being forced to take an unwilling child to sporting activities that they detest (which I believe is a common school experience).

My own observation of school sport is that it is the epicentre of school bullying. There is an inherent risk of getting hurt when engaging in a sport, and some children get hurt every lesson; an intelligent person who ran a school with an intensive sports program might statistically analyse the injuries incurred and look for patterns. Children who are not good at sport are targeted for attack. For example when I was in year 7 (the first year of high school) one of my friends was assigned to the “cork bobbing” team in the swimming contest – a contest to collect corks floating in the toddler pool, reserved for the students who were really bad at swimming. At that moment I knew that my friend would leave the school as the teachers had set him up for more intensive bullying than he could handle. Yet somehow the government still seems to believe that school sports are good!

This is not to say that physical activity is bad, the PE 4 Life program [1] (which is given a positive review in the movie Supersize Me [2]) seems useful. It has a focus on fitness for everyone rather than pointless competition for the most skilled.

I have just seen a sad announcement on the Keating College web site [3] that they will not be opening next year (and probably not opening at all). The Victorian Registration and Qualifications Authority (VRQA) announced in November that the application to be registered as a school (which was submitted in March) was rejected.

The first reason for the rejection was the lack of facilities for teaching woodwork and metalwork. As the VRQA apparently has no problems registering girls’ schools that don’t teach hard maths (a teacher at one such school told me that not enough girls wanted to study maths) it seems unreasonable to deny registration to a school that doesn’t teach some crafts subjects and caters to students who aren’t interested in those areas.

The second reason was the lack of facilities for sport and PE. Given the number of gyms in the city area it seems most likely that if specific objections had been provided eight months earlier then something could easily have been arranged to cover the health and fitness issues. When I spoke to Craig he had specific plans for using the city baths, gyms, and parks for sporting activities; I expect that most parents who aren’t sports fanatics would find that his plans for PE were quite acceptable.

The third reason is the claim that 600 square meters of office space is only enough to teach one class of 24 students. That would mean that 25 square meters is needed for each student! I wonder if students are expected to bring their own binoculars to see the teacher or whether the school is expected to provide them. :-#

The government has a list of schools that work with the Australian Institute of Sport [4]. These schools provide additional flexibility in studies for athletes and probably some other benefits that aren’t mentioned in the brief web page. I don’t object to such special facilities being made available for the small number of students who might end up representing Australia in the Olympics at some future time. But I think that a greater benefit could be provided to a greater number of students if there were a number of schools opened to focus on the needs of students who are academically gifted. This doesn’t require that the government spend any money (they spend hundreds of millions of dollars on the AIS), merely that they not oppose schools that want to focus on teaching.

Currently the government is trying to force Internet censorship upon us with the claim that it will “protect children” [5]. It seems obvious to me that encouraging the establishment of schools such as Keating College will protect children from bullying (which is a very real threat and is the direct cause of some suicides), while so far no-one has shown any evidence that censoring the net will protect any child.

Other Reasons for not Censoring the Net

Currently there is a debate about censoring the Internet in Australia. Although debate might not be the correct word for a dispute where one party provides no facts and refuses to talk to any experts (Senator Conroy persistently refuses all requests to talk to anyone who knows anything about the technology or to have his office address any such questions). The failures of the technology are obvious to anyone who has worked with computers; here is an article in the Sydney Morning Herald about it [1] (one of many similar articles in the MSM). I don’t plan to mention the technological failures again because I believe that the only people who read my blog and don’t understand the technology are a small number of my relatives – I gave up on teaching my parents about IP protocols a long time ago.

One of the fundamental problems with the current censorship idea is that they don’t seem to have decided what they want to filter and who they want to filter it from. The actions that would be taken to stop pedophiles from exchanging files are quite different from those that would be taken to stop children accidentally accessing porn on the net. I get the impression that they just want censorship and will say whatever they think will impress people.

I have previously written about the safety issues related to mobile phones [2]. In that document I raised the issue of teenagers making their own porn (including videos of sexual assault). About four months after I wrote it a DVD movie was produced showing a gang of teenagers sexually assaulting a girl (they sold copies at their school). It seems that the incidence of teenagers making porn using mobile phones is only going to increase, while no-one has any plans to address the problem.

The blog www.somebodythinkofthechildren.com has some interesting information on this issue.

Two final reasons for opposing net censorship have been provided by the Sydney Anglicans [3]. They are:

  1. Given anti-vilification laws, could religious content be deemed “illegal” and be filtered out? Could Sydneyanglicans.net be blocked as “illegal” if it carries material deemed at some point now or in the future as vilifying other religions? If it’s illegal in Vic say, and there isn’t state-based filtering (there wont be), will the govt be inclined to ban it nation wide?
  2. Given anti-discrimination laws, if Sydneyanglicans.net runs an article with the orthodox line on homosexuality, will that be deemed illegal, and the site blocked? You can imagine it wouldn’t be too hard for someone to lobby Labor via the Greens, for instance.

So the Sydney Anglicans seem afraid that their religious rights to discriminate against others (seriously – religious organisations do have such rights) will be under threat if filtering is imposed.

I was a bit surprised when I saw this article; the Anglican church in Melbourne seems reasonably liberal and I had expected the Anglican church in the rest of Australia to be similar. But according to this article Peter Jensen (Sydney’s Anglican Archbishop) regards himself as one of the “true keepers of the authority of the Bible” [4]. It seems that the Anglican church is splitting over the issues related to the treatment of homosexuals and women (Peter believes that women should not be appointed to leadership positions in the church to avoid “disenfranchising” men who can’t accept them [5]).

It will be interesting to see the fundamentalist Christians who want to protect their current legal rights to vilify other religions and discriminate against people on the basis of gender and sexual preference fighting the other fundamentalist Christians who want to prevent anyone from seeing porn. But not as interesting as it will be if the Anglican church finally splits and then has a fight over who owns the cathedrals. ;)

A comment on my previous post about the national cost of slow net access suggests that Germany (where my blog is now hosted) has better protections for individual freedom than most countries [6]. If you want unrestricted net access then it is worth considering the options for running a VPN to another country (I have previously written a brief description of how to set up a basic OpenVPN link [7]).
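
As a minimal sketch (using OpenVPN’s static-key mode, with vpn.example.de standing in for a server hosted in another country), a basic point-to-point link needs little more than the following:

# on the server: generate a shared key and listen on the default port
openvpn --genkey --secret static.key
openvpn --dev tun --ifconfig 10.8.0.1 10.8.0.2 --secret static.key
# on the client: copy static.key across securely and then connect
openvpn --remote vpn.example.de --dev tun --ifconfig 10.8.0.2 10.8.0.1 --secret static.key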

The National Cost of Slow Internet Access

Australia has slow Internet access when compared to other first-world countries. The costs of hosting servers are higher and the cost of residential access is greater, with smaller download limits. I read news reports of people in other countries complaining about having their home net connection restricted after they transfer 300G in one month; I have two net connections at the moment and the big (expensive) one allows me 25G of downloads per month. I use Internode, and here are their current prices [1] (which are typical for Australia – they weren’t the cheapest last time I compared but they offer a good service and I am quite happy with them).

Most people in Australia don’t want to pay $70 per month for net access; I believe that the plans which have limits of 10G of download or less are considerably more popular.

Last time I investigated hosting servers in Australia I found that it would be totally impractical. The prices offered for limits such as 10G per month (for a server!) were comparable to prices offered by Linode [2] (and other ISPs in the US) for hundreds of gigs of transfer per month. I have recently configured a DomU at Linode for a client; Linode conveniently offers a choice of server rooms around the US so I chose one in the same region as my client’s other servers – giving 7 hops according to traceroute and a ping time as low as 2.5ms!

Currently I am hosting www.coker.com.au and my blog in Germany thanks to the generosity of a German friend. An amount of bandwidth that would be rather expensive for hosting in Australia is by German standards unused capacity in a standard hosting plan. So I get to host my blog in Germany with higher speeds than my previous Australian hosting (which was bottlenecked due to overuse of its capacity) and no bandwidth quotas that I am likely to hit in the near future. This also allows me to do new and bigger things, for example one of my future plans is to assemble a collection of Xen images of SE Linux installations – that will be a set of archives that are about 100MB in size. Even when using bittorrent, transferring 100MB files from a server in Australia becomes unusable.

Most Australians who access my blog and have reasonably fast net connections (cable or ADSL2+) will notice a performance improvement. Australians who use modems might notice a performance drop due to longer latencies of connections to Germany (an increase of about 350ms in ping times). But if I could have had a fast cheap server in Australia then all Australians would have benefited. People who access my blog and my web site from Europe (and to a slightly lesser extent from the US) should notice a massive performance increase, particularly when I start hosting big files.

It seems to me that the disadvantages of hosting in Australia due to bandwidth costs are hurting the country in many ways. For example I run servers in the US (both physical and Xen DomUs) for clients. My clients pay the US companies for managing the servers, and these companies employ skilled staff in the US (who pay US income tax). It seems that the career opportunities for system administrators in the US and Europe are better than in Australia – which is why so many Australians choose to work in the US and Europe. Not only does this cost the country the tax money that those people might pay if employed here, it also costs the training of other people. It is impossible to estimate the cost of having some of the most skilled and dedicated people (the ones who desire the career opportunities that they can’t get at home) working in another country, contributing to users’ groups and professional societies, and sharing their skills with citizens of the country where they work.

Companies based in Europe and the US have an advantage in that they can pay for hosting in their own currency and not be subject to currency variations. People who run Australian based companies that rent servers in the US get anxious whenever the US dollar goes up in value.

To quickly investigate the hosting options chosen for various blogs I used the command “traceroute -T -p80” to do SYN traces to port 80 for some of the blogs syndicated on Planet Linux Australia [3]. Of the blogs I checked there were 13 hosted in Australia, 11 hosted independently in the US, and 5 hosted with major US based blog hosting services (WordPress.com, Blogspot, and LiveJournal). While this is a small fraction of the blogs syndicated on that Planet, and blog hosting is also a small fraction of the overall Internet traffic, I think it does give an indication of what choices people are making in terms of hosting.
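
For anyone who wants to repeat the exercise, a loop along the following lines will do the job (the hostnames here are stand-ins for blog hostnames from the Planet feed); the last few hops usually make the hosting location obvious:

# TCP SYN traces generally need to be run as root
for host in blog.example.com blog.example.org ; do
  echo "== $host =="
  traceroute -T -p 80 "$host" | tail -3
done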

Currently the Australian government is planning to censor the Internet with the aim of stopping child porn. Their general plan is to spend huge amounts of money filtering HTTP traffic in the hope that pedophiles don’t realise that they can use encrypted email, HTTPS, or even a VPN to transfer files without them getting blocked. If someone wanted to bring serious amounts of data to Australia, getting a tourist to bring back a few terabyte hard disks in their luggage would probably be the easiest and cheapest way to do it. Posting DVDs is also a viable option.

Given that the Internet censorship plan is doomed to failure, it would be best if they could spend the money on something useful. Getting a better Internet infrastructure in the country would be one option to consider. The cost of Internet connection to other countries is determined by the cost of the international cables – which can not be upgraded quickly or cheaply. But even within Australia bandwidth is not as cheap as it could be. If the Telstra monopoly on the local loop was broken and the highest possible ADSL speeds were offered to everyone then it would be a good start towards improving Australia’s Internet access.

Australia and NZ seem to have a unique position on the Internet in terms of being first-world countries that are a long way from the nearest net connections and which therefore have slow net access to the rest of the world. It seems that the development of Content Delivery Network [4] technology could potentially provide more benefits for Australia than for most countries. CDN enabling some common applications (such as WordPress) would not require a huge investment but has the potential to decrease international data transfer while improving the performance for everyone. For example if I could have a WordPress slave server in Australia which directed all writes to my server in Germany, and have my DNS server return an IP address for the server which matches the region where the request came from, then I could give better performance to the 7% of my blog readers who appear to reside in Australia while decreasing international data transfer by about 300MB per month.

LUV Talk about Cloud Computing

Last week I gave a talk for the Linux Users of Victoria about Cloud Computing and Amazon EC2 [1]. I was a little nervous as I was still frantically typing the notes a matter of minutes before my talk was due to start (which isn’t ideal). But it went well. There were many good questions and the audience seemed very interested. The talk will probably appear on the web (it was recorded and I signed a release form for the video).

Also I have just noticed that Amazon have published a license document for their AMI tools, so I have added the package ec2-ami-tools to my Debian/Lenny repository (for i386 and AMD64) with the following APT sources.list line:

deb http://www.coker.com.au lenny misc
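
With that line in /etc/apt/sources.list the package can then be installed in the usual way:

apt-get update
apt-get install ec2-ami-tools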

Also the Amazon license [2] might permit adding it to the non-free section of Debian; if so I’m prepared to maintain it in Debian for the next release – but I would prefer that someone who knows Ruby take it over.

EC2 and IP Addresses

One of the interesting challenges of using a cloud computing service is how to talk to the rest of the world. It’s all very well to have a varying number of machines in various locations, but you need constant DNS names at least (and sometimes constant IP addresses) to do most useful things.

I have previously described how to start an EC2 instance and login to it – which includes discovering its IP address [1]. It would not be difficult (in theory at least) to use nsupdate to change DNS records after an instance is started or terminated. One problem is that there is no way of knowing when an instance is undesirably terminated (i.e. killed by hardware failure) apart from polling ec2-describe-instances, so it seems impossible to remove the DNS name before some other EC2 customer gets the dynamic IP address. So it seems that in most cases you will want a constant IP address (which Amazon calls an Elastic IP address) if you care about this possibility. For the case of an orderly shutdown you could have a script remove the DNS record, wait for the timeout period specified by the DNS server (so that all correctly operating DNS caches have purged the record) and then terminate the instance.
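
As a sketch of the orderly shutdown case (the zone, key file, and record name here are hypothetical), the sequence of removing the record, waiting for the TTL, and terminating could look like this:

# remove the DNS record, wait for caches to expire, then terminate the instance
nsupdate -k /etc/bind/Kdynamic.key <<EOF
server ns1.example.com
zone example.com
update delete node1.example.com. A
send
EOF
sleep 300    # should match the TTL that was used on the record
ec2-terminate-instances i-0c999999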

One thing that interests me is the possibility of running front-end mail servers on EC2. Mail servers that receive mail from the net can take significant amounts of CPU time and RAM for spam and virus filters. Instead of bearing the expense of running enough MX servers to sustain the highest possible load even while one of the servers has a hardware failure, you could run an extra EC2 instance at peak times, and a large instance when a peak coincides with one of the dedicated servers having a problem. The idea of having a mail server die and having someone else’s server take the IP address and receive the mail is too horrible to contemplate, so an Elastic IP address is required.

It is quite OK to have a set of mail servers of which not all run all the time (this is why the MX record was introduced to the DNS), so having a server run periodically at periods of high load (one of the benefits of the EC2 service) will not require changes to the DNS. I think it’s reasonably important to minimise the number of changes to the DNS due to the possibility of accidentally breaking it (which is a real catastrophe) and the possibility of servers caching DNS data for longer than they should. The alternative is to change the MX record to not point to the hostname of the server when the instance is terminated. I will be interested to read comments on this issue.

The command ec2-allocate-address will allocate a public IP address for your use. Once the address is allocated it will cost $0.01 per hour whenever it is unused. There are also commands ec2-describe-addresses (to list all addresses allocated to you), ec2-release-address (to release an allocated address), ec2-associate-address (to associate an IP address with a running instance), and ec2-disassociate-address (to remove such an association).

The command “ec2-associate-address -i INSTANCE ADDRESS” will associate an IP address with the specified instance (replace INSTANCE with the instance ID – a code starting with “i-” that is returned from ec2-describe-instances). The command “ec2-describe-instances |grep ^INSTANCE|cut -f2” will give you a list of all instance IDs in use – this is handy if your use of EC2 involves only one active instance at a time (all the EC2 API commands give output in tab-separated lists and can be easily manipulated with grep and cut). Associating an IP address with an instance is documented as taking several minutes; while Amazon provides no guarantees or precise figures as to how long the various operations take, it seems that assigning an IP address is one of the slower operations. I expect that is due to the requirement for reconfiguring a firewall device (which services dozens or maybe hundreds of nodes), while creating or terminating an instance is an operation that is limited in scope to a single Xen host.
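
For the common case of a single running instance and a single allocated address, the association can be scripted along these lines (a sketch relying on the tab-separated output format mentioned above):

# pick the first running instance and the first allocated elastic address
INSTANCE=$(ec2-describe-instances | grep ^INSTANCE | cut -f2 | head -1)
ADDRESS=$(ec2-describe-addresses | grep ^ADDRESS | cut -f2 | head -1)
ec2-associate-address -i "$INSTANCE" "$ADDRESS"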

One result that I didn’t expect was that when an elastic address is associated, the original address that was assigned to the instance is removed. I had an ssh connection open to an instance when I assigned an elastic address and my connection was broken. It makes sense to remove addresses that aren’t needed (IPv4 addresses are a precious commodity) and further reading of the documentation revealed that this is the documented behavior.

One thing I have not yet investigated is whether moving an IP address from one instance to another is atomic. Taking a few minutes to assign an IP address is usually no big deal, but having an IP address be unusable for a few minutes while transitioning between servers would be quite inconvenient. It seems that a reasonably common desire would be to have a small instance running and to then transition the IP address to a large (or high-CPU) instance if the load gets high; having this happen without the users noticing would be a good thing.

Support Gay Marriage in case You Become Gay

A common idea among the less educated people who call themselves “conservative” seems to be that they should oppose tax cuts for themselves and support tax cuts for the rich because they might become rich and they want to prepare for that possibility.

The US census data [1] shows that less than 1% of males aged 15+ earn $250K. For females it’s less than 0.2%.

On the Wikipedia page about homosexuality [2] it is claimed that 2%-7% of the population are gay (and 12% of Norwegians have at least tried it out). Apparently homosexuality can strike suddenly; you never know when a right-wing politician or preacher will suddenly and unexpectedly be compelled to hire gay whores (as Ted Haggard [3] did) or come out of the closet (as Jim Kolbe [4] did).

So it seems that based on percentages you are more likely to become gay than to become rich. So it would be prudent to prepare for that possibility and lobby for gay marriage in case your sexual preference ever changes.

But on a serious note, among the people who earn $250K or more (an income level that has been suggested for higher tax rates) there will be a strong correlation with the amount of education and an early start to a career. Go to a good university and earn more than the median income in your first job, and you will be well on track to earning $250K. A common misconception is that someone who has not had a great education can still be successful by starting their own company. While there are a few people who have done that, the vast majority of small companies fail in the first few years. Working hard doesn’t guarantee success; for a company to succeed you need to have the right product at the right time – this often depends on factors that you can’t predict (such as the general state of the economy and any new products released by larger companies).

Basics of EC2

I have previously written about my work packaging the tools to manage Amazon EC2 [1].

First you need to log in and create a certificate (you can upload your own certificate – but this is probably only beneficial if you have two EC2 accounts and want to use the same certificate for both). Download the X509 private key file (named pk-X.pem) and the public key (named cert-X.pem). My Debian package of the EC2 API tools will look for the key files in the ~/.ec2 and /etc/ec2 directories and will take the first one it finds by default.

To override the certificate (when using my Debian package), or to just have it work when using the code without my package, you set the variables EC2_PRIVATE_KEY and EC2_CERT.
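
For example (the file names below are placeholders for whatever the key download gave you):

export EC2_PRIVATE_KEY=$HOME/.ec2/pk-XXXX.pem
export EC2_CERT=$HOME/.ec2/cert-XXXX.pem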

This Amazon page describes some of the basics of setting up the client software and RSA keys [2]. I will describe some of the most important things now:

The command “ec2-add-keypair gsg-keypair > id_rsa-gsg-keypair” creates a new keypair for logging in to an EC2 instance. The public key goes to Amazon and the private key can be used by any ssh client to login as root when you create an instance. To create an instance with that key you use the “-k gsg-keypair” option, so it seems a requirement to use the same working directory when creating all instances. Note that gsg-keypair could be replaced by any other string; if you are doing something really serious with EC2 you might use one account to create instances that are run by different people with different keys, but for most people I think that a single key is all that is required. Strangely they don’t provide a way of getting access to the public key; you have to create an instance and then copy the /root/.ssh/authorized_keys file from it.
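
Note that ssh will refuse to use a private key file that other users can read, so it’s worth tightening the permissions immediately after creating the keypair:

ec2-add-keypair gsg-keypair > id_rsa-gsg-keypair
chmod 600 id_rsa-gsg-keypair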

This Amazon page describes how to set up sample images [3].

The first thing it describes is the command ec2-describe-images -o self -o amazon which gives a list of all images owned by yourself and all public images owned by Amazon. It’s fairly clear that Amazon doesn’t expect you to use their images. The i386 OS images that they have available are Fedora Core 4 (four configurations with two versions of each) and Fedora 8 (a single configuration with two versions) as well as three other demo images that don’t indicate the version. The AMD64 OS images that they have available are Fedora Core 6 and Fedora Core 8. Obviously if they wanted customers to use Amazon’s images (which seems like a really good idea to me) they would provide images of CentOS (or one of the other recompiles of RHEL) and Debian. I have written about why I think that this is a bad idea for security [4] – please make sure that you don’t use the ancient Amazon images for anything other than testing!

To test, choose an i386 image from Amazon’s list; i386 is best for testing because it allows the cheapest instances (currently $0.10 per hour).

Before launching an instance allow ssh access to it with the command “ec2-authorize default -p 22”. Note that this command permits access from the entire world. There are options to limit access to certain IP address ranges, but at this stage it’s best to focus on getting something working. Of course you don’t want to actually use your first attempt at creating an instance; I think that setting up an instance to run in a secure and reliable manner would require many attempts and tests. As all the storage of the instance is wiped when it terminates (as we aren’t using S3 yet) and you won’t have any secret data online, security doesn’t need to be the highest priority.
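
When you move past testing, the same command can take a source address range; for example (203.0.113.0/24 is a documentation range standing in for your own network):

# allow ssh only from a specific address range instead of the whole world
ec2-authorize default -P tcp -p 22 -s 203.0.113.0/24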

A sample command to run an instance is “ec2-run-instances ami-2b5fba42 -k gsg-keypair” where ami-2b5fba42 is a public Fedora 8 image available at this moment. This will give output similar to the following:

RESERVATION r-281fc441 999999999999 default
INSTANCE i-0c999999 ami-2b5fba42 pending gsg-keypair 0 m1.small 2008-11-04T06:03:09+0000 us-east-1c aki-a71cf9ce ari-a51cf9cc

The parameter after the word INSTANCE is the ID of the instance. The command “ec2-describe-instances i-0c999999” will provide information on the instance; once it is running (which may be a few minutes after you request it) you will see output such as the following:

RESERVATION r-281fc441 999999999999 default
INSTANCE i-0c999999 ami-2b5fba42 ec2-10-11-12-13.compute-1.amazonaws.com domU-12-34-56-78-9a-bc.compute-1.internal running gsg-keypair 0 m1.small 2008-11-04T06:03:09+0000 us-east-1c aki-a71cf9ce ari-a51cf9cc

The command “ssh -i id_rsa-gsg-keypair root@ec2-10-11-12-13.compute-1.amazonaws.com” will then grant you root access. The part of the name such as 10-11-12-13 is the public IP address. Naturally you won’t see 10.11.12.13, it will instead be a public address in the Amazon range – I replaced the addresses to avoid driving bots to their site.

The name domU-12-34-56-78-9a-bc.compute-1.internal is listed in Amazon’s internal DNS and returns the private IP address (in the 10.0.0.0/8 range) which is used for the instance. The instance has no public IP address; all connections (both inbound and outbound) run through some sort of NAT. This shouldn’t be a problem for HTTP, SMTP, and most protocols that are suitable for running on such a service, but for FTP or UDP based services it might be a problem. The part of the name such as 12-34-56-78-9a-bc is the MAC address of the eth0 device.

To halt an instance you can run shutdown or halt as root in the instance, or run the ec2-terminate-instances command and give it the instance ID that you want to terminate. It seems to me that the best way of terminating an instance would be to run a script that produces a summary of whatever the instance did (you might not want to preserve all the log data, but some summary information would be useful), and gives all operations that are in progress time to stop before running halt. A script could run on the management system to launch such an orderly shutdown script on the instance and then use ec2-terminate-instances if the instance does not terminate quickly enough.
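
A sketch of such a management-side script follows; the summarise-and-halt script on the instance is hypothetical, and the instance ID and hostname are the ones from the example above:

#!/bin/sh
INSTANCE=i-0c999999
HOST=ec2-10-11-12-13.compute-1.amazonaws.com
# ask the instance to summarise its logs and shut itself down cleanly
ssh -i id_rsa-gsg-keypair root@$HOST "/usr/local/sbin/summarise-and-halt"
# give it a few minutes, then force termination if it is still listed as running
sleep 300
if ec2-describe-instances $INSTANCE | grep -q running ; then
  ec2-terminate-instances $INSTANCE
fi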

In the near future I will document many aspects of using EC2. This will include dynamic configuration of the host, dynamic DNS, and S3 storage among other things.