Amazon is providing free EC2 access for one year to new customers (people who have never been customers before). It is 750 hours per month (enough to run non-stop for an entire month) of access to a Linux micro instance which has 613M of RAM and the ability to burst to two ECUs of compute power. The main EC2 web page describes an ECU as “the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor” and they also describe a single core of a modern CPU as having 3.5 ECUs. So a micro instance could burst to a bit more than half the CPU power of a single core. The DomU that runs my blog (as well as some web sites for friends) has been averaging less than 1% use of a CPU core over the last few months, so the CPU capacity of a micro instance should be more than adequate for most things that run on the net.
The free offering only provides 15G of free data transmission and 15G of free data reception per month. For my blog server that would be more than adequate as it has sent 24.5G and received 14G over the last 75 days.
The offer requires that you sign up with a credit card so if you use more than the free capacity then you have to start paying. It seems that the main issue in this regard is disk IO.
The only storage that is available for a micro instance is the Elastic Block Store (EBS). The main way that EC2 operates is that when you create a new virtual machine it copies the data from an existing image so you can easily create dozens or hundreds of virtual machines with local disk performance – and the data is removed when the instance is shut down. EBS is essentially SAN based storage: it’s persistent and operates like a regular disk.
The pricing for EBS is $0.10 per allocated GB per month plus $0.10 per million IOs. Unfortunately they don’t define what an IO is apart from mentioning that you can use IOSTAT to measure them. According to iostat the server running my blog is doing 0.99 tps, so in a 30 day month I would expect 30*24*3600*0.99 = 2.57M transactions. Iostat also tells me that my blog server has read 62075330 blocks and written 91683344 blocks over the last 75 days 16 hours of uptime, which means it would do about (62075330+91683344)/75.66*30 = 61M block transfers in a 30 day month. So if I was to run my blog server on EC2 I could be spending either about $0.26 or about $6.10 per month on disk IO depending on how they count it (or maybe something in between, as something larger than a 512 byte block but smaller than a “transaction” as reported by iostat could be used). Given that the last time I checked the prices one could rent a DomU for less than $6 per month, the difference in possible ways of measuring IOs is very significant!
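The two ways of counting can be compared with a few lines of Python. This is only a sketch of the arithmetic above: the iostat figures and the price are the ones quoted in this post, and the assumption that Amazon bills either per iostat transaction or per 512 byte block is mine, as they don't say.

```python
# Figures taken from the iostat output quoted in this post.
TPS = 0.99                       # average transactions per second
BLOCKS_READ = 62_075_330
BLOCKS_WRITTEN = 91_683_344
UPTIME_DAYS = 75 + 16 / 24       # 75 days 16 hours
PRICE_PER_MILLION_IOS = 0.10     # USD, the EBS price quoted in this post

SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_io_cost(ios_per_month):
    """Cost in USD of a month of IOs at the quoted EBS price."""
    return ios_per_month / 1_000_000 * PRICE_PER_MILLION_IOS


# Assumption 1: one billed IO per iostat "transaction".
transactions_per_month = TPS * SECONDS_PER_MONTH
# Assumption 2: one billed IO per block transferred.
blocks_per_month = (BLOCKS_READ + BLOCKS_WRITTEN) / UPTIME_DAYS * 30

for label, ios in (("transactions", transactions_per_month),
                   ("blocks", blocks_per_month)):
    print(f"{label}: {ios / 1e6:.2f}M IOs, ${monthly_io_cost(ios):.2f} per month")
```

The real bill could land anywhere between the two results if Amazon counts some unit between a block and a transaction.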
The MySQL server that is the backend to my blog (as well as a few other things) seems to be averaging about 3 writes per second (and no reads during operation because the databases are small). So it might be another 5 million IOs per month for the database.
It’s unfortunate that Amazon haven’t clearly specified what they mean when measuring IO for billing purposes. Some aspects of measurement, such as whether the bills for bandwidth include Ethernet headers, can be ignored as a 26 byte Ethernet header won’t make much difference to the bill when the average packet size is around 400 bytes or more (from ifconfig output it seems that my blog server sends packets of an average size of 459 bytes and receives packets of an average size of 1250 bytes). But the methods of measuring disk IO could give a factor of more than 20 difference in the bill.
If I was going to put my blog on EC2 then I would start by configuring Apache to log to a fifo and then write a daemon that stores the log data and allows my home server to poll it and get the log data. As the filesystem is already mounted with noatime it seems that writing logs is the cause of all the disk writes, so if the logs were stored in RAM (which shouldn’t be a problem with 4M of logs per day) then all those writes could be avoided. Another possible solution to this would be to make /var/log be a tmpfs and then rsync the files periodically to my home server. I don’t really need to have all the logs remain on the server, I just need them to remain somewhere.
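A minimal sketch of the RAM storage part of such a daemon might look like the following. I haven't written the real thing, so everything here is an assumption: the class and function names are made up for illustration, the 8M limit is arbitrary, and the part that lets the home server poll for the data is left out.

```python
import collections


class LogBuffer:
    """Keep recent log lines in RAM, bounded by total size in bytes."""

    def __init__(self, max_bytes=8 * 1024 * 1024):
        self.max_bytes = max_bytes
        self.lines = collections.deque()
        self.size = 0

    def append(self, line):
        self.lines.append(line)
        self.size += len(line)
        # Drop the oldest lines if nobody has polled for too long.
        while self.size > self.max_bytes and len(self.lines) > 1:
            self.size -= len(self.lines.popleft())

    def drain(self):
        """Return everything buffered and empty the buffer (one poll)."""
        out = list(self.lines)
        self.lines.clear()
        self.size = 0
        return out


def read_fifo(path, buffer):
    """Read lines from the fifo until the writer (Apache) closes it."""
    with open(path, "rb") as fifo:
        for line in fifo:
            buffer.append(line)
```

A real daemon would call read_fifo() in a loop so that it reopens the fifo after Apache restarts, and would call drain() whenever the home server polls.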
Amazon also offers 100,000 messages on their Simple Queue Service (SQS) for free. The messages can be up to 8K in size and are stored for up to 4 days. So it seems possible to put Apache logs into SQS messages in bundles of less than 8K and then get them out later for transfer to a server outside EC2.
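The bundling step could be done along these lines. This is just a sketch under the 8K limit mentioned above: the function name is my own invention, and actually sending the bundles to SQS and fetching them again is not shown.

```python
MAX_MESSAGE_BYTES = 8 * 1024  # the SQS message size limit quoted in this post


def bundle_lines(lines, max_bytes=MAX_MESSAGE_BYTES):
    """Group log lines (bytes) into bundles of at most max_bytes each."""
    bundle, size = [], 0
    for line in lines:
        if len(line) > max_bytes:
            raise ValueError("a single log line exceeds the message limit")
        # Start a new bundle when this line would overflow the current one.
        if bundle and size + len(line) > max_bytes:
            yield b"".join(bundle)
            bundle, size = [], 0
        bundle.append(line)
        size += len(line)
    if bundle:
        yield b"".join(bundle)
```

Whole lines are kept together so that each message can be processed on its own at the other end, at the cost of not filling every message completely.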
If I was able to get my disk writes to almost zero then there’s a good chance that I could get into the free zone for one year.
Would I use this service? If I was looking for new hosting for my blog then I would seriously consider it. EC2 is quite fast and well connected and depending on how they work out the billing for disk IO I could probably keep the cost close to zero.
EC2 is a different way of running things so you can’t just have a virtual server running and expect it to automatically restart if it goes down for any reason (a standard feature of virtual hosting companies). Amazon does have a range of tools for managing EC2 instances and they all seem to be available in the free trial. So after spending the time to learn those tools the result should be good.
I think that there are two groups of people who could benefit from using this. One is hobbyists: this is a great way to learn some skills related to high-end server stuff and EC2 experience should look good on a CV. The other is companies who want to use EC2 anyway and who will just save some money that they would otherwise pay. I’ve seen someone recommend the free offering from EC2 for a company that needed a small server. I think that isn’t a good option, as a company that only wants a single small server will be better off paying something between $5 and $20 per month for a DomU from one of the virtual hosting providers.
After a year you have to pay regular prices. A micro instance costs $0.02 per hour which is $14.40 per month. SQS costs $0.10 per month for sending up to 1G of data in, and at $0.01 per 10,000 requests the SQS requests would cost $0.03 (the 4M of log data I generate per day would take 1000 requests to write and read, which is 30,000 requests per month). The EBS for MySQL would cost $0.10 for 1G of storage and maybe $0.50 for IOs. That means $15.13 per month before counting bandwidth.
My blog server averages just under 10G of transmitted data per month, the first Gig is free so that would cost $0.15 for each subsequent Gig which is $1.35 per month. It receives just under 6G per month which at $0.10 per gig would be $0.60. So including data transfer it would be about $17.08 per month.
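The monthly total can be checked with a short script. It is only a sketch that repeats the prices and usage estimates given above, with traffic rounded up to 10G out and 6G in as I did in the text.

```python
HOURS_PER_MONTH = 30 * 24

# Prices in USD and usage figures as quoted in this post.
instance = 0.02 * HOURS_PER_MONTH        # micro instance at $0.02 per hour
sqs_data = 0.10                          # up to 1G of data sent into SQS
sqs_requests = 30_000 / 10_000 * 0.01    # 30,000 requests per month
ebs_storage = 0.10                       # 1G of EBS for MySQL
ebs_io = 0.50                            # rough guess at 5 million IOs

before_bandwidth = instance + sqs_data + sqs_requests + ebs_storage + ebs_io

data_out = (10 - 1) * 0.15               # first gig transmitted is free
data_in = 6 * 0.10

total = before_bandwidth + data_out + data_in
print(f"before bandwidth: ${before_bandwidth:.2f}")
print(f"with bandwidth:   ${total:.2f}")
```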
This is a lot more expensive than some of the cheaper virtual server offerings but admittedly the cheaper virtual offerings don’t have as much RAM. Also with a blog instance running on EC2 I could easily configure it so that I could create some big instances that use the same MySQL database if a lot of extra traffic suddenly started arriving. A micro instance running MySQL on its own could cope with a heap of load a lot more easily than the PHP code for my blog. So using bigger servers to run the PHP code while MySQL keeps running on the same micro instance would be a good option – particularly if the bigger servers use caching.
Finally if you want to run an EC2 instance for a year then you can get a reserved instance: you pay $54 per annum and the cost drops to $0.007 per hour instead of $0.02 per hour. Using a reserved instance for my blog would give a cost of $54+365*24*0.007+12*(0.10+0.03+0.10+0.50) or about $124.08 per annum before counting bandwidth. About $10 per month isn’t too bad. So if I migrated my blog to EC2 then I would probably keep it there after the free period expired. The ability to expand rapidly when necessary is worth paying extra. Of course I am making some assumptions, such as that the performance of a micro instance doesn’t totally suck – as Amazon don’t specify what bursting to 2 ECUs really means it could have some performance problems.
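The reserved instance sum works out as follows. Again this is just a sketch using the prices quoted above, and like the formula in the text it leaves out bandwidth.

```python
HOURS_PER_YEAR = 365 * 24

reservation_fee = 54.00                     # USD per annum, paid up front
reserved_hourly = 0.007                     # USD per hour once reserved
monthly_extras = 0.10 + 0.03 + 0.10 + 0.50  # SQS data, SQS requests, EBS storage, EBS IO

annual = reservation_fee + HOURS_PER_YEAR * reserved_hourly + 12 * monthly_extras
on_demand_annual = HOURS_PER_YEAR * 0.02 + 12 * monthly_extras

print(f"reserved:  ${annual:.2f} per annum, ${annual / 12:.2f} per month")
print(f"on demand: ${on_demand_annual:.2f} per annum")
```

The on demand figure is included only for comparison. It uses a 365 day year, so it comes out slightly higher than 12 times the 30 day monthly figure.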
Note that all prices in this post are in US Dollars.