Linux, politics, and other interesting things
I’ve just read Diego’s response to an ill-informed NYT article about data-center power efficiency. This makes me wonder: how much server use does each person account for?
Almost everyone uses Google, and most of us use it a lot. Google’s main product, its search engine, is probably also the most demanding.
In a typical day I probably do about 50 to 100 Google searches. That sounds like a lot, but half of them would probably be for one topic that is difficult to find. I don’t think that is a particularly high number, because I generally know what I’m doing and when I find what I need I spend a lot of time reading it. I’m sure that many people do a lot more.
Each Google search takes a few seconds to complete (or maybe more if it’s an image search and I’m on a slow link), but I think it’s safe to assume that more than a few seconds of CPU time are involved. How much work would each Google search take if performed on a single system? Presumably Google uses the RAM of many systems as cache, which gives a result more like a NUMA system than one regular server working for a longer time, so there is no meaningful way to ask how long a Google search would take on a single server. But I’m sure that Google has some ratio of servers to the rate of incoming requests. It’s almost certainly a closely guarded secret, but we can make some guesses. If the main Google user base comprises people who collectively average 100 searches per day, then we can guess at the amount of server use required for each search based on the number of servers Google would run. I think it’s safe to assume that Google doesn’t plan to buy one server for every person on the planet and that it wants users to significantly outnumber servers. So even for core users Google should be aiming to have each user take only a fraction of the resources that one server adds to the pool.
So each of those 100 searches probably takes more than 1 second of server use. But they almost certainly take a lot less than 864 seconds each (the server use per search if Google had one server for every 100 daily requests, which would imply one server for each of the heavier users). Maybe it takes 10 seconds of server use (CPU, disk, or network, whichever is the bottleneck) to complete one search request. That would mean that if the Google network was at 50% utilisation on average then they would have 86400*.5/10/100 == 43 users per server for the core user base who average 100 daily requests. If there are 80M core users that would be about 2M servers, and then maybe something like another 4M servers for the rest of the world.
So I could be using 1000 seconds of server time per day on Google searches. I also have a Gmail account which probably uses a few seconds for storing email and giving it to Fetchmail, and I have a bunch of Android devices which use Google calendars, play store, etc. The total Google server use on my behalf for everything other than search is probably a rounding error.
But I could be out by an order of magnitude. If it only took 1 second of server use for a Google search then I would be at 100 server seconds per day and Google would only need one server for every 430 users like me.
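The guesses above can be turned into a small Python sketch. The function names are made up for illustration, and the inputs (100 searches per day, 50% utilisation, 80M core users, 10 or 1 seconds per search) are the same guesses used in the text, not measured figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def users_per_server(seconds_per_search, searches_per_day=100, utilisation=0.5):
    """How many users one server can carry, given guessed per-search cost."""
    busy_seconds = SECONDS_PER_DAY * utilisation  # server time actually in use
    seconds_per_user = seconds_per_search * searches_per_day
    return busy_seconds / seconds_per_user


def servers_needed(core_users, seconds_per_search):
    """Servers required for a guessed number of core users."""
    return core_users / users_per_server(seconds_per_search)


# Compare the 10 second guess with the "out by an order of magnitude" case.
for cost in (10, 1):
    print(
        f"{cost:>2}s per search: "
        f"{100 * cost} server seconds per user per day, "
        f"{users_per_server(cost):.0f} users per server, "
        f"{servers_needed(80_000_000, cost) / 1e6:.2f}M servers for 80M core users"
    )
```

With a 10 second guess this gives roughly 43 users per server and a little under 2M servers. With a 1 second guess it gives roughly 430 users per server and a tenth as many servers.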
Google also serves lots of adverts on web sites that I visit. I presume that serving the adverts doesn’t take many resources by Google standards, but accounting for it, paying the people who host content, and detecting fraud probably take some significant resources.
There are many people who spend hours per day using services such as Facebook. No matter how I try to estimate the server requirements it’s probably going to be fairly wrong. But I’ll make a guess at a minute of server time per hour. So someone who averages 3 hours of social networking per day (which probably isn’t that uncommon) would be using 180 seconds of server time.
The server that hosts my blog is reasonably powerful and has two other people as core users, so that could count as 33% of a fairly powerful server in my name. But if we are counting server use per USER then most of the resources of my blog server would be divided among the readers. My blog has about 10,000 people casually reading it through Planet syndication, which could mean that each person who casually reads my blog has 1/30,000 of a server allocated to them for that. Another way of considering it is that 10% of a server (8640 seconds per day) is covered by me maintaining my blog and writing posts, 20% is for users who visit my blog directly, and 3% is for the users who just see a Planet feed. That would mean that a Planet reader gets 1/330,000 of a server (about 260ms per day) and someone who reads directly gets 1/50,000 of a server (about 1.73s per day), as I have about 10,000 people visiting my blog directly in a month.
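The split of the blog server can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper name is invented for illustration, and the 10%/20%/3% split and the 10,000 reader counts are the guesses from the paragraph above.

```python
SECONDS_PER_DAY = 86400


def per_reader(server_fraction, readers):
    """Return (readers per server, server seconds per reader per day)."""
    fraction_each = server_fraction / readers
    return 1 / fraction_each, fraction_each * SECONDS_PER_DAY


# Guessed split of my third of the server: 10% me, 20% direct, 3% Planet.
author_seconds = 0.10 * SECONDS_PER_DAY
planet_share, planet_seconds = per_reader(0.03, 10_000)
direct_share, direct_seconds = per_reader(0.20, 10_000)

print(f"author: {author_seconds:.0f}s per day")
print(f"Planet reader: 1/{planet_share:,.0f} of a server, {planet_seconds:.3f}s per day")
print(f"direct reader: 1/{direct_share:,.0f} of a server, {direct_seconds:.3f}s per day")
```

The exact 3% split works out to 1/333,333 of a server per Planet reader, which the text rounds to 1/330,000. Either way it comes to roughly a quarter of a second per day.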
My mail server is also shared by a dozen or so people, so maybe that counts as 5% of a server for me, or 4320 seconds per day. Then there’s the server I use for SE Linux development (including my Play Machine) and a server I use as a DNS secondary and a shell server for various testing and proxying.
If every reader of a Planet instance like Planet Debian and Planet Linux Australia counts as 1/330,000 of a server for their usage of my blog, then how would that count for my own use of blogs? I tend to read blogs written by the type of people who like to run things themselves, so there would be a lot of fairly under-utilised servers that run blogs. Through Planet Debian and Planet Linux Australia I could be reading 100 or more blogs which are run in the same manner as mine, and in a typical day I probably directly visit a dozen blogs that are run in such a manner. This could give me 50 seconds of server time for blog reading.
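As a rough check on that figure, here is a Python sketch that assumes every blog I read costs the same per reader as mine does (3% of a server shared by 10,000 Planet readers, 20% shared by 10,000 direct visitors). That assumption and the function name are only for illustration.

```python
SECONDS_PER_DAY = 86400

# Per-reader cost, assuming other self-hosted blogs look like mine.
PLANET_SECONDS_PER_READER = SECONDS_PER_DAY * 0.03 / 10_000
DIRECT_SECONDS_PER_READER = SECONDS_PER_DAY * 0.20 / 10_000


def blog_reading_seconds(syndicated_blogs, direct_blogs):
    """Server seconds per day attributed to one person's blog reading."""
    return (
        syndicated_blogs * PLANET_SECONDS_PER_READER
        + direct_blogs * DIRECT_SECONDS_PER_READER
    )


# 100 blogs read via Planet feeds plus a dozen visited directly.
total = blog_reading_seconds(100, 12)
print(f"{total:.1f} server seconds per day")
```

That comes to a bit under 50 seconds, which is close enough given how rough the inputs are.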
I have a file server at home which is also a desktop system for my wife. In terms of buying and running systems that doesn’t count as an extra server as she needs to have a desktop system anyway and using bigger disks doesn’t make much difference to the power use (7W is the difference between a RAID-1 server and a single disk desktop system). I also have a PC running as an Internet gateway and firewall.
Running servers at home isn’t making that much of an impact on my computer power use as there is only one dedicated 24*7 server and that is reasonably low power. But having two desktop systems on 24*7 is a significant factor.
No matter how things are counted or what numbers we make up it seems clear that having a desktop system running 24*7 is the biggest use of power that will be assigned to one person. Making PCs more energy efficient through better hardware design and better OS support for suspending would be the best way of saving energy. Nothing that can be done at the server side can compare.
Running a server that is only really used by three people is a significant waste by the standards of the NYT article. Of course the thing is that Hetzner is really cheap (and I’m not contributing any money) so there isn’t a great incentive to be more efficient in this regard. Even if I allocate some portion of the server use to blog readers then there’s still a significant portion that has to be assigned to me for my choice to not use a managed service. Running a mail server for a small number of users and running a DNS server and a SE Linux development server are all ways of wasting more power. But the vast majority of the population don’t have the skills to run their own server directly, so this sort of use doesn’t affect the average power use for the population.
Nothing else really matters. No matter what Google does in terms of power use, it just doesn’t matter when compared to all the desktop systems running 24*7. Small companies may be less efficient, but that will be due to the difficulty of sharing servers among more people and the fact that below a certain limit you can’t save money by using fewer resources, particularly if you pay people to develop software.
I blame Intel for most of the power waste. Android phones and tablets can do some amazing things, which is hardly surprising as by almost every measure they are more powerful than the desktop systems we were all using 10 years ago and by many measures they beat desktop systems from 5 years ago. The same technology should be available in affordable desktop systems.
I’d like to have a desktop system running Debian based on a multi-core ARM CPU that can drive a monitor at better than FullHD resolution and which uses so little power that it is passively cooled almost all the time. A 64-bit ARM system with 8G of RAM, a GPU that can decode video (with full Linux driver support), and a fast SSD should compete well enough with typical desktop systems on performance while being quiet, reliable, and energy efficient.
Finally please note that most of this post relies on just making stuff up. I don’t think that this is wrong given the NYT article that started this. I also think that my estimates are good enough to draw some sensible conclusions.