Web Server Performance

We Have to Make Our Servers Faster

Google have just announced that they have made site speed part of their ranking criteria for search results [1]. This means that we now need to put a lot of effort into making our servers run faster.

I’ve just been using the Page Speed Firefox Plugin [2] (which incidentally requires the Firebug Firefox Plugin [3]) to test my blog.

Image Size

One thing that Page Speed recommends is to specify the width and height of images in the img tag so the browser doesn’t have to change the layout of the window every time it loads a picture. The following script generates the HTML that I’m now using for my blog posts. I run “BASE=http://www.coker.com.au/blogpics/2010 jpeg.sh foo.jpg bar.jpg” and it generates HTML code that only needs the text for the alt attribute to be added. Note that this script relies on a naming scheme where foo-big.jpg is the maximum resolution version of a picture and foo.jpg is the small version. Anyone with some shell coding skills can change this of course, but I expect that some people will instead adopt this naming scheme for their new pictures.

#!/bin/bash
set -e
while [ "$1" != "" ]; do
  # identify prints "foo.jpg JPEG 640x480 ..." - field 3 is the resolution
  RES=$(identify "$1" | cut -f3 -d' ')
  # the HTML width and height attributes take bare numbers, not CSS units
  WIDTH=$(echo "$RES" | cut -f1 -dx)
  HEIGHT=$(echo "$RES" | cut -f2 -dx)
  # the small foo.jpg links to the full resolution foo-big.jpg
  BIG=$(echo "$1" | sed -e 's/\.jpg$/-big.jpg/')
  echo "<a href=\"$BASE/$BIG\"><img src=\"$BASE/$1\" width=\"$WIDTH\" height=\"$HEIGHT\" alt=\"\" /></a>"
  shift
done
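
For example, assuming a hypothetical 500x375 picture foo.jpg with a matching full-resolution foo-big.jpg in the current directory, a run of the script would produce output like this:

BASE=http://www.coker.com.au/blogpics/2010 jpeg.sh foo.jpg
<a href="http://www.coker.com.au/blogpics/2010/foo-big.jpg"><img src="http://www.coker.com.au/blogpics/2010/foo.jpg" width="500" height="375" alt="" /></a>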

Thanks to Brett Pemberton for the tip about using identify from ImageMagick to discover the resolution.

Apache and Cache Expiry

Page Speed complained that my static URLs didn’t specify a cache expiry time. This didn’t affect things on my own system, as my Squid server forcibly caches some things without being told to, but it would be a problem for some other users. I first ran the command “a2enmod expires ; a2enmod headers” to configure my web server to use the expires and headers Apache modules. Then I created a file named /etc/apache2/conf.d/expires with the following contents:

ExpiresActive On
ExpiresDefault "access plus 1 day"
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType text/css "access plus 1 day"
# Set up caching on media files for 1 year (forever?)
<FilesMatch "\.(flv|ico|pdf|avi|mov|ppt|doc|mp3|wmv|wav)$">
ExpiresDefault "access plus 1 year"
Header append Cache-Control "public"
</FilesMatch>
# Set up caching on media files for 1 month
<FilesMatch "\.(gif|jpg|jpeg|png|swf)$">
ExpiresDefault "access plus 1 month"
Header append Cache-Control "public"
</FilesMatch>
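
To check that the new headers are actually being sent, you can request a static file and inspect the response headers; a quick test (with a hypothetical URL) looks like this:

# show the cache-related headers for a static file
curl -sI http://www.example.com/images/foo.jpg | grep -iE 'expires|cache-control'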

DNS Lookups

Page Speed complains about DNS names that are used for only one URL. One example of this was the Octofinder service [4], a service to find blogs based on tags; as I don’t seem to get any traffic from it I just turned it off. Using a single URL from their web site was the only sensible option in this case, but I had been considering removing the Octofinder link for a while anyway. As an aside, I will be interested to see if there are comments from anyone who has found Octofinder to be useful.

I’ve also disabled the widget that used to display my score from Technorati.com. It wasn’t doing what it used to do, the facility of allowing someone to list my blog as a favorite didn’t seem to provide any benefit, and it was costing extra DNS lookups and data transfers. I might put something from Technorati on my blog again in future as they used to be useful.

Cookies

If you have static content (such as images) on a server that uses cookies then the cookie data is sent with every request, which requires transferring more data and breaks caching. So I modified the style-sheet for my theme to reference icons on a different web server; this will supposedly save about 4K of data transfer on each page load while also giving better caching.
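
The kind of change involved is trivial; here is a sketch (with a hypothetical theme path and static hostname) that rewrites the relative icon URLs in a theme’s style-sheet to point at the separate server:

# rewrite relative icon URLs to use a separate cookie-free host
sed -i 's|url(images/|url(http://static.example.com/theme/images/|g' \
  wp-content/themes/mytheme/style.css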

The down-side of this is that I have my static content on a different virtual server, so updating my WordPress theme will now require updating two servers. This isn’t a problem for the theme (which doesn’t get updated often) but would be a problem if I did the same thing with plugins.
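
One way of reducing that burden is to copy the static files across whenever the theme is updated; a sketch (again with hypothetical paths and hostname) using rsync:

# push the theme's static files to the server that hosts them
rsync -av wp-content/themes/mytheme/images/ \
  static.example.com:/var/www/static/theme/images/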

Conclusion

The end result is that my blog now gets a rating of 95% from Page Speed where it previously got 82%. Now most of the top items flagged by Page Speed are references to Google, although there is still work for me to do.

Also it seems that Australia is now generally unsuitable for hosting web sites that will be viewed in other countries. I will advise all my clients who do international business to consider hosting in the US or the EU.

10 comments to Web Server Performance

  • etbe

    nine: It’s affecting about 1% now, but once they have the data they can use it more often and for different purposes. Also it would suck to be in that 1% of the web – and you can’t determine if you are.

    But there’s no harm in making your web site faster anyway, so Google has just given us more incentive to do what we already wanted to do.

  • Interesting stuff, thanks Russell!

    Out of interest were you playing with Page Speed 1.6 or the 1.7 beta ?

    The 1.6 version doesn’t seem to recognise the Cache-Control: headers that your example config sets, which seems a little odd. I’ve verified that the config is working by sending HEAD requests by hand and getting things like:

    Cache-Control: max-age=86400

    and:

    Cache-Control: max-age=2592000, public

    so that looks all good..

  • Josh

    Only 1% for now.. but even that 1% is totally skewed in favour of sites with larger resources (generally speaking).

  • btmorex

It may not be *that* important for ranking, but you also have to think about users. I know that if I’m going through search results and a page doesn’t at least mostly load within 2-3 seconds, I just give up and go to the next result.

Hmm, I think you may be overestimating the impact.

    I suspect they need to be conservative because “loading all objects” is an unrealistic metric of page load speed for many sites. I dread to think, but no doubt someone will suggest more asynchronous loading of content via Javascript – web pages will contain no more than “onload=…” and enough text to fool the search engines.

    Think how long it takes to load facebook – without Javascript I get about 6 seconds, 166KB most of which is a 304 response – all but a few Kb changes between each request. I’m guessing this is well within penalty levels.

    So I’d assume this is about penalizing sites that take an inordinate amount of time to load.

    Similarly we load static content from a different site to encourage better threading behaviour. Now how does that affect the ranking of the site with the dynamic content?

    On the upside if the feedback on the webmaster blog persuades Google to sort out the performance of their Analytics server – it might be beneficial. It can be very annoying after you’ve spent ages making a site load a lot of content as quickly as possible to discover the users perceived load time is slow because of the analytics.

  • foo

    Apropos performance…

identify -format '%wx%h\n' $1

re: cookie-less server:
why not just create a VirtualHost (in your Apache setup) on the same machine and put the document root at your images/media folder?
This way there’s no extra overhead when updating the template etc.

  • Rodney

Actually, I totally forgot about my little Octofinder thing until I checked my Google Analytics and saw they sent me ~1,500 visitors last month and about a thousand the month before.

  • etbe

    Rodney: Strangely Octofinder gave me almost no hits (not in the top 60 referrers – so less than 23 hits per month).

    If you can offer some startling tips on how to get Octofinder to do some good then I’ll be interested to read it and will consider giving Octofinder another go. But I’m not going to enable it again only to get almost no traffic.