permalinks in wordpress, Apache redirection, and other blog stuff

When I first put my new blog online I didn’t think to set the custom permalinks option to avoid having /index.php in all URLs (which wastes a few bytes and looks nasty).

So I decided to change to better URLs, but unfortunately many people had already bookmarked the bad URLs. I wanted to give an HTTP 301 redirection when someone uses the old index.php version (so that bookmarks get updated) and then internally rewrite the request to the PHP file. Unfortunately having a redirection from ^/index.php to a version without it and then a local rewrite to include index.php again doesn’t seem to work (any advice would be appreciated). So I put the following in my /etc/wordpress/htaccess file (the location for such things in Debian) so that foo.php is used instead, where foo.php is a symlink to index.php. I’m wondering whether I should file a bug report against the Debian package requesting that a symlink be in the package to facilitate such things – if it’s not possible to do what I desire without the symlink.

RewriteEngine On
RewriteBase /
#RewriteCond %{REQUEST_URI} ^/index.php/?(.*$) [NC]
#RewriteRule . /%1 [R=301,L]
# Never rewrite /robots.txt to WordPress (see Update2 below).
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
#RewriteRule . /foo.php%1 [L]
RewriteRule . /index.php%1 [L]

Update: I am now using the permalink-redirect plugin (thanks for the tip Method) which solves the problem of the obsolete URLs as well as solving the problem of having two representations of the URL (with and without a trailing slash). I have updated the above htaccess file sample to reflect my new configuration (with the old settings commented out for the benefit of people who don’t want the permalink-redirect plugin).
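To make the intent of the commented-out 301 rule concrete, here is a minimal Python sketch of the mapping it describes. It reuses the pattern from the commented-out RewriteCond; the function name and the sample paths are made up for illustration, and it models only the URL mapping, not how Apache processes rules.

```python
import re
from typing import Optional

# Same pattern as the commented-out RewriteCond; IGNORECASE mirrors [NC].
OLD_STYLE = re.compile(r"^/index.php/?(.*$)", re.IGNORECASE)


def redirect_target(path: str) -> Optional[str]:
    """Return the 301 target for an old-style path, or None if no redirect applies."""
    match = OLD_STYLE.match(path)
    if match is None:
        return None
    # Equivalent of the substitution "/%1" in the commented-out RewriteRule.
    return "/" + match.group(1)


if __name__ == "__main__":
    for sample in ("/index.php/2007/05/some-post/", "/index.php", "/2007/05/some-post/"):
        print(sample, "->", redirect_target(sample))
```

A path that already lacks index.php yields None, so it would be left for the local rewrite to handle.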

The way WordPress allows the table prefix to be stored in the MySQL configuration section is very handy. Some time ago I asked for advice on a blog server for multiple users and WordPress-MU was recommended, but it seems that for most situations where you want multiple blogs the non-MU version of WordPress will do the job. It seems that the main benefit of WordPress-MU is that setting up multiple blogs doesn’t require running shell scripts, which for the cases I’m most interested in doesn’t compete with the benefit that the non-MU version has of being packaged in Debian.

On the topic of WordPress in Debian, it’s a pity that none of the plugins are packaged in Debian. I plan to create a repository for plugins and themes that I use if no-one else has started such a repository. I believe that a repository of Debian packages for such things will provide significant benefits to users, including updates for security reasons and having plugins that are known to work (some of the plugins appear to only work on Windows).

Also there are a few issues that I would like to improve in WordPress. One is that the Uncategorised category is selected by default so if I select another category and forget to de-select Uncategorised then it’s a little confusing. Another is that the categories are displayed in the side-bar without mentioning the number of matching posts. The way blogger lists the number of posts per category (and sorts the categories in order) is much more convenient. Also another advantage of blogger is the handling of archives where you can click on a month to see a list of the names of all posts in that month. I’m not about to go back, but it would be nice to have those features. Does anyone have any ideas how to solve these problems?

Update2:
I have added a rule to make robots.txt not redirect. Before adding this rule /robots.txt was redirected to /index.php/robots.txt, which caused a WordPress page to load. This wasted a lot of bandwidth (robots.txt is hit often) and probably caused some spiders to ignore my site.

comment spam

The war on comment-spam has now begun. It appears that Blogger might have some anti-spam measures of which I was unaware. Otherwise it’s a strange coincidence that I get a huge number of comment spams for extremely hard-core porn from the Ukraine so soon after starting a WordPress blog.

About 24 hours before the spam attack there was a strange blog comment that linked to google (with no offensive or spammy content). It appears that leaving it online was my mistake: when I left it online for a day the spammer decided that I might also leave porn spam online. I arrived home this evening to find almost 100 spams in the form of comments and track-backs, and more arriving by the minute. So I used iptables to block a /20 related to the spam and things are quiet now.
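For anyone unsure what blocking a /20 covers, the following Python sketch works out the /20 network that contains a given address. The address is from a range reserved for documentation, not the real spam source, and the iptables command in the comment is only an assumption about what such a rule might look like.

```python
import ipaddress


def covering_block(address: str, prefix_len: int = 20) -> ipaddress.IPv4Network:
    """Return the IPv4 network of the given prefix length that contains address."""
    # strict=False lets us pass a host address and have the host bits masked off.
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)


if __name__ == "__main__":
    block = covering_block("203.0.113.77")  # documentation address, hypothetical
    print(block, "covers", block.num_addresses, "addresses")
    # A matching rule might look like this (assumed, not the command I ran):
    #   iptables -A INPUT -s 203.0.112.0/20 -j DROP
```

A /20 covers 4096 addresses, so it is a fairly blunt instrument that may also block legitimate readers in the same range.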

The moral of the story is to delete anything unusual ASAP in case it encourages the idiots.

I’ve also tightened the anti-spam measures on my blog.

Update:

From now on any short comment that does not add significant meaning will not be accepted on my blog. To the person who submitted many dozens of comments with variants of “nice site” with the idea that the URL listed for the comment author will be visited by readers of my site – nice try. If you genuinely want to send me a message saying “nice blog” then email will work.

In the future I may remove the display of URLs for the comment authors entirely.

Several comments suggested using Akismet to block comment spam. Akismet is free for non-commercial use and charges for commercial use (a suggested threshold being $500 per month in blog revenue).

For the moment I am going to moderate all comments. The number of genuine comments is quite small, so this is no great effort for me. I check the moderation list at least twice a day so there shouldn’t be an excessive delay either.

new blog

I am starting to move my blog to my own WordPress server. Here is the new URL for my main blog (feed), and here is the new URL for my Source-Dump blog (feed) which is now named just “dump”.

WordPress gives me the power to change all aspects of my blog’s operation (including adding plug-ins). It also allows me to correctly display greater-than and less-than characters (the Perl script I use for converting them is at this post – it’s short now but will probably grow).
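The Perl script itself is not reproduced here. As a rough illustration of the same idea, this Python sketch uses the standard library's html.escape to turn the problem characters into entities before pasting text into a post. The function name is made up and this is not a port of my script.

```python
import html


def escape_for_post(text: str) -> str:
    """Replace &, < and > with HTML entities so they display literally."""
    # quote=False leaves quotation marks alone; only &, < and > are converted.
    return html.escape(text, quote=False)


if __name__ == "__main__":
    print(escape_for_post("#include <stdio.h>"))
    print(escape_for_post("if (a < b && b > c)"))
```

The ampersand has to be converted as well, otherwise text that already contains something like an entity would be displayed incorrectly.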

Hopefully the new blog will also solve the date problems that some Planet readers have been complaining about.

I will briefly put the same content on both the old and new blogs. When I’m fully confident in the new blog I’ll stop updating the old one and try to get all Planet installations changed. Anyone who wants to convert their Planet installation to my new blog now is welcome to do so.

lemonup and blog license

I have just updated my previous post about licenses and also explicitly licensed my blog. Previously I had used a Creative-Commons share-alike license for lecture notes to allow commercial use and had not specified what the license is for my blog apart from it being free for feeds (you may add it to a planet without seeking permission first).

Unfortunately the operators of a site named lemonup.com decided to mirror many of my blog posts with Google AdWords. The site provides no benefit to users that I can discover and merely takes away AdWords revenue from my site. It has no listed method of contacting the site owner so it seems that blogging about this and letting them read it on their own site is the only way of doing so. :-#

I’m happy for Technorati to mirror my site as they provide significant benefits to users and to me personally. I am also happy for planet installations that include my blog among others to have a Google advert on the page (in which case it’s a Google advert for the entire planet not for my blog post).

Also at this time I permit sites to mirror extracts of my articles. So for example the porn blogs that post paragraphs of my posts about topics such as “meeting people” with links to my posts don’t bother me. I’m sure that someone who is searching for porn will not be happy to get links to posts about Debian release parties etc – but that’s their QA issue not a license issue. I am aware that in some jurisdictions I can not prevent people from using extracts of my posts – but I permit this even in jurisdictions where such use is not mandated by law.

Lemonup: you may post short extracts (10% or one paragraph) of my posts with links to the original posts, or you may mirror my posts with no advertising at all. If those options are not of interest to you then please remove all content I wrote from your site.

blogger sucks!

If I enter “a < b” in blogger then it works, but if I want the < symbol to be next to some other text (EG for a #include line in C source) then it treats it as an HTML tag. The HTML code for a < symbol also doesn’t work. This doesn’t work regardless of whether I try entering HTML in the HTML editor or entering text in the “Compose” editor. I could deal with this problem if it forced me to strictly use one of the two editors available, but it fails in both!

Do other blog server programs have problems like this? I think that I need to change my blog server just to allow posting C source!

Also do any of the common blog servers allow a file upload? For most of my posts I do all the editing offline, so I’d rather just upload an HTML file instead of pasting the text into the blog editor (and then manually fixing the situations where its idea of formatting differs from mine). If such an upload mode also supported getting a file via HTTP then that would be convenient too. The option of editing a file on a remote server with vi, exporting it via Apache, and then getting it to the blog server via HTTP would work well for me in some important situations.

Update: It sucks more than I thought. The feed misses the “less than” character between “a” and “b” in the above paragraph. My list of requirements in this regard now includes the ability to use such characters in a feed. Actually I want to go the whole hog and be able to include samples of HTML in my blog entries and have them display correctly in the feed.

reviewing blog comments and links

It seems that the swik.net site is mirroring all my blog posts. The site seems to be doing some good things in terms of spreading information about free software and has a good presentation that makes such information easy to read. Having a backup of my blog posts could also be handy if blogger ever does the wrong thing.

However it is a little annoying that when I write a blog post that refers to one of my older posts it will get a link back to swik.net. This is an annoyance for readers who want to see posts that link to mine from outside my blog. So I’ve been deleting those links when I notice them.

Also someone from Brazil has been linking to my posts, which is a good thing. Their blog also causes my blog to list theirs as a link which is also fine. However the problem is that their blog seems to detect me as being from an English speaking country and gives me an English version of the blog (rather than the presumably Portuguese version that has the link to my article). Assuming that someone speaks English because they reside in Australia is a bad idea, and breaking links is a worse one. So I’ve been deleting those links from my blog as they are of no use to people who are detected as English speakers (which comprises the vast majority of my blog readers). When someone blogs about one of my posts I want to see what they wrote, even if all that I can read are the parts that are quoted from me!

Finally I’ve been deleting some comments containing URLs. It seems that there are quite a few people trying to advertise their businesses by posting comments that bear some vague relation to a blog post with their company’s URL included. You have to try harder than that if you want to promote yourself on my blog.

google reader

From a suggestion on my previous blog entry I decided to test out google reader.

The first problem was that it caused Konqueror to SEGV in etch, so I filed a bug report and switched to Firefox.

Next, to add my feeds I had to either export them in OPML format or add them one at a time, as there is no support for pasting in a list of URLs. If I were writing an RSS syndication program I would also make it parse the config files of some of the common programs. Parsing a Planet config file is pretty easy.
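As a rough sketch of how easy parsing a Planet config file could be, the Python below reads an INI-style config and emits a minimal OPML document. It assumes the config uses each feed URL as a section header with a name option, which is how the Planet configs I have seen are laid out. The sample config, feed URLs and function names are made up.

```python
import configparser
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Made-up sample in the assumed Planet layout: one section per feed URL.
SAMPLE = """
[Planet]
name = Example Planet
date_format = %B %d, %Y

[http://example.org/alice/feed]
name = Alice

[http://example.net/bob/rss.xml]
name = Bob
"""


def planet_feeds(config_text: str) -> List[Tuple[str, str]]:
    """Return (url, name) pairs for every section whose header is a feed URL."""
    # interpolation=None so that % signs in date formats are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    feeds = []
    for section in parser.sections():
        if section.startswith(("http://", "https://")):
            feeds.append((section, parser.get(section, "name", fallback=section)))
    return feeds


def to_opml(feeds: List[Tuple[str, str]]) -> str:
    """Build a minimal OPML document with one outline element per feed."""
    opml = ET.Element("opml", version="1.0")
    body = ET.SubElement(opml, "body")
    for url, name in feeds:
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    print(to_opml(planet_feeds(SAMPLE)))
```

Sections that are not URLs (such as the global Planet section) are skipped, so only the feeds end up in the OPML output.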

I added a feed for a friend whose server seems to be down. While doing so I tried to add another feed. Google reader accepted the command to add the second feed but didn’t actually do so – it was fortunate that I was pasting it in, not typing it…

The killer issue is that it seems to be impossible to merge feeds. I want to read both Planet Linux Australia and Planet Debian, and there are some people who are on both planets (EG me). So it makes no sense to do anything other than display both of them in the same view.
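The kind of merge I have in mind can be sketched in a few lines of Python. This assumes each post's permanent link identifies it uniquely across planets. The Entry type, the sample posts and the function name are all hypothetical, and I have no idea how google reader stores feeds internally.

```python
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple


class Entry(NamedTuple):
    link: str
    title: str
    published: datetime


def merge_feeds(*feeds: Iterable[Entry]) -> List[Entry]:
    """Merge feeds into one newest-first list, keeping one copy of each link."""
    unique: Dict[str, Entry] = {}
    for feed in feeds:
        for entry in feed:
            # The first copy seen wins; later duplicates of the same link are dropped.
            unique.setdefault(entry.link, entry)
    return sorted(unique.values(), key=lambda e: e.published, reverse=True)


# Made-up sample data: "My post" is syndicated on both planets.
planet_debian = [
    Entry("http://example.org/alice/1", "Alice on packaging", datetime(2007, 5, 1)),
    Entry("http://example.org/me/7", "My post", datetime(2007, 5, 3)),
]
planet_linux_au = [
    Entry("http://example.org/me/7", "My post", datetime(2007, 5, 3)),
    Entry("http://example.org/bob/2", "Bob on kernels", datetime(2007, 5, 2)),
]

if __name__ == "__main__":
    for entry in merge_feeds(planet_debian, planet_linux_au):
        print(entry.published.date(), entry.title)
```

With this approach a post that appears on both planets shows up once in the combined view.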

At this time it seems that google reader is unsuitable for my use. However it is a fairly slick system and I imagine that it would work quite well for people who have different needs to me. If you want to read the blogs of a few friends then it probably works really well. It just seems not to work well for a set of meshed communities (Debian developers and Linux users in Australia for example).

Please let me know if I somehow missed some configuration options to make google reader do what I want.

planet – resource use

I just noticed that /usr/bin/planetplanet is using about 120M of RAM. This isn’t currently a problem as I’m running it on a machine with 256M of RAM, however I would like to run my web server on a 96M Xen instance. 120M for planetplanet is probably going to cause bad performance on a web server with 96M of physical RAM allocated.

This is a serious problem for me as the Xen server in question can’t be upgraded any more (the motherboard has as much RAM as it can handle).

Are there any other free syndication programs that use less memory?

Are there any good free syndication services that I could use instead of running my own planet?

comment-less blogs

Are comment-less blogs missing the spirit of blogging?

It seems to me that the most significant development about blogging is the idea that anyone can write. Prior to blogs news-papers were the only method of writing topical articles for a mass audience. To be able to write for a news-paper you had to be employed there or get a guest writing spot (not sure how you achieve this but examples are common).

Anyone can start a blog. If there is a community that you are part of which has a planet then it’s not difficult to get your blog syndicated and have some reasonable readership. Even the most popular planets have fewer readers than most small papers, but that combined with the ease of forwarding articles gives a decent readership.

It seems to me that the major characteristic that separates a blog from an online newspaper is the low entry requirements: anyone can create one.

Every news-paper that is remotely worth reading has a letters column to publish feedback from readers. Of course it’s heavily moderated and getting even 50% of your letters published is something to be proud of. But it does create a limited forum to discuss the articles that are published.

It seems to me that creating a blog and denying the readers the ability to comment on it is in some ways making the blog less open than a news-paper column. When such blogs are aggregated in a community planet feed it seems that they go against the community spirit. It also drives people to make one-line blog posts in response, which I regard as a bad thing.

The comments on my blog are generally of a high quality. I’ve had a few anonymous flame comments, but you have to learn to deal with a few flames if you are going to use the net, and people who are afraid to attach their real name to a flame don’t deserve much attention. I’ve had one comment which might have been an attempt to advertise a product (so I deleted it just to be safe). But apart from that the comments are generally very good. I’ve learned quite a few useful things from blog comments: sometimes I mention having technical problems in blog posts and blog comments provide the solution. Other times they suggest topics for further writing.

There are facilities for moderated blog comments that some people use. If you have a really popular blog then it’s probably a good idea to moderate the comments to avoid spam, but I’m not that popular yet and most people who blog will never be so popular. At this time blog moderation would be more trouble for me than it’s worth.

In conclusion I believe that the web should be about interactive communication in all areas, it should provide a level playing field where the participation of all individuals is limited only by time and ability. Refusing comments on blogs is a small step away from that goal.

what defines a well operating planet?

[picture: day 59 of the beard]

At OSDC Mary Gardiner gave a talk titled The Planet Feed Reader: Better Living Through Gravity. During the course of the presentation she expressed the opinion that short dialog based blog entries are a sign of a well running planet.

Certainly if blog posts respond to each other then there is a community interaction, and if that is what you desire from a planet then it can be considered a good thing. Mary seemed focussed on planets for internal use rather than for people outside the community which makes the interaction more important.

However I believe that planets are not a direct substitute for mailing lists. On a mailing list you can reply to a message agreeing with it and expect that the same people who saw the original message will see your reply. Blogs however are each syndicated separately so a blog post in response to someone else’s blog should be readable on its own. A one line post saying “John is right” provides little value to people who don’t know who John is, especially if you don’t provide a link to John’s post that you agree with.

On Planet Debian there have been a few contentious issues discussed where multiple people posted one-line blog entries. I believe that the effective way to communicate their opinions would either be to write a short essay (maybe 2-3 paragraphs) explaining their opinion and the reasons for it, or if they have no new insight to contribute then they should summarise the discussion.

I believe that a planet such as Planet Debian or Planet Linux Australia should not only be a forum for people who are in the community but also an introduction to the community for people who are outside. AOL posts (one-line “me too” posts) don’t help in this regard.

One final thing to note is that blogs already do have a feature for allowing “me too” responses, it’s the blog comment facility…

PS Above is a picture of day 59 of the beard, it was taken on the 5th of December (I’ve been a little slack with beard pictures).