putting HTML codes and other special characters into a blog entry

I often want to write blog posts about HTML and about source code in various languages. One problem is that some of the characters I want to use have special meanings in HTML (e.g. < and >). Another is that I indent source code to make it readable, and I don’t want the leading spaces trimmed from lines.

I initially wrote a simple Perl script to replace characters such as < with HTML codes. I then had to extend it to escape quote characters, because WordPress tries to be smart and changes quotes in a way that might look nice in plain text but is just a pain in code.

The next problem was that when I used the <PRE> tag around some text to preserve the white space, WordPress would double-space the text (i.e. insert a blank line between every two lines of code). This was annoying to read and in some situations would change the meaning of the code! The solution I have found to these problems is to use the script below and not use the <PRE> tag. I also tried the <CODE> tag, but as far as I could see it made no difference to the end result.

The script below is what I currently use. So far it works well with shell scripts, HTML, and XML.

Update: The way that -- is munged by WordPress into a single dash character is something I find particularly annoying. The script already handled this, but I forgot to mention it in the post.

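#!/usr/bin/perl
# Reads text on standard input and writes it to standard output with
# characters that HTML or WordPress would alter replaced by entities.
use strict;
use warnings;

while(<STDIN>)
{
  s/&/&amp;/g;      # must come first, or it would re-escape the entities below
  s/</&lt;/g;
  s/>/&gt;/g;
  s/"/&#34;/g;      # numeric entities stop WordPress converting the quotes
  s/'/&#39;/g;
  s/`/&#96;/g;
  s/--/&#45;-/g;    # stops WordPress merging -- into one dash
  s/  /&nbsp; /g;   # keeps runs of spaces (indentation) from collapsing
  print;
}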
