Why Cyrus Sucks

I’m in the middle of migrating a mail server away from the Cyrus mail store [1]. Cyrus provides a POP and IMAP server and a local delivery agent (which accepts mail via LMTP). It is widely believed that Cyrus gives better performance than other mail stores, but according to a review by linux-magazin.de, Dovecot and Courier deliver comparable (and sometimes better) performance [2].

The biggest problem with Cyrus is that it is totally incompatible with the Unix way. This wouldn’t be a problem if it just worked and displayed reasonable error messages when it failed, but it doesn’t. It often refuses to work as desired, gives no good explanation, and its data structures can’t be easily manipulated. Dovecot [3] and Courier [4] (as well as many other programs) use the Maildir++ format [5]. I have set up a system with Courier Maildrop and with Dovecot as the IMAP server [6] and it works well – it’s good to have a choice! Maildir++ is also reasonably well documented and is an extension of the well-known Maildir format. This means that it’s easy to manipulate things if necessary: I can use mv to rename folders and rm to remove them.
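To illustrate what manipulating a Maildir++ store with mv and rm looks like, here is a minimal sketch. It works on a throwaway temporary directory, and the folder names (Sent, Junk, Sent-2008) and the make_folder helper are made up for the example; it is not taken from any of the servers mentioned above.

```shell
#!/bin/bash
# Sketch: manipulate a Maildir++ store with ordinary file tools.
# The store is a temporary directory; all folder names are hypothetical.

# Create a Maildir++ folder named $2 inside the store $1.
# A folder is a dot-prefixed directory holding new/, cur/ and tmp/
# plus an empty marker file called maildirfolder.
make_folder() {
  mkdir -p "$1/.$2/new" "$1/.$2/cur" "$1/.$2/tmp"
  touch "$1/.$2/maildirfolder"
}

store=$(mktemp -d)
mkdir -p "$store/new" "$store/cur" "$store/tmp"   # the INBOX itself

make_folder "$store" Sent
make_folder "$store" Junk

# Rename a folder: one directory rename, however much mail it holds.
mv "$store/.Sent" "$store/.Sent-2008"

# Delete a folder and everything in it.
rm -rf "$store/.Junk"

# Show what is left in the store.
ls -A "$store"
```

Any subscription list or index that the IMAP server keeps beside the folders is not updated by this, so on a live system it is probably safest to do such changes while the account is idle.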

Cyrus starts with a database (a Berkeley DB file) of all folders in all mailboxes. Therefore it is not possible to move a user from one back-end server to another by merely copying all the files across and changing the LDAP entry (or whatever else contains the primary authentication data) to point to the new server. It also makes it impossible to add or remove folders by using maildirmake or rm -rf. The defined way of creating, deleting, and modifying mailboxes is through IMAP. One of the problems with this is that copying a mailbox from one server to another requires writing a program to open IMAP connections to both servers at once (tar piped through netcat is much faster and easier). Also, if you need to rename a mailbox that contains many gigabytes of mail then it will be a time-consuming process (as opposed to a fraction of a second for mv).
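For comparison, this is roughly what the tar pipe approach looks like for a store that consists only of files. The sketch below is an illustration under assumptions: it copies between two temporary directories on one machine, the user name alice and the copy_mailbox helper are invented, and on a real migration the middle of the pipe would run over ssh or netcat.

```shell
#!/bin/bash
# Sketch: copy one user's Maildir with a tar pipe.
# Both "servers" here are temporary directories on the local machine.

# $1 = parent directory on the source, $2 = user, $3 = parent on the destination
copy_mailbox() {
  mkdir -p "$3"
  tar -C "$1" -cf - "$2" | tar -C "$3" -xpf -
}

src=$(mktemp -d)
dst=$(mktemp -d)

# Build a tiny source mailbox with one message and one sub-folder.
mkdir -p "$src/alice/new" "$src/alice/cur" "$src/alice/tmp" "$src/alice/.Sent/cur"
echo "test message" > "$src/alice/cur/1.msg"

copy_mailbox "$src" alice "$dst"

# Between two machines the same pipe could be split across ssh, for example
# (newhost is a hypothetical host name):
#   tar -C /mail/example.com -cf - alice | ssh newhost 'tar -C /mail/example.com -xpf -'
```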

Cyrus has a tendency to break, while Dovecot is documented as being self-healing and Courier also seems to cope well in the face of a corrupted mail store. Even manually repairing problems with Cyrus is a painful exercise. The Cyrus mail store is also badly designed – and its design was worse for older filesystems (which were common when it was first released) than it is for modern ones. The top level of a Cyrus mailbox directory contains all the messages in the INBOX stored one per file, as well as three files containing Cyrus indexes and sub-directories for each of the sub-folders. So if I want to discover what sub-folders a mailbox has then I can run ls and wait for it to stat every file in the directory, or I can use an IMAP client (which takes more configuration time). In a Maildir++ store, by contrast, every file that contains a message is stored in a folder subdirectory named “new”, “cur”, or “tmp”, which means that I can run ls on the main directory of the mail store and get a short (and quick) result. Using tools such as ls to investigate the operation of a server is standard practice for a sysadmin, so it should work well!
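As an example of how cheap that investigation is with Maildir++, the following sketch lists the sub-folders of a store by looking only at the top-level directory. The list_folders helper and the folder names are invented for the illustration, and -mindepth and -maxdepth are extensions found in GNU and BSD find.

```shell
#!/bin/bash
# Sketch: list the folders of a Maildir++ store without touching any message.

# Print the folder names (without the leading dot) of the store at $1.
# Only the top-level directory is read, so the cost does not depend on
# how many messages the INBOX holds.
list_folders() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -name '.*' | sed -e 's|.*/\.||' | sort
}

# Build a small throwaway store to try it on.
store=$(mktemp -d)
mkdir -p "$store/new" "$store/cur" "$store/tmp"
mkdir -p "$store/.Sent/cur" "$store/.Lists.debian/cur"
touch "$store/cur/msg1" "$store/cur/msg2"

list_folders "$store"
```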

A final disadvantage is that Cyrus seems to have many small and annoying bugs (such as the reconstruct program not correctly recursing into sub-folders). I guess that such things don’t get fixed because not many people use Cyrus.

One trivial advantage of Cyrus is that by default it splits users into different sub-directories according to the first letter of the account name. Dovecot supports using a hash of the user-name instead. This is better for performance than splitting by first letter (it gives a more equal distribution) but makes it slightly more difficult to manipulate the mail store by script. Ext3 can give decent performance without a two-level directory structure for as many as 31,998 sub-directories (the maximum that it will support) due to directory indexing and Linux caching of dentries. There may be some other advantages of Cyrus, but I can’t think of them at the moment.
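The difference between the two splitting schemes can be sketched as follows. This is only an illustration: cksum is used as a stand-in hash and is not the function that Dovecot uses, and the account names and helper functions are made up.

```shell
#!/bin/bash
# Sketch: two ways of choosing a sub-directory for an account.

# First-letter split: many accounts can pile up under popular letters.
first_letter_dir() {
  printf '%s\n' "$1" | cut -c1
}

# Hash split into 256 buckets named 00..ff.
# ASSUMPTION: cksum (a CRC) is only a stand-in for a real hash function.
hash_dir() {
  sum=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  printf '%02x\n' $((sum % 256))
}

# Show both directory choices for a few hypothetical accounts.
for u in alice adam andrew bob; do
  echo "$u: first letter $(first_letter_dir "$u"), hash $(hash_dir "$u")"
done
```

With the first-letter scheme a script can work out the directory from the account name at a glance, while with the hash scheme it has to repeat the same hash calculation as the server.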

Here is a script I wrote to convert Cyrus mailboxes to Maildir++. To make it usable for a different site you would need to substitute a different domain name for example.com (or write extra code to handle multiple domains) and insert commands to modify a database or directory with the new server name. There is no chance of directly using this script on another system, but it should give some ideas to people performing similar tasks.

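#!/bin/bash -e
# $1 is the user to move
# refuse to run without a user, otherwise the rm below would remove $BASEDIR
if [ -z "$1" ]; then
  echo "usage: $0 USER"
  exit 1
fi

# BASEDIR is the directory for temporary files
BASEDIR=/mail/new
rm -rf "$BASEDIR/$1" archive.tgz archive2.tgz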

ORIGSRV=$(echo "MYSQL_COMMAND_TO_GET_SERVER" | mysql dbname)
echo "original server was $ORIGSRV"

if [ "$ORIGSRV" = "" ]; then
  echo "No server for account, strange error."
  exit 1
fi

ORIGPARENT="/mail/domain/e/example.com/$(echo "$1" | cut -c1)/user"

# copy the data to a local archive
# USERDIR is for using ^ instead of . as Cyrus does.
mkdir -p "$BASEDIR"
cd "$BASEDIR"
USERDIR=$(echo "$1" | sed -e 's/\./^/g')
ssh "cyrus@$ORIGSRV" "cd '$ORIGPARENT' && tar czf - '$USERDIR'" > archive.tgz
tar xzf archive.tgz
if [ "$USERDIR" != "$1" ]; then
  mv "$USERDIR" "$1"
fi

DEPTH=$(find . -mindepth 3 -type d | wc -l)
if [ "$DEPTH" != "0" ]; then
  echo "too deep"
  exit 1
fi

# remove cyrus index files
find "$1" -name "cyrus.*" -exec rm {} \;
cd "$1"

# "mail" is a directory with sub-folders, every other name starting with
# [a-zA-Z] is a regular folder that needs to become a Maildir
for n in [a-zA-Z]* ; do
  if [ -d "$n" ] && [ "$n" != "mail" ]; then
    cd "$n" ; mkdir new cur tmp ; touch maildirfolder
    mv [0-9]* new || true
    cd .. ; mv "$n" ".$n"
  fi
done

# move the inbox files to Maildir storage
mkdir new cur tmp
mv [0-9]* new || true
mv mail/[0-9]* cur || true

# move sub-folders of "mail/" and merge ones with duplicate names
cd mail
for n in * ; do
  [ -d "$n" ] || continue
  if [ -e "../.$n" ]; then
    cd "$n"
    mv [0-9]* "../../.$n/cur" || true
    cd .. ; rmdir "$n"
  else
    cd "$n" ; mkdir new cur tmp ; touch maildirfolder
    mv [0-9]* new || true
    cd .. ; mv "$n" "../.$n"
  fi
done
cd ..
rmdir mail

# get a list of all names so we can recognise duplicates
# (only the Maildir new/cur/tmp directories themselves are filtered out)
find . -type d | grep -Ev "/(new|cur|tmp)$"

cd $BASEDIR

ssh root@localhost "mv '$BASEDIR/$1' /mail/example.com && chown -R mail:mail '/mail/example.com/$1'"
echo "MYSQL_COMMAND_TO_SET_SERVER" | mysql dbname
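
After a conversion like this it is worth checking that no mail went missing. The following is a minimal sketch of such a check, not part of the script above: it counts the message files in a converted Maildir++ store so that the number can be compared with a count taken on the Cyrus side. The count_messages helper and the test store are invented for the example.

```shell
#!/bin/bash
# Sketch: count the messages in a Maildir++ store after conversion.

# Count regular files that live inside any new/ or cur/ directory under $1.
# Marker files such as maildirfolder are not counted.
count_messages() {
  find "$1" -type f \( -path '*/new/*' -o -path '*/cur/*' \) | wc -l | tr -d ' '
}

# Build a small throwaway store: two INBOX messages and one in a sub-folder.
store=$(mktemp -d)
mkdir -p "$store/new" "$store/cur" "$store/tmp"
mkdir -p "$store/.Sent/new" "$store/.Sent/cur" "$store/.Sent/tmp"
touch "$store/.Sent/maildirfolder"
touch "$store/new/1." "$store/new/2." "$store/.Sent/cur/1."

echo "messages: $(count_messages "$store")"
```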

4 comments to Why Cyrus Sucks

  • Anonymous

    Cyrus has another major bug as well. It inherently trusts the Message-Id of mails, and doesn’t save mails with the same Message-Id. In theory this tries to avoid duplicates. In practice, this throws away mail when it shouldn’t. Obviously some mails do have duplicate Message-Id fields, and even “copies” of the same message with the same Message-Id may have headers which differ in an important way (such as the presence or absence of mailing-list headers useful for filtering).

  • Hi, Russel. I don’t like Cyrus either, but a friend of mine has made a manager for Cyrus/LDAP called korreio: http://korreio.sourceforge.net

    Have you tried that?

  • Jobst

    Glaubst Du dass all Leser hier Deutsch lesen koennen????

    http://www.linux-magazin.de/heft_abo/ausgaben/2007/06/auf_der_teststrecke/

    But yes, Cyrus sucks: awkward to set up, a hog, big, clumsy and with so/too **many** features.

    I have to agree with the guys that Dovecot has its place (I use it): it’s fast, lightweight, stable and WELL documented. I use it in a setup where a variety of accesses are required (mobile phones, Outlook, Thunderbird, SquirrelMail, internal and DMZ access) and it has never had a problem … I only once had to kill a Dovecot process in 5 years of using it in 3 different places … and even then I was too lazy to try to find the sub-process that stopped one user reading email!

  • etbe

    http://www.google.com/translate_t?hl=en

    Jobst: In regard to your comment in German (translation “Do you think that all readers here can read German”), I think that all readers here can use Google translation (see the above URL).

    Also, when citing URLs I try to capture the essential point that I am making in my own text so that my post will stand alone without the references. Periodically web sites change and links get broken; my post should still make sense even if the referenced pages go away.

    You could probably even just look at the graphs in the Linux Magazin article to get the point.