One item on my todo list is to set up a bunch of email addresses on sub-domains of domains that I am responsible for (with the consent of all people involved of course) and perform various actions to get the addresses noticed by spammers and measure how effective the various anti-spam measures are. As part of such tests I would click on every URL in every message sent to some accounts and see what difference it makes. My plan is to run a set of Xen virtual machines with different configurations of some common anti-spam measures used in MTAs and see how they fare with sets of accounts with similar publicity. I am not aware of any work having been done in this area (a quick Google search turned up nothing). There are many honeypots for tracking spam sources, matching email address harvesting to spamming, etc. But I’m not aware of any research into the effectiveness of various methods of combatting spam by setting up multiple honeypots. Please inform me via comments if I have missed something!
The most common advice about spam is to NEVER click on the URL that supposedly removes you from a list. When you click on such a URL the spammer can recognise that you actually read the email, and therefore knows that it’s a live address and a good target for more spam. I am not aware of any good studies proving this, which is why it’s one of the things I’d like to investigate. A counter theory (for which there is also a lack of evidence AFAIK) is that spammers used to measure delivery etc, but now that bot-nets are large and cheap it’s easier to just send mail to all possible addresses.
Even though I am not aware of any great evidence to support the idea I avoid clicking on URLs in spam messages. Refraining from hitting the spam web-sites can’t do any harm (it’s not as if the meager contribution to their system load caused by my web browser will cause them a problem).
But today I was tricked. A spammer subscribed me to a Mailman mailing list. As I am subscribed to many lists (about half of which use Mailman), the fact that I didn’t recognise the list name didn’t necessarily mean that I hadn’t signed up to it. After signing in I saw the list archives, which had only one post, and that post was spam. I unsubscribed (there was no other reasonable option open to me) and sent the Mailman message to SpamCop.
This technique will probably be effective for a while. People will think that they subscribed to a list and forgot about it, and that it’s just another list that doesn’t have strong anti-spam measures. That should greatly increase the amount of time taken to black-list the spam server.
So from now on if I receive a spam via a mailing list that I am not familiar with then I’ll send it to SpamCop immediately. Also this is yet another good reason for not subscribing people to mailing lists without their consent (a practice that is far too common – it’s really not difficult to send someone an email asking whether they would like to join the list). If you subscribe me to a list without prior discussion and the first post I receive on the list is a spam then it will be sent to SpamCop and this might result in you being black-listed.
I did a private study once before, and showed that unsubscribing did get you less spam (at least in the short term – 6 weeks). It wasn’t a large study, and was a while ago.
These days of course viewing the images in e-mails is sufficient for the spammer to know an email is being read.
Again, the type of spam filter may be important. These days I use greylisting everywhere, so I only get spam from places that retry, but I wasn’t using that in the study.
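For readers who haven’t met greylisting, here is a minimal Python sketch of the idea behind "only get spam from places that retry". It is an illustration under stated assumptions, not how Postgrey or any particular MTA implements it: the `Greylist` class, its `check` method and the 300 second delay are all hypothetical names and values chosen for the example. The first delivery attempt for a given (client IP, sender, recipient) triplet is deferred, and the message is only accepted when the same triplet comes back after the delay.

```python
import time
from typing import Dict, Optional, Tuple

# A greylisting key: (client IP, envelope sender, envelope recipient).
Triplet = Tuple[str, str, str]


class Greylist:
    """Hypothetical in-memory greylist, for illustration only."""

    def __init__(self, delay_seconds: float = 300.0) -> None:
        # Assumed delay; real deployments choose their own value.
        self.delay_seconds = delay_seconds
        # Time at which each triplet was first seen.
        self.first_seen: Dict[Triplet, float] = {}

    def check(self, client_ip: str, sender: str, recipient: str,
              now: Optional[float] = None) -> str:
        """Return "defer" (temporary failure) or "accept" for one attempt."""
        if now is None:
            now = time.time()
        key = (client_ip, sender.lower(), recipient.lower())
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "accept"  # the sender retried after the delay
        return "defer"       # unknown or too-recent triplet


if __name__ == "__main__":
    g = Greylist()
    print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=0))
    print(g.check("192.0.2.1", "a@example.com", "b@example.org", now=600))
```

A real mail server would answer "defer" with a temporary SMTP failure so that a well-behaved sender queues the message and tries again later, while much bot-net software never retries. A production version would also need persistent storage and expiry of old entries, which this sketch leaves out.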
For my personal use I’m now using Postfix with policyd-weight from Etch, with Postgrey. Most of the spam in my personal email is forwarded from folks doing less well in the spam filtering than this configuration, which is pretty simple to set up.
I recollect a presentation at ApacheCon EU 2005, from one of the SpamAssassin folks. I was well-impressed with the work he’d done, which included exposing test accounts with different forms of protection, and took due account of accessibility to humans. Should be worth googling for.