Rumour has it that some types of spammer target the secondary MX servers. The idea is that administrators often have less control over the secondary MX server and less ability to implement anti-spam measures there, so if the primary accepts all mail relayed from the secondary then a spammer will have more success attacking the secondary server.
True secondary servers are becoming increasingly uncommon; the lower-priority servers listed in MX records tend to have the same configuration as the primary, so the benefit to a spammer of attacking the secondary is probably minimal. But it would be good to know whether they do this.
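As an aside, the priorities are easy to inspect: an MX record pairs a preference number with a host name, and a compliant MTA tries the lowest preference value first, falling back to the higher values only on failure. Here is a minimal sketch of such a lookup in Python, assuming the third-party dnspython module is installed:

```python
# Sketch: list a domain's MX records in the order a compliant MTA
# should try them (lowest preference value first).
# Assumes the third-party dnspython module (pip install dnspython).
import dns.resolver

def mx_hosts(domain):
    answers = dns.resolver.resolve(domain, "MX")
    # Each record pairs a preference number with an exchange host name.
    return sorted((r.preference, str(r.exchange)) for r in answers)

for pref, host in mx_hosts("example.com"):
    print(pref, host)
```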
I decided to analyse the logs from a mail server that I run to see whether I could find evidence of this. I chose a server that I run for a client which has thousands of accounts and tens of thousands of messages delivered per day; my own server doesn’t get enough traffic to give good results.
I analysed the logs for a week for the primary and secondary MX servers to see if the ratio of spam to ham differed. Now this method does have some inherent inaccuracy: some spam will slip past the filters and occasionally a legitimate email will be rejected. But I believe that the accuracy required of a spam filter to avoid making the users scream is vastly greater than that required to give a noteworthy result.
I produced totals of the number of messages delivered, the number rejected by SpamAssassin (which has a number of proprietary additions), the number of message delivery attempts that were prevented due to rate limiting (most of which will be due to spammers), and the number of attempts to deliver to unknown accounts (some of which will be due to spammers having bad addresses in their lists).
For each of these rejection criteria I produced a ratio of the number of rejections to the number of delivered messages for each of the servers.
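I won’t publish the log-parsing code, but the approach is straightforward: classify each log line into one of the four categories and divide each rejection count by the delivery count. A rough sketch of the idea in Python (the patterns are hypothetical, the real ones depend on the MTA’s log format):

```python
# Sketch: tally delivery outcomes from a mail log and compute the
# rejection:delivery ratios. The regular expressions are placeholders;
# real patterns depend on the MTA's log format.
import re
from collections import Counter

PATTERNS = {
    "delivered":  re.compile(r"status=sent"),
    "spam":       re.compile(r"rejected.*SpamAssassin"),
    "rate_limit": re.compile(r"rate limit exceeded"),
    "unknown":    re.compile(r"unknown user"),
}

def tally(logfile):
    counts = Counter()
    with open(logfile) as f:
        for line in f:
            for name, pat in PATTERNS.items():
                if pat.search(line):
                    counts[name] += 1
                    break
    return counts

def ratios(counts):
    delivered = counts["delivered"] or 1  # guard against division by zero
    return {k: round(counts[k] / delivered, 2)
            for k in ("spam", "rate_limit", "unknown")}

for log in ("primary.log", "secondary.log"):
    print(log, ratios(tally(log)))
```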
The rate-limit count didn’t seem useful. While the primary server had a ratio of 0.75 messages rejected due to rate limiting for every message accepted, the secondary had a ratio of 0.08. It seems that the secondary just didn’t get enough traffic to trigger the limits very often. This is an indication that the more aggressive bots might not be targeting the secondary.
The ratio of messages rejected by SpamAssassin to legitimate mail was 0.76:1 on the primary server and 1.24:1 on the secondary. The ratio of messages addressed to unknown users to successful deliveries was 3.05:1 on the primary and 7.00:1 on the secondary! This seems like strong evidence that some spammers are deliberately targeting the secondary server.
In this case both the primary and secondary servers are in server rooms hosted by the same major ISP in the same region. The traceroute between the two mail servers is only 7 hops, and there is only one hop between the two server rooms. So it seems unlikely that there would be some connectivity issue that prevents spammers from connecting to the primary.
One other factor that may be relevant is that the secondary server has been in service for some years while the primary is only a few months old. Spammers who store the server IP address with the email address (which happens – change the DNS records to send your mail to a different server and you will see some spam go to the old server) will still be sending mail to what is now the secondary server. The difference between the rejected mail volume on the secondary server and the amount that would be rejected if it had the same ratio as the primary amounts to 7% of all mail rejected by SpamAssassin and 14% of all mail addressed to unknown users. I think it’s unlikely that any significant fraction of that is due to spammers caching the server IP address for months after the DNS records were changed, so it seems most likely that somewhere between 7% and 14% of spam is specifically targeted at the secondary server.
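To make that calculation concrete, here is the arithmetic with made-up numbers (the real counts are confidential, as noted below): the excess is the rejected mail on the secondary beyond what the primary’s ratio predicts, expressed as a share of all rejections.

```python
# Worked example of the "excess" calculation with made-up numbers
# (the real counts are confidential). The ratios match the post:
# 0.76 rejections per delivery on the primary, 1.24 on the secondary.
primary_delivered, primary_rejected = 100_000, 76_000
secondary_delivered, secondary_rejected = 14_000, 17_360

primary_ratio = primary_rejected / primary_delivered      # 0.76
expected = secondary_delivered * primary_ratio            # 10,640
excess = secondary_rejected - expected                    # 6,720
share = excess / (primary_rejected + secondary_rejected)  # ~7.2%
print(f"excess={excess:.0f} messages, {share:.1%} of all rejections")
```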
While the ratio of spam to ham is significantly worse on the secondary, it is still a relatively small portion of the overall spam sent to the service. I had been considering setting up secondary mail servers with extra-strict anti-spam measures, but the small portion of the overall spam that is targeted in this way indicates to me that it is not going to be worth the effort.
Another thing that has occurred to me (which I have not yet had time to investigate) is the possibility that some spammers send the same messages to all MX servers. If that happens then the ratio of spam to ham would increase every time the number of MX servers is increased. In that case it would make sense to minimise the number of MX servers to reduce the amount of CPU power devoted to running SpamAssassin.
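One way to test that theory would be to extract the Message-IDs logged on each server and look for overlap; anything accepted by both servers in the same period was probably sent to every MX deliberately. A hypothetical sketch, assuming each server’s log has been reduced to one Message-ID per line:

```python
# Sketch: find messages that were sent to both MX servers, as a test
# of the "spammers deliver to every MX" theory. Assumes each input
# file holds one Message-ID per line (a hypothetical format).
def message_ids(path):
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

primary = message_ids("primary-msgids.txt")
secondary = message_ids("secondary-msgids.txt")
both = primary & secondary
print(f"{len(both)} messages were delivered to both servers")
```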
Note that I have intentionally not given any numbers for the amount of mail received by the service as it is a commercial secret.
Update: One thing I realised after publishing this post is that the secondary MX server is also the main server for mail sent between local users. While the number of users who send mail to other users on the service is probably a small portion of the overall traffic (it’s not a really big ISP), it will make a difference to the ratios: local mail inflates the ham count on the secondary, so the true ratio of spam to ham for externally-originated mail on the secondary MX would be even worse than the numbers above suggest (assuming for the sake of discussion that local users aren’t spamming each other).
It’s worth considering that by definition the secondary MX server should be getting no email at all whilst the primary MX is up and receiving mail properly. Therefore the mail coming in to the secondary must be from either A) badly written MTAs which don’t follow the RFC simply because of poor design, B) servers which had a network problem trying to reach the primary, or C) spammers deliberately breaking the RFC.
My gut feeling (or professional hunch, whatever you like) is that those three choices are in a rising order of frequency. You’ll see proportionally more spam on a secondary because the spammers are simply trying to send as much mail as possible through as many avenues as possible, RFCs be damned.
I’ve been doing greylisting on our secondary MX (not the primary) for several years now, with great results. It stops a lot of spam for us, and by using greylisting on the secondary only, legit mail doesn’t get delayed by the greylisting.
Almost all of the incoming mail on our secondary MX is spam, and for those rare times when the primary MX is unavailable and legit mail gets sent to the secondary MX, the mail delivery is delayed anyway, so some extra delay from greylisting doesn’t hurt.
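For anyone unfamiliar with the technique: greylisting temporarily rejects the first delivery attempt from an unseen (client IP, sender, recipient) triplet, and real MTAs retry while most spam software doesn’t. A minimal sketch of the core decision (a real deployment would be a policy daemon or milter with persistent storage and record expiry):

```python
# Sketch of the core greylisting decision: temporarily reject the
# first attempt from an unseen (client IP, sender, recipient) triplet,
# accept retries after a delay. Real deployments use a policy daemon
# or milter with persistent storage and record expiry.
import time

GREYLIST_DELAY = 300   # seconds a sender must wait before retrying
first_seen = {}        # triplet -> timestamp of first attempt

def check(client_ip, sender, recipient):
    triplet = (client_ip, sender, recipient)
    now = time.time()
    if triplet not in first_seen:
        first_seen[triplet] = now
        return "450 try again later"   # temporary rejection
    if now - first_seen[triplet] >= GREYLIST_DELAY:
        return "250 OK"                # a genuine MTA retried, accept
    return "450 try again later"       # retried too soon
```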
Rob: There are a lot of transient issues. For example if I send mail from my own personal mail server (which is less reliable) then it might have a transient outage while trying to contact the primary server but then succeed when talking to the secondary server. So even though the primary server was online all the time, some traffic goes to the secondary. Then there’s the issue of overload: I have a limit on the number of TCP connections that the primary will accept, and if that is reached then clients will connect to the secondary.
Your guess as to the relative frequency of A, B, and C seems reasonable and matches my observation of the data.
Bart: That’s an interesting idea. I’ll suggest that to my client; he’s very much against greylisting but he might like that way of doing it! Thanks.