Recent reports indicate that spam is increasing again. I have been using Exim to filter spam for several years, and some recent tuning has decreased the percentage of spam that reaches my spam filters. This article discusses the techniques used and provides implementation examples.
Spambots tend to be simple programs that don’t handle slow servers very well. Greylisting is an effective method of blocking them, as they usually don’t retry. My latest changes use delays to cause many spambots to abandon their attempt. Greylisting is used only for poorly configured servers that make it to the Recipient command.
Configuration Modifications
I use the Debian split configuration on an Ubuntu system. This makes it easy to add ACLs and supporting configuration. The changes do not alter the standard configuration and can be easily altered or disabled if required. The example ACLs provided here have been edited to remove some local logging code.
The split configuration uses sub-directories under /etc/exim4/conf.d for the various sections of the configuration file. The attached files have a header with a suggested name, prefixed with the sub-directory it belongs to. Files in these directories are processed in alphanumeric order. Comments are stripped when the final configuration is generated, so comment liberally.
I use the file main/00_localmacros. Among other things, it holds the Internet IP and addresses for the server. This file is processed very early, so you can add further overrides to it. Omit the ACL specifications if you do not provide the implementation files.
The RFCs indicate that sending servers should expect the receiving server to delay responses for significant periods. In most cases, they are required to handle delays of up to 5 minutes; only in handling DATA are some minimum timeouts lower. I define short, standard, and long delays for use in different checks. My configuration applies delays cumulatively, so the long delay is limited accordingly.
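As a rough illustration, a macro file along these lines could define the three delays and hook in the local ACLs. The macro names, the delay values, and the ACL names are assumptions made for this sketch. MAIN_ACL_CHECK_MAIL is, to the best of my knowledge, the macro the Debian configuration uses to select the MAIL ACL; check your own main/ files before relying on it.

```exim
# main/00_localmacros (illustrative sketch; names and values are assumptions)

# Delay macros. Keep the sum of all delays that can apply to one
# response well below the 5 minutes a sending server must tolerate.
SHORT_DELAY = 5s
STD_DELAY   = 10s
LONG_DELAY  = 20s

# Hook in the local ACLs. Omit a line if the matching ACL file is absent.
acl_smtp_connect    = acl_local_check_connect
acl_smtp_helo       = acl_local_check_helo
MAIN_ACL_CHECK_MAIL = acl_local_check_mail
```

The macro names are chosen so that none is a substring of another, since Exim substitutes macros textually.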
ACLs (Access Control Lists)
Exim uses ACLs to determine if the email is to be accepted. The ACLs described below are optional additions to the standard Debian/Ubuntu configuration. If they are not provided, the default action is to accept. We use three ACL result codes:
- accept (return a success response);
- deny (return a permanent error); and
- defer (return a temporary error).
Except for the connect ACL, the ACLs described here run after the command is received and determine the response to that command. The connect ACL runs before the banner message is sent and determines the status of the connect response.
The connect and HELO/EHLO ACLs apply to all SMTP connections. We use conditions at the head of each ACL to exempt locally trusted connections and MUAs using the Submission port.
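The exemptions at the head of such an ACL might look like the sketch below. It assumes a hostlist named relay_from_hosts (which the stock Debian configuration defines, as far as I know), that Submission is on port 587, and a hypothetical ACL name.

```exim
acl_local_check_connect:
  # Locally trusted hosts skip every check below.
  accept hosts = +relay_from_hosts

  # Mail clients connecting on the Submission port are exempt as well.
  accept condition = ${if eq{$received_port}{587}}

  # ... checks for all other connections follow here ...
```

Because an accept statement ends ACL processing, none of the delays further down are ever applied to these connections.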
The mail ACL is a replacement for the existing ACL. The default mail ACL consists of an optional check to ensure that a HELO was done first. This check is included in our configuration.
The recipient and data ACLs described here are run by the existing ACLs for corresponding commands. The existing ACLs provide basic checks. The data ACL handles all incoming messages regardless of the source.
Connection Delays
The acl/25_local-config_check_connect file contains a connect time ACL. The banner response is delayed until the ACL returns a result. The delay includes all triggered delays and the time required by the DNS lookups and other processing. With the delays specified in our configuration, the total should rarely exceed a minute. More than 40% of servers listed with Spamhaus abandon their connections at this point.
A check is done to verify the address has the appropriate PTR and A records for a mail server. If the correct DNS entries are missing, a short delay is applied and pipelining is denied. If the DNS lookups cannot be completed, the connection is deferred after a short delay. Unfortunately, some legitimate servers don’t respect the timeout requirements and have poorly configured DNS records. As these are mostly marketing mail servers, you may want to risk some of this mail by increasing the delay.
The primary action used is to apply a long delay for servers that are listed in a couple of trusted DNS blacklists. This delay is repeated in most ACLs. Servers listed in a DNS whitelist run by dnswl.org are exempted from the delay. We also add a long delay to servers that are locally blacklisted.
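The connect-time checks described above could be sketched as the following statements for the body of a connect ACL. The delay values, the choice of zen.spamhaus.org as the blacklist, and the blacklist file path are assumptions; the handling of failed DNS lookups is left out for brevity.

```exim
  # Missing or mismatched PTR/A records: short delay, no pipelining.
  warn  !verify = reverse_host_lookup
        delay   = 5s
        control = no_pipelining

  # Servers whitelisted at dnswl.org skip the blacklist delays.
  accept dnslists = list.dnswl.org

  # Listed in a DNS blacklist: long delay before the banner is sent.
  warn  dnslists = zen.spamhaus.org
        delay    = 20s

  # Locally blacklisted hosts get the same treatment.
  # (Hypothetical file, one host or network per line; it must exist.)
  warn  hosts = /etc/exim4/local_host_blacklist
        delay = 20s

  accept
```

Using warn rather than deny means the connection is still accepted after the delay; the delay itself is what drives impatient spambots away.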
EHLO / HELO Checks
The acl/25_local-config_check_helo file contains an ACL for the HELO command. Once we have received the remote server’s identification, we can begin to verify that it is legitimate or at least properly configured. The HELO banner is easily forged, and the checks here are designed to block many of these forgeries. The ACL enforces these checks:
- The sender identity must be a fully qualified domain name.
- The sender identity must not be a local identity.
- The sender identity must not be an IP address.
- The sender identity must not be a domain literal. (Added because we have enabled Domain Literals for statistical purposes.)
- If an SPF record exists for the server, the server must be approved by the SPF record, or the SPF return must be neutral.
As much as I would like to verify that a valid domain is used, I receive too many valid emails with invalid domains. Unfortunately, these are servers so poorly configured that they do not know their own Internet identity. I am continuing to research this case and will likely begin conditionally blocking servers.
Long delays are applied after all failures. SPF results that are not accepted, other than fail, receive short delays.
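A minimal sketch of the identity checks listed above follows. The ACL name, the delay value, and the wording of the messages are assumptions, and the fully qualified test is deliberately crude (it only looks for a dot). The SPF check is omitted because how it is done depends on how Exim was built.

```exim
acl_local_check_helo:
  accept hosts = +relay_from_hosts

  # Identity must not be a bare IP address.
  deny  condition = ${if isip{$sender_helo_name}}
        delay     = 20s
        message   = HELO/EHLO name must not be an IP address

  # Identity must not be a domain literal such as [192.0.2.1].
  deny  condition = ${if match{$sender_helo_name}{\N^\[\N}}
        delay     = 20s
        message   = HELO/EHLO name must not be a domain literal

  # Identity must be fully qualified (crudely: it contains a dot).
  deny  condition = ${if !match{$sender_helo_name}{\N\.\N}}
        delay     = 20s
        message   = HELO/EHLO name must be fully qualified

  # Identity must not be one of our own domains.
  deny  condition = ${if match_domain{$sender_helo_name}{+local_domains}}
        delay     = 20s
        message   = HELO/EHLO name belongs to this server

  accept
```

The delay modifier is placed after the condition so that only failing servers are made to wait.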
MAIL Checks
More than 75% of the servers listed with Spamhaus abandon the connection before sending a mail command.
The mail command provides an opportunity to check for forged sender identities. Like the HELO identity, the sender address is easily forged. The acl/30_local-config_check_mail file contains an ACL to be applied to the MAIL command. This ACL enforces these checks:
- A HELO command must be issued before the MAIL command.
- The sender must not use a local domain unless the server hosts an approved mailing list.
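The two MAIL checks above might be expressed as in this sketch. The ACL name and the file listing approved mailing list servers are hypothetical; the HELO test follows the form I believe the stock Debian mail ACL uses.

```exim
acl_local_check_mail:
  # A HELO or EHLO must have been given first.
  deny  condition = ${if def:sender_helo_name {no}{yes}}
        message   = HELO or EHLO required before MAIL

  # Outside hosts may not send as one of our domains, unless the host
  # is trusted, authenticated, or an approved mailing list server.
  # (Hypothetical file, one host per line; it must exist.)
  deny  sender_domains = +local_domains
        !hosts         = +relay_from_hosts
        !hosts         = /etc/exim4/local_list_servers
        !authenticated = *
        message        = sender domain not accepted from this host

  accept
```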
The default configuration does sender checks in the recipient ACL. These checks include:
- Optionally, verify that the sender address has a domain to which email can be routed;
- Optionally, verify that the sender’s address can receive mail; and
- Optionally, verify that the sender address is permitted by SPF.
Due to information forgery by legitimate organizations, several checks are difficult to use without whitelists. Some of these are conditionally being tested using the freeze control. This causes the email to be held in the delivery queue if it is accepted. These rules, which may initially require significant administration time, include:
- The HELO identity does not have a valid domain at the second level (example.com rather than com).
- The sender address has a domain part to which email cannot be routed.
- The sender is not permitted by SPF policy.
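A trial rule using the freeze control could look like the fragment below, intended for a MAIL or RCPT ACL. The log text is an assumption, and sender verification stands in here for whichever rule is being tested.

```exim
  # Trial rule: hold, rather than reject, mail whose sender address
  # cannot be verified. If the message is accepted it stays frozen
  # on the queue until it is released or removed by hand.
  warn  !verify     = sender
        control     = freeze
        log_message = trial rule: unverifiable sender <$sender_address>
```

Frozen messages can be thawed with the -Mt option or removed with -Mrm once they have been reviewed.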
Recipient Checks
The 30_local-config_check-rcpt file contains local additions to the ACL for the Recipient command. The recipient command is issued once for each recipient. Our ACL is included in the existing ACL just before the current recipient is accepted. This mechanism is designed to allow local checks to be added easily to either configuration setup. Other than the delays, the checks in the prior ACLs could be implemented here. The same mechanism is used for the data command.
The following rules are implemented:
- Accept signed return path addresses when the sender is either empty or the targeted recipient. Other senders are handled by normal checks. Signing return paths has been described in another post;
- Flag bogus notifications for handling in the Data ACL. This requires callouts to work;
- Deny the recipient if the server is listed in the Spamhaus blacklist;
- Accept the message if it is a notification; and
- Greylist the message if the server looks bogus. (The server fails rDNS and HELO validations and is not whitelisted. This could be more aggressive and greylist if either validation fails.) Greylisting is implemented using the MySQL method described on a variety of sites.
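Two of these rules can be sketched as fragments for the local recipient ACL file. The blacklist zone and the variable name acl_c_greylist are assumptions, and the greylist database lookup itself is not shown; the second statement only marks the connection as a greylisting candidate.

```exim
  # Refuse the recipient when the connecting host is blacklisted.
  deny  dnslists = zen.spamhaus.org
        message  = $sender_host_address is listed at $dnslist_domain

  # Mark hosts that fail both the rDNS and the HELO validation.
  # A later statement (omitted) would consult the greylist database
  # only when this variable is set.
  warn  !verify = reverse_host_lookup
        !verify = helo
        set acl_c_greylist = yes
```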
The checks applied to this point eliminate over 90% of our spam load. This occurs with little overhead and requires only a few DNS lookups. The DNS lookups end up locally cached for use by the spam filter when the Data ACL is run.
DATA Checks
The 40_local-config_check-data file contains local additions to the ACL for the DATA command. The data ACL is the last chance to reject spam. The following rules are implemented:
- Reject messages flagged as a bogus notification in the recipient ACL;
- Scan the message for malware and freeze if any is found;
- Accept messages from local senders and senders using the Submission protocol; and
- Invoke SpamAssassin to check the message content for spam.
To feed our database, we scan all Internet messages. Otherwise, we would skip SpamAssassin based on whitelists and other criteria. Once we have a spam score from SpamAssassin, we run several rules. These include:
- Add a spam status header to the message;
- Accept the message if it is ham (lowest spam scores);
- Reject all spam from postmasters and mailer-daemons;
- Reject all spam that exceeds a high limit; and
- Flag spam in the subject header if we are going to accept it.
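Part of the data-time handling could be sketched as below, as fragments for the local data ACL file. It assumes Exim is built with content scanning, that av_scanner and spamd are already configured, and a rejection threshold of 10.0; the header name and messages are also assumptions. The notification, local sender, and subject rules are not shown.

```exim
  # Hold anything the virus scanner flags.
  warn  malware     = *
        control     = freeze
        log_message = malware found: $malware_name

  # Always run SpamAssassin and record the result in a header.
  # The ":true" suffix makes the condition succeed whatever the score.
  warn  spam       = nobody:true
        add_header = X-Spam-Status: score=$spam_score ($spam_bar)

  # Reject outright above a high limit. $spam_score_int is the
  # score multiplied by ten, so 100 means 10.0.
  deny  spam      = nobody:true
        condition = ${if >{$spam_score_int}{100}}
        message   = message rejected as spam
```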
Notes
When testing new rules, I use two techniques to prevent loss of email. With both, I add hosts to a whitelist or a blacklist as appropriate. The techniques are:
- Defer rather than deny the message.
- Use control = freeze to prevent final delivery if the email is eventually accepted.
The logcheck program checks Exim’s mainlog file and reports lines of interest; normal messages are excluded from the report. All messages which are frozen or deferred are reported. This allows for timely follow-up.
My experience has shown that servers for legitimate bulk and automated email are often poorly configured. This prevents me from applying some rules I would prefer to implement. I do attempt to notify some operators, but in some cases their configuration is so poor that it is nearly impossible. (Thomas Cook Travel, this means you, among others.)