Delivery Errors and Retrying
An MTA has to be prepared to hold messages when hosts are down or temporarily unavailable, so some rules are required for deciding how often to retry and when to give up.
Delivering a message costs resources, so it is a good idea not to retry too often. Exim uses host-based retrying (actually it uses the IP address, not the hostname): if delivery to a host fails temporarily, all messages that are routed to that host are delayed until its next retry time arrives. Information about temporary delivery failures is kept in a hints database called retry; you can read this database by using the utilities exinext or exim_dumpdb. The information in the database includes details of the error, the time of the first failure, the time of the most recent failure, and the time before which it is not reasonable to try again.
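To make the host-based idea concrete, here is a minimal Python sketch of a retry record and the "is it time yet?" check. The class name, field names, and the dictionary keyed by IP address are all my own assumptions for illustration; this is not Exim's on-disk hints format or its real code.

```python
from dataclasses import dataclass


@dataclass
class RetryRecord:
    """Hypothetical in-memory view of one retry hints entry."""
    error: str           # details of the error
    first_failed: float  # time of the first failure (Unix seconds)
    last_failed: float   # time of the most recent failure
    next_try: float      # time before which it is not reasonable to try again


def may_try_now(hints, ip_address, now):
    """Return True if no retry record blocks delivery to this IP address."""
    record = hints.get(ip_address)
    return record is None or now >= record.next_try


# One failing host, keyed by IP address rather than hostname.
hints = {"192.0.2.10": RetryRecord("Connection refused", 1000.0, 1900.0, 2800.0)}
```

Every message routed to 192.0.2.10 gets the same answer from may_try_now, which is why one failing host delays all of its messages together.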
There can be a number of host errors
When a permanent SMTP error code (5xx) is given at the start of a connection, all the addresses that are routed to the host are failed and returned to the sender in a bounce message. The other kind, temporary errors, cause all messages to that host to be deferred and not retried until its retry time has passed.
Two of the most common local errors are
The retry times are the same as for the remote delivery errors above, but the retry delays apply only to deliveries made during queue runs.
Common routing errors can be
Retry processing applies to routing an address as well as to transporting a message, but only for delivery processes started by queue runs. The retry rules make no distinction between routing and transporting a message.
The retry rules are contained in a separate section of the configuration file, which starts at the line "begin retry". Each rule occupies one line and consists of three parts:
retry rule example:
    # a pattern; an error name; list of retry parameters
    domain    error    retries
Note: this is a catch-all rule; the first two fields, * *, mean all domains and all errors.
Exim searches the rules in order until one matches; there is normally a catch-all rule (see above). If no rule matches, the temporary error is converted to a permanent error and the address is bounced after the first delivery attempt. The retry times in a rule are used in turn; once all of them have been used, the error is again converted to a permanent error and the message is bounced. There is an option called retry_interval_max (default 24h) that caps the retry interval, so a message is tried at least once a day; this option prevents you from generating enormously long retry intervals.
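To see how the times in a rule are used in turn, here is a rough Python sketch that turns a retry parameter string into a list of retry times. It assumes the parameter string from Exim's stock configuration, F,2h,15m; G,16h,1h,1.5; F,4d,6h, where F means a fixed interval and G a geometrically increasing one. The function names are mine, the arithmetic is a simplification of what Exim really does, and only single-unit times such as 15m or 2h are handled.

```python
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def seconds(text):
    """Convert a single-unit time such as '15m' or '2h' to seconds."""
    return int(text[:-1]) * UNITS[text[-1]]


def parse(params):
    """Parse 'F,2h,15m; G,16h,1h,1.5' into (letter, cutoff, interval, factor)."""
    steps = []
    for part in params.split(";"):
        fields = [f.strip() for f in part.split(",")]
        if fields == [""]:
            continue  # tolerate a trailing semicolon
        factor = float(fields[3]) if fields[0] == "G" else 1.0
        steps.append((fields[0], seconds(fields[1]), seconds(fields[2]), factor))
    return steps


def schedule(params, interval_max=24 * 3600):
    """Return retry times in seconds after the first failure (simplified)."""
    times, now = [], 0
    for _letter, cutoff, interval, factor in parse(params):
        while now < cutoff:
            # interval_max plays the role of retry_interval_max
            now += min(int(interval), interval_max)
            times.append(now)
            interval *= factor  # factor is 1.0 for F, so F intervals stay fixed
    return times


retries = schedule("F,2h,15m; G,16h,1h,1.5; F,4d,6h")
```

Under these assumptions the first eight retries are 15 minutes apart, the next ones start an hour apart and grow by a factor of 1.5, and the last set is 6 hours apart. When the list runs out, the error is treated as permanent.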
The domain description can use wildcards, e.g. *.datadisk.co.uk; you can also use regular expressions and several forms of lookup.
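The "first match wins" search can be sketched in a few lines of Python. The rule list and the find_rule function are hypothetical, and the matching is much cruder than Exim's: shell-style wildcards stand in for Exim's patterns, and an error name matches only itself or *.

```python
from fnmatch import fnmatchcase

# (domain pattern, error name, retry parameters); illustrative values only
RULES = [
    ("*.datadisk.co.uk", "quota", "F,1h,15m"),
    ("*", "refused_A", "F,2h,20m"),
    ("*", "*", "F,2h,15m; G,16h,1h,1.5; F,4d,6h"),  # catch-all rule
]


def find_rule(rules, domain, error):
    """Return the retry parameters of the first matching rule, or None."""
    for pattern, rule_error, params in rules:
        domain_ok = fnmatchcase(domain.lower(), pattern.lower())
        error_ok = rule_error in ("*", error)
        if domain_ok and error_ok:
            return params
    return None  # no rule: the temporary error is treated as permanent
```

Dropping the catch-all rule from the list makes find_rule return None for anything the earlier rules miss, which corresponds to the address being bounced after the first attempt.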
There are a number of error field values that you can use
Error | Meaning |
auth_failed | Authentication failed |
data_4xx | A 4xx error was received for a DATA command |
lost_connection | The connection closed unexpectedly |
mail_4xx | A 4xx error was received for a MAIL command |
quota | Quota exceeded in local delivery |
quota_<time> | Quota exceeded in local delivery, and the mailbox has not been read for <time> |
rcpt_4xx | A 4xx error was received for a RCPT command |
refused_MX | Connection refused: host obtained from an MX record |
refused_A | Connection refused: host not obtained from an MX record |
refused | Any connection refusal |
timeout_connect_MX | Connection timed out: host obtained from an MX record |
timeout_connect_A | Connection timed out: host not obtained from an MX record |
timeout_connect | Any connection timeout |
timeout_MX | Any timeout for a host obtained from an MX record |
timeout_A | Any timeout for a host not obtained from an MX record |
timeout | Any timeout |
tls_required | A TLS session could not be set up when required |
The times specified are hints, not promises: Exim will try its best to honor them, but they will not be exact. Also, if your queue runner process only runs every 15 minutes, it does not make much sense to specify a retry time of 5 minutes. In other words, don't make a retry interval shorter than the queue runner interval.
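A quick way to see why short retry times are pointless is to work out when the retry can actually happen. This small Python sketch assumes queue runs start at a fixed interval from time zero; the function name and that simple model are my own, not something Exim provides.

```python
import math


def actual_attempt(retry_time, queue_interval, first_run=0):
    """Return the time of the first queue run at or after retry_time (seconds)."""
    if retry_time <= first_run:
        return first_run
    runs_needed = math.ceil((retry_time - first_run) / queue_interval)
    return first_run + runs_needed * queue_interval


# A 5 minute retry time with a 15 minute queue runner.
attempt = actual_attempt(5 * 60, 15 * 60)
```

Here attempt comes out at 900 seconds, so the 5 minute retry time is really a 15 minute one.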
More retry rule examples:
    alice@wonderland.example quota F,7,3H
Note: I will leave you to figure these out
Certain messages could fail for a long period. This could be because the message has several hosts it can be delivered to (multiple MX records for the same domain). It is possible to have different rules for different domains, for example:
message with different MX records:
    # suppose the domain tweedledum.example is routed by MX records to both
    # tweedledum.example and tweedledee.example
    tweedledum.example  *  F,1d,30m;   ## the first route for a message as per the MX record
    tweedledee.example  *  F,5d,2h;    ## the second route for a message as per the MX record
Note: a message may have two routes for delivery (as above); the address will only time out when the retry times for all routes have passed.
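The "all routes must have timed out" rule can be written as a one-line check. This Python sketch uses the cutoffs from the two rules above (1 day and 5 days); the function and dictionary names are hypothetical, and the real decision inside Exim involves more detail than this.

```python
DAY = 86400

# Final cutoff of each host's retry rule, taken from the example rules.
cutoffs = {"tweedledum.example": 1 * DAY, "tweedledee.example": 5 * DAY}


def address_timed_out(failing_for, cutoff_by_host):
    """True only when every host has been failing for at least its cutoff."""
    return all(failing_for[host] >= cutoff_by_host[host] for host in cutoff_by_host)


# Both hosts have been failing for two days: only tweedledum is past its cutoff.
two_days = {"tweedledum.example": 2 * DAY, "tweedledee.example": 2 * DAY}
```

With both hosts failing for two days the address is still kept, because tweedledee.example has not reached its 5 day cutoff.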
I have not documented dial-up connections; you may want to pop over to the official Exim Web Site to get more information on dial-ups.