Very rarely I have uptime issues with my mail server. SMTP is a fairly resilient protocol, and you should generally still receive your mail even if your server has been down for up to a day.
However, longer outages risk your mail expiring and being bounced back to the sender, or simply lost. In the three years I’ve hosted an email server (on both Postfix and OpenSMTPD), it had only one service outage, lasting multiple days, when a badly configured nginx server clogged my machine with error logs. After purging the logs, fixing the config, and upsizing the disk, I found old mail trickling in for the next couple of weeks.
I could have hosted a dedicated backup server that would queue envelopes and relay them to the main server. Although I find the ~$5/mo cost a bit much for a personal site, it would be worthwhile for machines with a much larger userbase.
smtpd’s config action relay backup transforms the daemon from a standard MX to a backup mail exchanger, relaying envelopes to any sibling exchanger with a higher priority.
This tutorial shows how to set up one or more backup MXs that will queue up messages in case the main MX catches fire.
Preparation
I assume you followed through the main tutorial or already have an MX somewhere else. A couple notes about this setup:
- The backup relay’s sole job is to collect mail from the world while the main server is down.
- DKIM signing, anti-spam, and mail submissions will still only be handled by the main server.
This is mainly to make the backup server’s job much simpler. Without owning a mailbox, it doesn’t have to run dovecot, rspamd, or redis. It doesn’t have to manage DKIM or DH keys, or sync mailboxes or credentials with the main server.
This means a backup server has fewer moving parts and can run on a slimmer machine than the main server. Conversely, it also means this won’t protect against data loss from the main server burning down; it only exists to preserve uptime.
For this tutorial, we’ll assume we’ve already set up our main server mail.example.org, and want to set up a backup server mx2.example.org.
Setup DNS
Assuming you’ve already set up DNS for the main server, there’s only one MX record you’d need to add. I’ll pull over the MX record from the main tutorial for comparison:
example.org. MX "mail.example.org" 10
example.org. MX "mx2.example.org" 100
Notice that the record for mx2.example.org has a higher priority number? In MX records, the lowest priority value is tried first, so mail exchangers will attempt to send envelopes to the main MX before falling back to mx2.
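To make the ordering concrete, here is a minimal sketch that sorts MX records the way a sending server is expected to try them. The helper name mx_order is made up for this example, and it assumes input in the "preference hostname" form that dig +short MX prints.

```shell
#!/bin/sh
# mx_order (hypothetical helper): read "preference hostname" pairs on stdin,
# as printed by `dig +short MX`, and list the hosts in the order a sending
# server should try them: lowest preference number first.
mx_order() {
    sort -n -k1,1 | awk '{ print NR ": " $2 " (preference " $1 ")" }'
}

# Sample input mirroring the two records above, deliberately out of order.
printf '%s\n' '100 mx2.example.org.' '10 mail.example.org.' | mx_order
```

Once the records have propagated, something like dig +short MX example.org | mx_order should list mail.example.org first and mx2.example.org second.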
Setup SSL
Like the main server, set up acme-client and httpd to accept challenges from Let’s Encrypt and renew the SSL certificate.
For /etc/acme-client.conf:
authority letsencrypt {
api url "https://acme-v02.api.letsencrypt.org/directory"
account key "/etc/acme/letsencrypt-privkey.pem"
}
domain mx2.example.org {
domain key "/etc/ssl/private/mx2.example.org.key"
domain certificate "/etc/ssl/mx2.example.org.cert"
domain full chain certificate "/etc/ssl/mx2.example.org.fullchain.pem"
sign with letsencrypt
}
For /etc/httpd.conf:
server "mx2.example.org" {
listen on egress port http
block drop
location found "/.well-known/acme-challenge/*" {
root "/acme"
request strip 2
pass
}
}
Check the config, enable httpd, and generate the cert:
$ doas httpd -nf /etc/httpd.conf
configuration OK
$ doas acme-client -nf /etc/acme-client.conf
$ doas rcctl enable httpd
$ doas rcctl restart httpd
httpd(ok)
$ doas acme-client -v mx2.example.org
Setup Smtpd
For /etc/mail/smtpd.conf:
pki mx2.example.org cert "/etc/ssl/mx2.example.org.fullchain.pem"
pki mx2.example.org key "/etc/ssl/private/mx2.example.org.key"
table aliases file:/etc/mail/aliases
table domains { example.org }
# Just like before, we reject residential (dyndns) hosts and any client
# that fails the rDNS or FCrDNS checks. However, for simplicity's sake,
# we leave rspamd's anti-spam logic to the main server.
filter "no_dyndns" phase connect match rdns regex { '.*\.dyn\..*', '.*\.dsl\..*' } \
disconnect "550 no residential connections"
filter "no_rdns" phase connect match !rdns \
disconnect "550 mailserver failed rDNS check"
filter "no_fcrdns" phase connect match !fcrdns \
disconnect "550 mailserver failed FCrDNS check"
filter incoming chain { "no_dyndns", "no_rdns", "no_fcrdns" }
# ---Incoming Mail---
listen on egress tls pki mx2.example.org auth-optional filter incoming
action "local_mail" relay backup
match from any for domain <domains> action "local_mail"
# ---Outgoing Mail---
# We'll still keep smtpd's default behavior of unix users
# delivering mail to other users on this machine
listen on socket
action "internal_mail" mbox alias <aliases>
match from local for local action "internal_mail"
Check the config and restart smtpd:
$ doas smtpd -nf /etc/mail/smtpd.conf
configuration OK
$ doas rcctl restart smtpd
smtpd(ok)
smtpd(ok)
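Before relying on the no_dyndns filter, it can help to see which reverse-DNS names its two patterns would catch. The sketch below is only an approximation: it assumes grep -E interprets the patterns the same way smtpd does, and the helper name is_residential and the hostnames are made up for illustration.

```shell
#!/bin/sh
# is_residential (hypothetical helper): exit 0 if a reverse-DNS name matches
# either pattern used by the "no_dyndns" filter in smtpd.conf.
# Assumption: grep -E treats these patterns the same way smtpd does.
is_residential() {
    printf '%s\n' "$1" | grep -Eq '.*\.dyn\..*|.*\.dsl\..*'
}

# Made-up hostnames, for illustration only.
for host in mail.example.org host-203-0-113-7.dyn.isp.example pool.dsl.isp.example; do
    if is_residential "$host"; then
        echo "$host: rejected"
    else
        echo "$host: accepted"
    fi
done
```

Note that the patterns only match a whole label such as .dyn. or .dsl., so a name like dynamo.example.org would still be accepted.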
Open SMTP on the firewall
If pf blocks incoming connections by default, open up SMTP. Add this line to /etc/pf.conf:
pass in on egress proto tcp to port smtp
Check the config and apply changes:
$ doas pfctl -nf /etc/pf.conf
$ doas pfctl -f /etc/pf.conf
Test
I tested this setup with a fresh domain that had never handled mail before (I’ll be using @example.org as a stand-in), and sent email from @websteading.net to see where the envelope gets sent.
To test that the backup relay is working, I ran smtpctl monitor on mail.example.org, mx2.example.org, and mail.websteading.net. Since I’ve never used the new domain, and have extremely low traffic on websteading.net, the monitor usually looks like this:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
...
That is: no clients connected, no envelopes in the queue, and no activity on each poll.
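Since an idle monitor is mostly rows of zeros, a small filter makes the interesting rows easier to spot. This is a minimal sketch, assuming the monitor output is whitespace-separated as shown above and arrives promptly through a pipe; the helper name active_rows and the sample rows are made up for illustration.

```shell
#!/bin/sh
# active_rows (hypothetical helper): read `smtpctl monitor` output on stdin
# and print only the sample rows where at least one counter is non-zero.
active_rows() {
    awk '
        # Header lines and "..." separators do not start with a number.
        $1 !~ /^[0-9]+$/ { next }
        {
            for (i = 1; i <= NF; i++)
                if ($i != 0) { print; next }
        }
    '
}

# Illustrative rows in the monitor format: one idle poll, one active, one idle.
active_rows <<'EOF'
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
EOF
```

On a live server this would be used as doas smtpctl monitor | active_rows, which should stay silent until a client connects or an envelope moves.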
First Experiment
To start, I had all servers running and sent an email to test@example.org. The monitor for mail.example.org reported a new connection delivering the envelope:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
...
0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
...
…and mx2.example.org showed no traffic. This is expected, since the MX record pointing to mail.example.org has a higher priority (a lower number) than the backup, so with both servers up, the former takes precedence.
Second Experiment
Then, I stopped accepting incoming sessions on mail.example.org:
$ doas smtpctl pause smtp
command succeeded
I sent a new email, and initially there were no monitor changes for either mail.example.org or mx2.example.org. Instead, mail.websteading.net kept the letter in queue:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
And I was able to see a summary of the letter via smtpctl show queue:
a1e24bba4ee4092b|inet4|mta|auth|test23442@websteading.net|someone@example.org|someone@example.org|209094426|209094426|209094426|0|inflight|29
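This shows the envelope’s runstate and timer (the last two values in the row). According to smtpctl(8), the timer is the delay in seconds before the next attempt when the envelope is pending, or the time elapsed when it is inflight, so here the current attempt had been running for about half a minute.
About a minute later, the same attempt was still in flight, with the timer at 93 seconds: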
a1e24bba4ee4092b|inet4|mta|auth|test23442@websteading.net|someone@example.org|someone@example.org|209094426|209094426|209094426|0|inflight|93
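The queue lines are dense, so here is a minimal sketch that pulls out the fields I cared about during this test. The helper name queue_summary is made up, and the field positions are taken from the "|"-separated layout of the sample line above.

```shell
#!/bin/sh
# queue_summary (hypothetical helper): read `smtpctl show queue` lines on
# stdin and print the envelope ID, sender, destination, runstate, and timer.
# Field positions follow the "|"-separated layout of the sample line above.
queue_summary() {
    awk -F'|' '{
        printf "%s from=%s to=%s state=%s timer=%ss\n", $1, $5, $7, $12, $13
    }'
}

# The first queue line from this experiment, copied verbatim.
queue_summary <<'EOF'
a1e24bba4ee4092b|inet4|mta|auth|test23442@websteading.net|someone@example.org|someone@example.org|209094426|209094426|209094426|0|inflight|29
EOF
```

On the sending server this would be run as doas smtpctl show queue | queue_summary, giving one short line per queued envelope.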
Eventually, the monitor changes on mail.websteading.net:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
...
And on mx2.example.org:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
1 1 0 0 0 0 0 0 0 0 0 0 0
1 0 0 1 1 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0 0 0
I then resumed normal operations on mail.example.org:
$ doas smtpctl resume smtp
command succeeded
And then on the main mail server, the envelope is finally delivered:
--- client --- -- envelope -- ---- relay/delivery --- ------- misc -------
curr conn disc curr enq deq ok tmpfail prmfail loop expire remove bounce
0 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
All seems well. After the sending mail server failed to get through to the main MX, it fell back to the backup server instead. The backup server then queued the letter until the main server came back online. Overall, this looks like a success!