Do I need email failover?

For transactional email (password resets, 2FA, order confirmations): yes, if email is core to your product. A few hours of downtime means users can't log in or receive order confirmations. For marketing: usually not, since campaigns can wait. For self-hosted email: strongly recommended, because your server is a single point of failure.

How do I set up a backup SMTP?

Configure a second ESP with domain authentication (SPF, DKIM) and add both ESPs to your SPF record. In your application code, try the primary SMTP first and switch to the backup if the connection fails. Alternatively, use n8n to monitor the primary's health and trigger backup activation.

Won't a backup ESP hurt my deliverability?

Not if properly warmed. The backup needs occasional traffic to maintain reputation. Send a small percentage (5-10%) of routine traffic through the backup so it stays warm. A completely cold backup IP won't deliver well in an emergency.

What about DNS-level failover for receiving?

For incoming email: use multiple MX records with different priorities. Primary MX: your mail server. Secondary MX: a backup provider that queues email. If primary is down, email routes to secondary and gets delivered when primary recovers.

How do I keep the backup warm?

Route 5-10% of your normal sending volume through the backup provider continuously. This maintains a sending reputation on the backup IPs. Use n8n to distribute: 90% through primary, 10% through backup. In failover: shift to 100% backup.

Email Failover and Redundancy Architecture

Failover Architecture

For Transactional Email (Critical)

Application
├── Primary: Postmark (90% of traffic)
│   └── Health check: n8n tests SMTP every 5 minutes
├── Backup: SendGrid (10% of traffic — keeps warm)
│   └── Activates when primary health check fails
└── [SPF](/email-infrastructure/spf-record-optimization) includes both providers

Implementation:

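// Assumes `postmark` is a postmark.ServerClient, `sendgrid` is @sendgrid/mail with
// its API key set, and `alertSlack` is your own async notification helper.
const FROM = process.env.MAIL_FROM // hypothetical env var: a sender verified on both providers

async function sendEmail(to, subject, body) {
  try {
    // Try primary (Postmark)
    return await postmark.sendEmail({ From: FROM, To: to, Subject: subject, TextBody: body })
  } catch (error) {
    // Primary failed: alert the team without delaying the backup send
    console.error('Primary SMTP failed, using backup:', error)
    alertSlack('Primary SMTP failed, backup active').catch(console.error)
    // Send through backup (SendGrid); if this also fails, the caller sees the error
    return await sendgrid.send({ from: FROM, to, subject, text: body })
  }
}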

For Marketing Email (Recommended but Not Critical)

Marketing campaigns can tolerate brief delays. Failover is still good practice — but first ensure proper stream separation:

Primary ESP: Klaviyo/ActiveCampaign for campaigns
Backup plan: If primary is down, postpone send by a few hours
For time-critical sends (flash sales): Pre-configure backup ESP credentials

For Self-Hosted SMTP (Essential)

Self-hosted servers have single points of failure. Build redundancy:

Sending failover: Configure your application to fall back to Mailgun/SES if self-hosted SMTP is unreachable
Receiving failover: Secondary MX pointing to a backup service that queues email
Monitoring: UptimeRobot (free) checks every 5 minutes

DNS for Redundancy

Outgoing (SPF)

Include both primary and backup providers:

v=spf1 include:spf.mtasv.net include:sendgrid.net -all

Both providers stay authorized even while the backup is idle, so failing over requires no DNS change.
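
One way to confirm that a record authorizes both providers is to parse its include: mechanisms. The sketch below is illustrative only: the function names are hypothetical, and it neither follows nested includes nor evaluates the record the way a real SPF validator would.

```javascript
// Hypothetical helper: list the domains named in include: mechanisms.
function spfIncludes(record) {
  return record
    .trim()
    .split(/\s+/)
    .filter((term) => term.toLowerCase().startsWith('include:'))
    .map((term) => term.slice('include:'.length).toLowerCase())
}

// Returns the required domains that the record does NOT include.
function missingProviders(record, requiredDomains) {
  const present = new Set(spfIncludes(record))
  return requiredDomains.filter((domain) => !present.has(domain.toLowerCase()))
}

const record = 'v=spf1 include:spf.mtasv.net include:sendgrid.net -all'
console.log(missingProviders(record, ['spf.mtasv.net', 'sendgrid.net'])) // → []
```

An empty result means both providers are listed. Keep in mind that SPF evaluation is limited to 10 DNS-querying mechanisms (RFC 7208), so each added include: counts against that budget.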

Incoming (MX)

yourdomain.com  MX  10  primary.mail.yourdomain.com
yourdomain.com  MX  20  backup.mail.provider.com

The secondary MX accepts email when the primary is unreachable, queues it, and forwards it once the primary recovers.
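
To see why the lower number wins, here is a minimal sketch of how a sending server orders MX hosts before attempting delivery. The `deliveryOrder` function is a hypothetical name; the record shape matches what Node's `dns.promises.resolveMx()` returns, so the same function could be applied to a live lookup.

```javascript
// Order MX hosts the way a sending server tries them: lowest preference value first.
function deliveryOrder(mxRecords) {
  return [...mxRecords] // copy so the caller's array is not mutated
    .sort((a, b) => a.priority - b.priority)
    .map((record) => record.exchange)
}

// Same shape as the objects returned by dns.promises.resolveMx(): { priority, exchange }.
const records = [
  { priority: 20, exchange: 'backup.mail.provider.com' },
  { priority: 10, exchange: 'primary.mail.yourdomain.com' },
]

// Primary is listed first; the backup is only tried if the primary cannot be reached.
console.log(deliveryOrder(records))
```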

Keeping Backup Warm

A backup SMTP provider with no recent sending history has no reputation. When you fail over, delivery suffers because the IPs are cold.

Solution: Route 5-10% of traffic through backup continuously:

n8n workflow: randomly route 10% of sends to backup provider
Both providers stay warm with active reputation
In failover: shift to 100% backup (already warm)
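
If you split traffic in application code rather than in n8n, the routing rule above can be sketched as a small function. This is an assumption-laden illustration: `pickProvider`, `backupShare`, and `failoverActive` are hypothetical names, and the random source is injectable only to make the behavior testable.

```javascript
// Hypothetical router: choose a provider name for each send.
// backupShare is the fraction of normal traffic that keeps the backup warm.
function pickProvider({ backupShare = 0.1, failoverActive = false, random = Math.random } = {}) {
  if (failoverActive) return 'backup' // primary is down: send everything via backup
  return random() < backupShare ? 'backup' : 'primary'
}

// Rough check of the split over 10,000 simulated sends.
let backupCount = 0
for (let i = 0; i < 10000; i++) {
  if (pickProvider() === 'backup') backupCount++
}
console.log(`backup share: ${(backupCount / 100).toFixed(1)}%`) // close to 10%
```

Random selection per message keeps the split roughly stable at any volume. A counter-based rule (every tenth message) would also work but needs shared state across application instances.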

Health Monitoring

n8n Automated Health Check

Cron (every 5 minutes)
  → SMTP Test: connect to primary on port 587
  → IF connection fails:
    → Attempt 3 retries over 2 minutes
    → IF still failing:
      → Switch application to backup SMTP
      → Send alert to Slack/email
      → Log timestamp for tracking
  → IF connection succeeds after previous failure:
    → Switch back to primary
    → Send "recovered" alert
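
The switching logic in this workflow can be expressed as a small state machine. The sketch below is one possible reading of it, not the n8n implementation: `nextState` and the action names are hypothetical, and it treats one failed check plus three failed retries as the failover trigger.

```javascript
const MAX_RETRIES = 3 // matches "3 retries" in the workflow above

// Pure state transition: given the current state and one check result,
// return the next state plus any action the workflow should take.
function nextState(state, checkOk) {
  if (checkOk) {
    // A success after running on backup means the primary has recovered.
    const action = state.active === 'backup' ? 'switch-to-primary' : null
    return { active: 'primary', failures: 0, action }
  }
  const failures = state.failures + 1
  // One initial failure plus MAX_RETRIES failed retries triggers failover.
  if (state.active === 'primary' && failures > MAX_RETRIES) {
    return { active: 'backup', failures, action: 'switch-to-backup' }
  }
  return { active: state.active, failures, action: null }
}

// Simulate four failed checks followed by a recovery.
let state = { active: 'primary', failures: 0, action: null }
for (const ok of [false, false, false, false, true]) {
  state = nextState(state, ok)
  if (state.action) console.log(state.action)
}
// logs "switch-to-backup", then "switch-to-primary"
```

Keeping the transition pure makes it easy to unit test and leaves the side effects (changing SMTP credentials, posting to Slack, logging the timestamp) to whatever handles the returned action.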

What to Monitor

SMTP port reachable (port 25/587)
SMTP authentication succeeds
Test email delivers successfully
Response time within acceptable range (<5 seconds)
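
The first and last checks in this list can be covered by a plain TCP probe that waits for the SMTP greeting and records how long it took. This is a minimal sketch using Node's built-in `net` module; `probeSmtp` and `isReadyBanner` are hypothetical names, and it does not test authentication or actual delivery. It also assumes a port that sends a plaintext banner (25 or 587), not implicit TLS on 465.

```javascript
const net = require('node:net')

// An SMTP server that is ready to talk opens with a 220 reply.
function isReadyBanner(text) {
  return text.trim().startsWith('220')
}

// Connect to an SMTP port and wait for the greeting.
// Resolves with { ok, ms, detail } and never rejects.
function probeSmtp(host, port = 587, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const started = Date.now()
    const socket = net.createConnection({ host, port })
    const finish = (ok, detail) => {
      clearTimeout(timer)
      socket.destroy()
      resolve({ ok, ms: Date.now() - started, detail })
    }
    // Covers both a slow connect and a server that never sends a banner.
    const timer = setTimeout(() => finish(false, 'timed out'), timeoutMs)
    socket.once('data', (chunk) => {
      const banner = chunk.toString('utf8').trim()
      finish(isReadyBanner(banner), banner)
    })
    socket.once('error', (err) => finish(false, err.message))
  })
}

// Example (hypothetical host): probeSmtp('smtp.example.com').then(console.log)
```

The default 5000 ms timeout mirrors the response-time threshold above; a result with `ok: false` or a high `ms` value is what the health check would treat as a failure.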

Practitioner note: Failover is most critical for SaaS products. A client's Postmark outage (rare, but it happens) would have blocked all password resets if they hadn't had SendGrid as a warm backup. The outage lasted 15 minutes with zero user impact because failover activated automatically. The $20/month cost of keeping a backup warm is insurance.

Practitioner note: For self-hosted email servers, always have a hosted backup. Your VPS can go down (hosting issues, network problems, maintenance). A Mailgun or SendGrid account on pay-per-email pricing ($0-10/month when idle) can act as failover: little or no monthly cost when things are working, immediate coverage when they're not.



v1.0 · March 2026