Email Queue Troubleshooting Tutorial (2026): Fix a Stuck Postfix Mail Queue on a Hosting VPS

A Postfix queue that grows faster than it drains will quietly break signups, password resets, and contact forms. This email queue troubleshooting tutorial shows how to find why Postfix is deferring or bouncing mail on an Ubuntu VPS. It also shows how to fix it without making things worse.

You’re not here to “reinstall Postfix.” You want a repeatable workflow. Identify the failure mode. Confirm it in logs. Change one thing. Then watch the queue recover.

What you’ll troubleshoot (and what you’ll avoid)

This guide focuses on stuck, deferred, or fast-growing Postfix queues on a hosting VPS or dedicated server. Postfix is already installed, and mail used to send normally.

Covered: DNS problems, remote throttling, TLS issues, authentication/relay errors, local resource limits, and safe queue cleanup.
Not covered: full SMTP server setup from scratch (save that for a dedicated build guide).

If you need a full deliverability baseline (SPF/DKIM/DMARC plus reputation checks), see HostMyCode’s email deliverability troubleshooting tutorial and the SPF/DKIM/DMARC setup guide.

Prerequisites for this email queue troubleshooting tutorial

You’ll need shell access as a sudo-capable user. Examples target Ubuntu 24.04 LTS (common for hosting in 2026). They also translate cleanly to Debian 12+.

Postfix installed and running (systemd)
Access to /var/log/mail.log (or journald)
Optional but helpful: a domain you control and its DNS access

Step 1: Confirm the queue is actually stuck (not just busy)

Start by checking the queue’s size and pattern. One bad destination can block thousands of messages. Other mail may still deliver.

sudo postqueue -p

Look for these signals:

Many entries showing the same parenthesized error text (the deferral reason)
A heavy concentration on one recipient domain (for example, gmail.com) with repeated temporary failures
The same queue IDs still listed on later checks, with arrival times that keep getting older (retries appear in the logs, not in the listing)

Quick counts help you gauge severity:

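# Total size and request count (last line of the listing)
sudo postqueue -p | tail -n 1
# Number of messages in the deferred queue (the listing does not label entries "deferred")
sudo find "$(postconf -h queue_directory)/deferred" -type f | wc -l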

If the queue is large and still growing, stop the source of new mail first. Pause newsletter jobs, cron emailers, and any misbehaving apps. Otherwise, you’ll debug while the server keeps getting flooded.
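To see how the backlog splits at a glance, you can tally the entry lines of postqueue -p. The helper below is a minimal sketch, not a Postfix tool. The name queue_summary is made up. It assumes the traditional listing layout, where * after the queue ID marks the active queue and ! marks the hold queue. Everything else lands in "other", which is usually deferred mail and sometimes mail still in the incoming queue.

```shell
# queue_summary: hypothetical helper; reads `postqueue -p` output on stdin.
# Assumes the classic listing format: "*" after the ID = active, "!" = on hold.
queue_summary() {
  awk '
    # Entry lines start with the queue ID and carry size, arrival time, sender.
    /^[0-9A-Za-z]/ && NF >= 6 {
      total++
      if ($0 ~ /^[0-9A-Za-z]+ *\*/)      active++
      else if ($0 ~ /^[0-9A-Za-z]+ *!/)  hold++
      else                               other++
    }
    END { printf "total=%d active=%d hold=%d other=%d\n", total, active, hold, other }
  '
}
```

Typical use: sudo postqueue -p | queue_summary. Run it twice a few minutes apart. If "other" keeps climbing while "active" stays near zero, mail is arriving faster than it leaves.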

Step 2: Pull one queue ID and read the real error

Pick a queue ID from postqueue -p. Then inspect the message and its recent log trail. Example queue ID: 3F2A21C0A1.

QUEUE_ID="3F2A21C0A1"
sudo postcat -q "$QUEUE_ID" | sed -n '1,60p'

Next, pull matching log lines. On Ubuntu with rsyslog this is typically /var/log/mail.log:

sudo grep -F "$QUEUE_ID" /var/log/mail.log | tail -n 50

If your system logs to journald instead:

# On Debian/Ubuntu the daemons usually log under postfix@-.service, so match all postfix units
sudo journalctl -u 'postfix*' --since "2 hours ago" | grep -F "$QUEUE_ID" | tail -n 50

Find the final status line. It will include status=deferred or status=bounced. That line usually includes the key detail: why it failed (DNS, TLS, policy, timeout, auth).
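If you do this often, a small filter can pull out just the status, recipient, and reason from the last delivery attempt. This is a sketch under assumptions: last_status is a hypothetical name, and the sed pattern expects the usual Postfix delivery log shape (to=<...>, then status=... followed by the reason in parentheses). Lines in another shape pass through unchanged.

```shell
# last_status: hypothetical helper; reads Postfix log lines on stdin and prints
# "<status> <recipient> <reason>" for the most recent delivery attempt.
last_status() {
  grep -E 'status=[a-z]+' | tail -n 1 \
    | sed -E 's/^[^<]*: to=<([^>]*)>.*status=([a-z]+) \((.*)\)[[:space:]]*$/\2 \1 \3/'
}
```

Typical use: sudo grep -F "$QUEUE_ID" /var/log/mail.log | last_status. The reason text it prints is what you match against the failure buckets in the next step.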

Step 3: Classify the failure mode (so you fix the right thing)

Most queue explosions come from a small set of causes. Match what you saw in logs to a bucket. Then run the next check.

Error snippet in logs/queue	What it usually means	Your next check
connect to ... timed out / Connection refused	Network path or remote server issue	Test outbound 25/465/587 and check provider blocks
Host or domain name not found. Name service error	DNS resolution failing	Check resolver + domain MX/A records
certificate verification failed / TLS handshake failure	TLS negotiation or certificate mismatch	Test with openssl s_client; review smtp_tls_policy_maps
Relay access denied / Authentication required	Misconfigured relayhost or SASL auth	Check main.cf relay settings and secrets
451 4.7.0 Try again later / rate limited	Remote throttling (common on large providers)	Concurrency + backoff tuning
Mailbox full / 552 5.2.2	Recipient problem, not your server	Decide whether to keep retrying or bounce
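
The table above can be turned into a rough first-pass classifier. Treat this as a sketch. classify_error is a hypothetical helper, the bucket names are made up, and the substrings are only the examples from the table plus close variants. Real logs have more phrasings, so extend the patterns with what you actually see.

```shell
# classify_error: hypothetical helper mapping a log reason to the buckets in
# the table above. The substrings are examples; extend them for your own logs.
classify_error() {
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"name service error"*|*"host or domain name not found"*) echo dns ;;
    *"relay access denied"*|*"authentication required"*) echo relay-auth ;;
    *"certificate verif"*|*"handshake failure"*) echo tls ;;
    *"451 4.7.0"*|*"try again later"*|*"rate limit"*) echo throttling ;;
    *"mailbox full"*|*"5.2.2"*) echo recipient ;;
    *"timed out"*|*"connection refused"*) echo network ;;
    *) echo unknown ;;
  esac
}
```

Pass it the reason text from a status line. An "unknown" result is useful too: it tells you the error needs a manual read before you change anything.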

Step 4: Check outbound connectivity (ports and provider blocks)

If you’re seeing timeouts, verify the basics. Can this VPS reach remote MX hosts on TCP/25?

Many providers block outbound port 25 on new IPs to reduce abuse. When that happens, queues grow fast.

Test TCP/25 quickly. Use the destination from your logs, or a known MX:

sudo apt-get update
sudo apt-get install -y netcat-openbsd dnsutils

# Example: list the MX hosts for gmail.com (results vary)
dig +short mx gmail.com

# Test one of the returned MX hosts (replace the host; -w 5 gives up after 5 seconds)
nc -vz -w 5 gmail-smtp-in.l.google.com 25

If TCP/25 is blocked, your options are limited and clear:

Ask your provider to remove the block (common for verified business use).
Send via a smarthost on 587 with auth (still Postfix, different route).

If you want a VPS designed for predictable hosting operations, start with a HostMyCode VPS (or choose managed VPS hosting if you prefer us to handle the mail stack checks and hardening).

Step 5: Fix DNS resolution issues (the silent queue killer)

When DNS is unreliable, Postfix can’t resolve MX targets. It starts deferring mail. The server looks “up,” but the queue keeps stacking.

Check the current resolver configuration. Then test a couple lookups:

resolvectl status | sed -n '1,120p'

dig +time=2 +tries=1 mx gmail.com

dig +time=2 +tries=1 A example.com

If queries time out:

Confirm your firewall allows outbound UDP/TCP 53.
Switch to stable resolvers (your provider DNS or well-known public resolvers).

On Ubuntu with systemd-resolved, set DNS servers in:

/etc/systemd/resolved.conf

Example (adjust to your policy):

[Resolve]
DNS=1.1.1.1 9.9.9.9
FallbackDNS=8.8.8.8

Then restart:

sudo systemctl restart systemd-resolved
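
One related check is worth a minute on Debian and Ubuntu. The packaged master.cf normally runs the SMTP client chrooted, so Postfix reads a copy of resolv.conf inside its queue directory, not the system file. If that copy is stale, lookups fail inside Postfix even though dig works from your shell. The sketch below compares the two files. resolv_in_sync is a hypothetical helper, and the default chroot path assumes queue_directory is /var/spool/postfix.

```shell
# resolv_in_sync: hypothetical helper. Compares the system resolver file with
# the copy inside the Postfix chroot; both paths are arguments so you can adapt them.
resolv_in_sync() {
  sys=${1:-/etc/resolv.conf}
  jail=${2:-/var/spool/postfix/etc/resolv.conf}
  if [ ! -e "$jail" ]; then
    echo "no chroot copy at $jail (services may not be chrooted)"
    return 0
  fi
  if cmp -s "$sys" "$jail"; then
    echo "in sync"
  else
    echo "differs: restart Postfix so the chroot copy is refreshed"
    return 1
  fi
}
```

Call it with no arguments to use the default paths. On these distributions the copy is normally refreshed when the Postfix service starts, which is why a restart (not just a reload) is the usual fix.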

If you’re also tightening domain-side DNS (SPF, DKIM, DMARC, rDNS) to reduce deferrals and blocks, pair this with HostMyCode’s reverse DNS setup guide. For managing zones, you can keep domains and DNS together via HostMyCode domains.

Step 6: Identify TLS failures and fix the common misconfig

TLS problems often show up as “handshake failure” or “certificate verify failed” in mail.log. The fix depends on direction.

Postfix can be the sending side (smtp client) or the receiving side (smtp server). Queue buildup is usually the sending side failing to negotiate with remote MX servers.

Start by dumping Postfix’s TLS-related settings:

sudo postconf | grep -E '^(smtp_tls|smtpd_tls|tls_)'

Two common gotchas:

If you use smtp_tls_policy_maps, one bad policy entry can force incompatible TLS on a destination and defer mail.
If the system CA bundle is missing or outdated, Postfix can’t validate remote certificates.

Make sure CA certificates are installed and updated:

sudo apt-get install -y ca-certificates
sudo update-ca-certificates

To reproduce the handshake against a remote MX (example only):

openssl s_client -starttls smtp -crlf -connect aspmx.l.google.com:25 -servername aspmx.l.google.com < /dev/null

If that fails from the VPS, treat it as a host-level issue first. It’s usually DNS, a network block, or a middlebox interfering with the connection (for example, a firewall that strips STARTTLS). It’s not usually a Postfix-only setting.
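
If smtp_tls_policy_maps is set, it helps to list the destinations where TLS is mandatory, because those are the entries that can defer mail when the remote side doesn't cooperate. A minimal sketch follows. strict_tls_policies is a hypothetical helper, and it assumes the policy source is a plain "destination level" text file (for example the file behind a hash: map). It ignores continuation lines and extra attributes.

```shell
# strict_tls_policies: hypothetical helper. Lists entries in a TLS policy
# source file whose level makes TLS mandatory for that destination.
strict_tls_policies() {
  awk '
    /^[ \t]*#/ || NF < 2 { next }     # skip comments, blanks, malformed lines
    $2 ~ /^(encrypt|dane-only|fingerprint|verify|secure)$/ { print $1, $2 }
  ' "$1"
}
```

Point it at the file named by postconf -h smtp_tls_policy_maps, without the hash: prefix. If the destination that keeps deferring shows up in the output, test that entry first.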

Step 7: Fix relay/auth errors (common after credential rotation)

If you see Relay access denied or Authentication required, Postfix is likely using a relayhost. It can’t authenticate to that relay.

Review these settings in /etc/postfix/main.cf:

relayhost = [smtp.relay.example]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
# Requiring TLS is often desired on port 587
smtp_tls_security_level = encrypt

Then confirm the secrets file exists and is locked down:

sudo ls -l /etc/postfix/sasl_passwd /etc/postfix/sasl_passwd.db 2>/dev/null
sudo stat -c "%a %U:%G %n" /etc/postfix/sasl_passwd 2>/dev/null

Recommended permissions: 0600 root:root. After editing /etc/postfix/sasl_passwd, rebuild the hash and reload:

sudo postmap /etc/postfix/sasl_passwd
sudo systemctl reload postfix

If you rotated credentials, queued messages will retry using the new config after reload. In most cases, you don’t need to delete anything.

Only delete if the queue is full of junk.
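
Two mistakes cause most post-rotation auth failures: the key in sasl_passwd doesn't match the relayhost value exactly (brackets and port included), or the file was edited but postmap was never rerun. The sketch below checks both. check_sasl_map is a hypothetical helper, and it assumes a hash: map, so the compiled file is the source name plus .db. Postfix can also match by hostname, so a missing exact key is a hint, not proof.

```shell
# check_sasl_map: hypothetical helper. Arguments: the relayhost value and the
# path of the sasl_passwd source file (assumes a hash: map, hence ".db").
check_sasl_map() {
  relay=$1
  src=$2
  # The first field of an entry is the lookup key; look for an exact match.
  if ! awk -v key="$relay" '$1 == key { found = 1 } END { exit !found }' "$src"; then
    echo "no entry keyed exactly as $relay in $src"
    return 1
  fi
  # A source file newer than the compiled map means postmap was not rerun.
  if [ ! -e "$src.db" ] || [ "$src" -nt "$src.db" ]; then
    echo "$src.db is missing or older than $src: run postmap"
    return 1
  fi
  echo "ok"
}
```

Run it from a root shell, since the file should be unreadable to other users: check_sasl_map "$(postconf -h relayhost)" /etc/postfix/sasl_passwd.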

Step 8: Handle rate limiting and deferrals without melting your IP reputation

Large providers often respond with temporary deferrals like 451 4.7.0. The wrong reaction is “send faster.” Higher concurrency can trigger more throttling and prolong the incident.

Check your current limits:

sudo postconf default_process_limit default_destination_concurrency_limit smtp_destination_concurrency_limit smtp_destination_rate_delay smtp_destination_recipient_limit

For a typical small-to-mid hosting VPS sending transactional mail, conservative values usually behave better:

# /etc/postfix/main.cf (examples)
smtp_destination_concurrency_limit = 5
smtp_destination_rate_delay = 1s
smtp_destination_recipient_limit = 20

Reload Postfix after changes:

sudo systemctl reload postfix

If you’re sending bulk email, don’t mix it with transactional mail on the same IP. Split those roles across IPs or servers.
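
Before you tune, a quick arithmetic check sets expectations. A non-zero rate delay limits deliveries to the same destination to at most one per delay interval. A backlog to one domain therefore cannot drain faster than count times delay. The helper below is a hypothetical sketch that prints that lower bound in whole minutes. Real drain time will be longer once retries and remote deferrals are added.

```shell
# min_drain_minutes: hypothetical helper. Lower bound, in whole minutes, for
# draining COUNT deliveries to ONE destination with a rate delay of DELAY seconds.
min_drain_minutes() {
  count=$1
  delay=$2
  # Integer ceiling of (count * delay) / 60.
  echo $(( (count * delay + 59) / 60 ))
}
```

For example, min_drain_minutes 3600 1 prints 60. An hour-long tail after the fix is then expected behavior, not a sign that the queue is stuck again.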

Step 9: Find the “one domain” clogging your queue

Many queues look “stuck” because one destination domain dominates the backlog. Find that domain first. It’s often the fastest win.

This counts recipient domains from the queue (best on moderate queue sizes):

# Recipient lines are the indented, single-address lines; sender and reason lines are skipped
sudo postqueue -p \
| awk '/^[ \t]/ && NF == 1 && $1 ~ /@/ && $1 !~ /^\(/ {d = $1; sub(/.*@/, "", d); print tolower(d)}' \
| sort | uniq -c | sort -nr | head

If the top result is clearly wrong (typo domain, dead campaign list), stop generating those messages at the source. Then decide whether to remove only those queued messages (next step) or let them expire.
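
Grouping by reason answers the companion question: which error accounts for most of the backlog? The sketch below counts the parenthesized reason lines in the listing. top_reasons is a hypothetical helper, and it assumes the classic postqueue -p format where each reason sits on its own line in parentheses. Reasons that embed different hostnames or IPs are counted separately, so treat the counts as a rough guide.

```shell
# top_reasons: hypothetical helper; reads `postqueue -p` output on stdin and
# counts the parenthesized reason lines, most frequent first.
top_reasons() {
  awk '/^[ \t]*\(.*\)[ \t]*$/ {
         sub(/^[ \t]*\(/, ""); sub(/\)[ \t]*$/, ""); print
       }' \
    | sort | uniq -c | sort -nr | head
}
```

Typical use: sudo postqueue -p | top_reasons. If one reason dominates and it names the same domain as the previous command, you have found your clog.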

Step 10: Safe queue cleanup (surgical, not scorched earth)

Wiping the entire queue can feel satisfying. It’s sometimes necessary, but it’s rarely the best first move.

Fix the root cause first. Retry delivery. Then delete only what you’re sure you don’t want.

Re-queue and retry first

# Re-queue everything as fresh mail (useful after fixing DNS/TLS/relay; heavy on very large queues)
sudo postsuper -r ALL

# Force an immediate delivery attempt for anything still waiting
sudo postqueue -f

Delete only messages from/to a specific domain

Postfix doesn’t provide a perfect “delete by domain” command. Scripting it is straightforward. Example: delete all queued mail destined for bad-domain.example.

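DOMAIN="bad-domain.example"

# Remember the queue ID from each entry line, then print it when an indented
# recipient line ends in exactly @$DOMAIN (sender addresses are not matched).
sudo postqueue -p \
| awk -v domain="$DOMAIN" '
    /^[0-9A-Za-z]/ { qid = $1; sub(/[*!]$/, "", qid); next }
    /^[ \t]/ && NF == 1 && $1 !~ /^\(/ { d = $1; sub(/.*@/, "", d); if (tolower(d) == tolower(domain)) print qid }' \
| sort -u \
| while read -r qid; do
    echo "Deleting $qid"
    sudo postsuper -d "$qid"
  done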

Before you delete, verify the queue IDs match what you intend. Run postcat -q on a few IDs as a sanity check.

Delete all mail in the deferred queue (use with care)

If deferred mail built up during an outage you’ve fixed, don’t delete it. Requeue it.

If the deferred queue is mostly spam or a runaway script you must purge, wiping deferred items can be appropriate:

sudo postsuper -d ALL deferred

Step 11: Confirm recovery with logs and metrics you can trust

After changes, don’t guess. Confirm the queue drains and new deliveries succeed.

Queue shrinking: postqueue -p every 2–5 minutes
Clean delivery lines: status=sent in /var/log/mail.log
No repeating deferrals with the same error

Tail the logs while you flush:

sudo tail -f /var/log/mail.log | grep -Ei 'status=sent|status=deferred|warning|error'

For ongoing visibility, set up daily summaries. HostMyCode’s Logwatch setup tutorial is a good low-noise option for mail and auth services.
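
For a number you can compare between checks, tally the delivery outcomes in a slice of the log. This is a small sketch. status_tally is a hypothetical helper that counts every status=... token it sees, so feed it a bounded window (for example the last few thousand lines), not the whole file.

```shell
# status_tally: hypothetical helper; reads log lines on stdin and counts each
# delivery outcome (status=sent, status=deferred, status=bounced, ...).
status_tally() {
  grep -oE 'status=[a-z]+' | sort | uniq -c | sort -nr
}
```

Typical use: sudo tail -n 2000 /var/log/mail.log | status_tally. Recovery looks like status=sent pulling ahead of status=deferred on each run.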

Step 12: Prevent the next queue incident (practical hosting defaults)

You don’t need an enterprise mail platform to avoid late-night queue emergencies. You need a few unglamorous controls that keep failures small and visible.

Quick hardening checklist

Lock down SSH and stop brute-force access before attackers can abuse mail: follow the SSH lockdown tutorial.
Rate-limit abusive endpoints (e.g., WordPress login or contact forms) so they can’t spam mail sends: see the Nginx rate limiting tutorial.
Backups you can restore. If a compromise forces a rebuild, you want clean restores and quick DNS/SSL recovery: use the VPS backup setup guide.
Separate bulk from transactional by design (different IP/server), especially for WooCommerce and membership sites.

Simple monitoring: alert on queue growth

A queue of 500 messages is harmless on one server and a crisis on another. Set a threshold that matches your normal volume. Then alert on growth.

Create a small script:

sudo tee /usr/local/bin/check-postfix-queue.sh > /dev/null <<'EOF'
#!/bin/sh
COUNT=$(postqueue -p | grep -c '^[A-F0-9]')
WARN=${1:-200}
CRIT=${2:-500}

if [ "$COUNT" -ge "$CRIT" ]; then
  echo "CRITICAL: Postfix queue $COUNT"
  exit 2
elif [ "$COUNT" -ge "$WARN" ]; then
  echo "WARNING: Postfix queue $COUNT"
  exit 1
else
  echo "OK: Postfix queue $COUNT"
  exit 0
fi
EOF

sudo chmod +x /usr/local/bin/check-postfix-queue.sh

Run it from cron and mail the output to an admin address, or wire it into your monitoring system.
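
Absolute thresholds miss a queue that is small but climbing fast. One way to alert on growth is to compare the current count with the count from the previous run. The sketch below does that with a state file. queue_growth, the state file path, and the default limit of 100 are all assumptions to adjust for your normal volume.

```shell
# queue_growth: hypothetical helper. Compares the current message count with
# the count saved by the previous run and flags growth at or above a limit.
# Arguments: current count, state file path, growth limit (default 100).
queue_growth() {
  current=$1
  state=$2
  limit=${3:-100}
  previous=$current
  if [ -r "$state" ]; then
    read -r previous < "$state" || previous=$current
  fi
  previous=${previous:-$current}
  echo "$current" > "$state"          # remember this run for the next one
  delta=$((current - previous))
  if [ "$delta" -ge "$limit" ]; then
    echo "GROWING: +$delta since last check (now $current)"
    return 1
  fi
  echo "STABLE: delta $delta (now $current)"
}
```

A possible call from cron: queue_growth "$(postqueue -p | grep -c '^[A-F0-9]')" /var/tmp/postfix-queue.count 100. If you alert by email, send it through a path that doesn't depend on the queue you are watching.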

If you want a monitoring baseline on Ubuntu, use the HostMyCode guide: VPS monitoring tutorial.

Summary: your repeatable workflow

Confirm the queue shape (postqueue -p) and stop new floods.
Pick a queue ID, read the log error, and classify it (DNS, network, TLS, relay, throttling).
Fix one root cause, reload Postfix, then postsuper -r ALL and postqueue -f.
Clean up surgically if you must—avoid wiping the whole queue unless it’s clearly junk.
Add queue alerts so you catch the next issue early.

If your queue problems keep returning, treat that as a signal, not bad luck. You likely need clearer separation of roles, better monitoring, or more headroom than an underpowered box can give you. A HostMyCode VPS gives you predictable resources and full control, and managed VPS hosting is there when you want Postfix diagnostics, DNS alignment, and security checks handled by a team.

FAQ

Should I delete the entire Postfix queue to “fix” delivery?

Only if you’re sure the queue is junk (spam from a compromise or a runaway script). If the cause is DNS/TLS/relay, fix it and requeue with postsuper -r ALL instead.

Why does Postfix keep retrying the same deferred messages?

That’s expected behavior. Postfix uses backoff and retry windows. If the error is permanent (bad domain, mailbox doesn’t exist), bounce it or delete only those specific messages so retries stop.

What’s the fastest way to find what’s clogging the queue?

Scan /var/log/mail.log for repeated error text, then count recipient domains from postqueue -p. One destination domain (or one broken list) often accounts for most deferrals.

Can a firewall cause a stuck mail queue?

Yes. Outbound TCP/25 and DNS (UDP/TCP 53) are common failure points. If you tightened egress rules, confirm mail and DNS are explicitly allowed.

How do I stop WordPress from generating mail floods?

Fix the trigger first (brute-force login, contact form abuse, or a plugin loop). Rate-limit the endpoints at Nginx, and add queue alerts so spikes get caught early.