VPS Disaster Recovery Tutorial (2026): Rebuild a Failed Ubuntu Web Server Fast with IaC-lite, Backups, DNS & SSL

A VPS failure rarely looks dramatic. It’s usually a dead disk, a bad update, a lockout, or a provider issue that turns your site into a 502. This VPS disaster recovery tutorial gives you a practical, repeatable way to rebuild an Ubuntu web server fast, restore what matters, and bring traffic back with clean DNS and SSL handling.

The goal isn’t an enterprise DR program. It’s a runbook you can follow at 2 a.m. on a fresh VPS or dedicated server.

No guessing. No “what did I do last time?”

VPS disaster recovery tutorial: what you’re rebuilding (and what you’re not)

Before you spin up a new server, define what “recovered” means for you. In most incidents, success looks like this: the website loads, email keeps delivering, and you still control admin access.

OS + access: Ubuntu Server (commonly 22.04 LTS or 24.04 LTS), SSH keys, firewall rules.
Web stack: Nginx or Apache, PHP-FPM for WordPress/PHP sites, required extensions, system services.
Application data: site files, uploads, configs (.env/wp-config.php), scheduled jobs, and any secrets.
Database: restore from a dump or data files. This guide stays DB-light; the steps below cover a typical WordPress MySQL restore without deep tuning.
DNS + SSL: bring the domain back online safely, re-issue certificates, verify no mixed-content or HSTS traps.

If you’re hosting client sites or revenue traffic, you’ll usually recover faster on a clean new instance.

Trying to “repair” a corrupted box often burns time and adds risk.

That’s where a solid VPS provider matters: fast provisioning, stable networking, and snapshots you can rely on. If you want predictable rebuild times, start with HostMyCode VPS so you can spin up a fresh node in minutes.

Prep work you should do before the next incident (15 minutes that saves hours)

You can run this tutorial with zero prep. You’ll just spend more time searching for details and recreating configs by hand.

If you can spare 15 minutes today, do this now.

Lower DNS TTL for critical records (A/AAAA for your site, MX if you self-host mail). Use 300 seconds for most small sites.
Store your “golden” configs off-server: Nginx vhosts, systemd unit files, cron entries, firewall rules. A private Git repo is fine.
Keep a credential inventory: domain registrar login, DNS provider, SSH key locations, database credentials, SMTP relay creds.
Document your stack (just one page): OS version, web server, PHP version, docroot, and where logs live.
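To confirm the TTL change actually took effect, one option is to read the TTL column from dig's answer section. The sketch below assumes dig is installed; the helper name max_ttl and the NS_HOST placeholder are made up for illustration. Querying an authoritative nameserver shows the configured TTL, while a recursive resolver shows the remaining cache time instead.

```shell
# max_ttl: read `dig +noall +answer` lines on stdin and print the largest TTL.
# Answer lines look like: example.com. 300 IN A 203.0.113.10
max_ttl() {
  awk '
    NF >= 5 && $2 ~ /^[0-9]+$/ { if ($2 + 0 > max) max = $2 + 0 }
    END { print max + 0 }
  '
}

# Typical use (NS_HOST is a placeholder for one of your authoritative nameservers):
#   dig +noall +answer example.com A @NS_HOST | max_ttl
```

If the number printed is still in the thousands, the old TTL is what resolvers will honor during the next cutover.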

For a deeper, dedicated backup implementation, see HostMyCode’s VPS backup setup guide. In this DR runbook, we’ll stay focused on execution: rebuild, restore, validate, cut over.

Step 1: Provision a clean replacement VPS (or standby) and lock down access

Create a new Ubuntu VPS in the same region as your users. If you run multiple sites, size it for peak traffic.

You don’t want to debug performance while the box is swapping.

Single WordPress site: 2 vCPU / 2–4 GB RAM is usually workable for recovery.
Multiple sites or WooCommerce: 4 vCPU / 8 GB RAM reduces “death by swapping” during restore.

If you don’t want to manage OS security and updates yourself during an incident, consider managed VPS hosting.

During DR, time is your most expensive resource.

Initial SSH hardening (do this before you install the stack)

From your workstation, connect as root (or the default admin user, depending on your image). Create a non-root sudo user.

Then switch to key-only SSH before you do anything else.

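# on the new server, as root (prefix with sudo if you logged in as a non-root admin user)
adduser deploy
usermod -aG sudo deploy

# on your local machine: install your public key for the new user
# (this needs password login to work, so do it before disabling it below)
ssh-copy-id deploy@NEW_SERVER_IP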

Edit SSH config:

sudo nano /etc/ssh/sshd_config

Recommended minimums:

PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes

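Restart SSH safely. Keep your current session open and confirm a new login works from a second terminal before you disconnect: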

sudo sshd -t && sudo systemctl restart ssh
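Editing sshd_config does not guarantee the values are the ones in effect: on Ubuntu, drop-in files under /etc/ssh/sshd_config.d/ are included first and may override them (some cloud images ship one that re-enables password logins). A minimal sketch for checking the effective settings follows; it relies on sshd -T, which prints the resolved configuration, and the helper name check_sshd_hardening is hypothetical.

```shell
# check_sshd_hardening: read `sshd -T` output on stdin and fail if the
# effective settings differ from the recommended minimums above.
check_sshd_hardening() {
  awk '
    BEGIN {
      want["permitrootlogin"] = "no"
      want["passwordauthentication"] = "no"
      want["pubkeyauthentication"] = "yes"
    }
    {
      key = tolower($1)
      if (key in want) {
        seen[key] = 1
        if (tolower($2) != want[key]) { print "MISMATCH: " $0; bad = 1 }
      }
    }
    END {
      for (k in want) if (!(k in seen)) { print "MISSING: " k; bad = 1 }
      exit bad + 0
    }
  '
}

# On the server:
#   sudo sshd -T | check_sshd_hardening && echo "sshd settings OK"
```

A MISMATCH line usually means a drop-in file needs the same edit you made to the main config.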

If you want a tighter SSH posture (allowlists, 2FA, audit trails), follow the more detailed SSH lockdown tutorial. Then return here.

Step 2: Apply baseline firewall rules (UFW) and stop obvious brute-force noise

During DR you’ll install packages, restart services, and open ports. Keep rules simple at first.

Allow SSH, HTTP, and HTTPS. Keep everything else closed until the site is stable.

sudo apt update
sudo apt install -y ufw

sudo ufw default deny incoming
sudo ufw default allow outgoing

sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

sudo ufw enable
sudo ufw status verbose

If your old server used special ports (custom SSH port, mail services, admin panels), don’t open them yet.

Recover the site first. Then expand access deliberately.

If you want to lock this down further, use the dedicated UFW firewall setup tutorial.

Optional but smart: install Fail2Ban early. It cuts log noise and slows credential stuffing while you’re busy restoring.

sudo apt install -y fail2ban
sudo systemctl enable --now fail2ban

For a WordPress-friendly jail set, use HostMyCode’s Fail2Ban setup tutorial.

Step 3: Rebuild the web stack on Ubuntu (Nginx + PHP-FPM example)

This runbook uses Nginx because it’s common on VPS hosting. It’s also predictable under load and easy to validate.

If you run Apache or LiteSpeed, the recovery pattern is the same. Install the stack, restore your vhost config, validate locally, then cut over.

Install Nginx and PHP-FPM. Adjust the PHP version to match your app.

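In 2026, PHP 8.3/8.4 are common depending on distro repos and compatibility needs. The unversioned php-fpm package installs the distro default: PHP 8.1 on Ubuntu 22.04 and PHP 8.3 on Ubuntu 24.04. Whatever version you end up with has to match the socket path in your Nginx config.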

sudo apt update
sudo apt install -y nginx
sudo apt install -y php-fpm php-cli php-mysql php-curl php-xml php-mbstring php-zip php-gd

Check services:

systemctl status nginx --no-pager
systemctl status 'php*-fpm.service' --no-pager

Create a clean site directory layout

Keep the layout boring and consistent. You’ll move faster.

Future restores also won’t become archaeology.

sudo mkdir -p /var/www/example.com/public
sudo chown -R deploy:deploy /var/www/example.com

Restore or recreate your Nginx server block

Create /etc/nginx/sites-available/example.com:

sudo nano /etc/nginx/sites-available/example.com

server {
    listen 80;
    server_name example.com www.example.com;

    root /var/www/example.com/public;
    index index.php index.html;

    access_log /var/log/nginx/example.com.access.log;
    error_log  /var/log/nginx/example.com.error.log;

    location / {
        try_files $uri $uri/ /index.php?$args;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        # socket name must match the installed PHP version (see: ls /run/php/)
        fastcgi_pass unix:/run/php/php8.3-fpm.sock;
    }

    location ~* \.(jpg|jpeg|png|gif|ico|css|js|svg|woff2?)$ {
        expires 7d;
        add_header Cache-Control "public";
        try_files $uri =404;
    }
}

Enable the site and test syntax:

sudo ln -s /etc/nginx/sites-available/example.com /etc/nginx/sites-enabled/example.com
sudo nginx -t
sudo systemctl reload nginx
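nginx -t validates syntax but does not check that the PHP-FPM socket named in fastcgi_pass exists, which is a common source of 502s after a rebuild onto a different PHP version. The sketch below extracts the socket path from a server block and tests it; both function names are hypothetical, and it assumes socket paths without spaces.

```shell
# fastcgi_socket_from_conf: print each unix socket path used by a
# fastcgi_pass directive in the given Nginx config file.
fastcgi_socket_from_conf() {
  sed -n 's/^[[:space:]]*fastcgi_pass[[:space:]][[:space:]]*unix:\([^;]*\);.*$/\1/p' "$1"
}

# check_fastcgi_sockets: report any configured socket that does not exist.
check_fastcgi_sockets() {
  status=0
  for sock in $(fastcgi_socket_from_conf "$1"); do
    if [ -S "$sock" ]; then
      echo "ok: $sock"
    else
      echo "missing: $sock"
      status=1
    fi
  done
  return "$status"
}

# On the server:
#   check_fastcgi_sockets /etc/nginx/sites-available/example.com || ls -l /run/php/
```

If a socket is reported missing, the listing of /run/php/ shows the name to use in fastcgi_pass.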

If you want performance tuning (buffers, gzip, caching hints), use the separate Nginx performance optimization tutorial after the site is back.

DR time is not tuning time.

Step 4: Restore your website files (choose the fastest safe method)

Pick the method that matches what you have on hand. The “best” option is the one you can finish quickly, without taking risks.

Option A: Restore from a tarball you already have

If you have an offsite archive (S3, storage box, another VPS), pull it down and extract.

# example: restore archive already copied to /home/deploy
cd /var/www/example.com
sudo tar -xzf /home/deploy/example.com-files.tar.gz -C /var/www/example.com/public
sudo chown -R deploy:deploy /var/www/example.com/public
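Two things tend to go wrong with archive restores: the tarball is truncated, or it contains a leading directory (such as public/) that ends up nested inside the docroot. A small sketch for checking both before extracting is shown here; the function names are illustrative, and it assumes a gzip-compressed tar as in the example above.

```shell
# archive_ok: succeed only if the gzip stream is intact and tar can list it.
archive_ok() {
  gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# archive_top_levels: print the distinct top-level entries in the archive,
# so you can tell whether it will extract into a nested directory.
archive_top_levels() {
  tar -tzf "$1" | sed 's|^\./||' | cut -d/ -f1 | grep -v '^$' | sort -u
}

# Example:
#   archive_ok /home/deploy/example.com-files.tar.gz && \
#     archive_top_levels /home/deploy/example.com-files.tar.gz
```

If the only top-level entry is a single directory, extract one level up or add --strip-components=1 to the tar command so the files land directly in the docroot.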

Option B: Pull from your old server (if it’s degraded but reachable)

If SSH still works on the old box, use rsync. It’s fast, resumable, and easy to rerun.

rsync -az --delete deploy@OLD_SERVER_IP:/var/www/example.com/public/ /var/www/example.com/public/

Pitfall: don’t blindly copy cache directories or giant backup folders unless you actually need them.

For WordPress, excluding wp-content/cache usually saves time.

rsync -az --delete --exclude 'wp-content/cache/' deploy@OLD_SERVER_IP:/var/www/example.com/public/ /var/www/example.com/public/

Option C: Restore from Git (for apps that support it)

If your site code lives in Git, restore the application first. Then bring back uploads from backups.

cd /var/www/example.com/public
git clone git@your-git-host:org/repo.git .

Step 5: Restore the database (minimal, safe restore flow)

This section stays intentionally simple. Your first job is a working site.

Deep database tuning can wait until things are stable.

Install a database server only if your architecture requires it on this VPS. If you use a managed database or a separate DB node, point your app at it and skip ahead.

sudo apt install -y mysql-server
sudo systemctl enable --now mysql

Create the database and user (example values):

sudo mysql

CREATE DATABASE exampledb DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'exampleuser'@'localhost' IDENTIFIED BY 'CHANGE_THIS_STRONG_PASSWORD';
GRANT ALL PRIVILEGES ON exampledb.* TO 'exampleuser'@'localhost';
FLUSH PRIVILEGES;
EXIT;

Restore from a dump:

mysql -u exampleuser -p exampledb < /home/deploy/exampledb.sql
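If the import stops partway or tables turn out to be missing, a truncated dump is a likely cause. By default mysqldump ends its output with a "-- Dump completed" comment, so a quick check is possible; note that dumps made with --skip-comments or --compact will not have this marker. The helper names below are hypothetical.

```shell
# dump_looks_complete: succeed if the last lines contain mysqldump's completion marker.
dump_looks_complete() {
  tail -n 5 "$1" | grep -q '^-- Dump completed'
}

# dump_table_count: count CREATE TABLE statements in the dump file.
dump_table_count() {
  grep -c '^CREATE TABLE' "$1"
}

# Example:
#   dump_looks_complete /home/deploy/exampledb.sql || echo "dump may be truncated"
#   dump_table_count /home/deploy/exampledb.sql
```

The table count from the dump should match the number of tables you see in the restored database.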

Quick validation: list the tables and confirm the data you care about is present (latest posts/orders, key users, etc.).

mysql -u exampleuser -p -e "USE exampledb; SHOW TABLES;" | head

Step 6: Fix configuration and secrets (the part people forget)

Most “restore failed” stories aren’t backup problems. They’re config mismatches.

Common culprits include a new server IP, a new DB password, missing salts, wrong permissions, or a stale environment file.

WordPress checklist

Verify wp-config.php DB name/user/password/host.
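Confirm file ownership lets the web server read the files and your deploy user manage them. PHP-FPM (www-data by default on Ubuntu) also needs write access to wp-content/uploads, or media uploads will fail.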
If your domain changed temporarily, correct siteurl and home after cutover.

Permissions that usually work for a single-site VPS:

sudo find /var/www/example.com/public -type d -exec chmod 755 {} \;
sudo find /var/www/example.com/public -type f -exec chmod 644 {} \;

Cron jobs and systemd timers

Bring back scheduled tasks next. These are easy to miss.

They’re also a common reason “everything looks fine” while backups, queues, or renewals quietly stop.

Check your user crontab:

crontab -l

If you stored cron definitions in your repo or notes, re-apply them now. For systemd timers, look in:

/etc/systemd/system/*.timer
/etc/systemd/system/*.service

Reload and enable as needed:

sudo systemctl daemon-reload
sudo systemctl enable --now your-job.timer

Step 7: Issue SSL certificates (Let’s Encrypt) and verify HTTPS behavior

Get HTTPS working before you swing DNS to the new server. Otherwise, you’ll field “your site is not secure” reports while chasing redirect issues.

Install Certbot:

sudo apt install -y certbot python3-certbot-nginx

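Request certificates and let Certbot update your Nginx config. Be aware that the --nginx plugin validates over HTTP (the HTTP-01 challenge), so the command below only succeeds when example.com and www.example.com already resolve to this server. To get a certificate in place before cutover, restore /etc/letsencrypt from your backup or use a DNS-01 challenge instead (for example, certbot certonly --manual --preferred-challenges dns).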

sudo certbot --nginx -d example.com -d www.example.com

Test renewal:

sudo certbot renew --dry-run

If you want the full SSL walkthrough (including common Nginx pitfalls), use HostMyCode’s SSL certificate setup guide.

HSTS warning

If your old site used HSTS (Strict-Transport-Security), browsers will force HTTPS even if your new box isn’t ready.

That’s why SSL comes before DNS cutover in this runbook.

Step 8: DNS cutover without surprises (A record, AAAA, and TTL)

Now point the domain at the new server IP. If you lowered TTL earlier, most users will reach the new origin within minutes.

If you didn’t lower TTL, expect a longer tail while resolvers flush old answers.

Update A record to the new IPv4 address.
If you use IPv6, update AAAA too (or remove it temporarily to avoid split traffic).
Confirm no stale records still target the old server.

From your workstation, verify DNS resolution:

dig +short example.com A
dig +short example.com AAAA

Then verify HTTP routing to the new origin:

curl -I http://example.com
curl -I https://example.com
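The two checks above can be made stricter. The sketch below flags any DNS answer that is not the new address (useful for spotting stale records) and requests the site from a specific IP regardless of what DNS says, using curl's --resolve option. The function names are made up, NEW_SERVER_IP is a placeholder for an IPv4 address, and a name that resolves through a CNAME will be flagged because dig +short also prints the CNAME target.

```shell
# all_answers_match: read `dig +short` output on stdin; succeed only if there is
# at least one answer and every answer equals the expected address.
all_answers_match() {
  awk -v want="$1" '
    NF == 0 { next }
    { n++; if ($1 != want) { print "stale or unexpected: " $1; bad = 1 } }
    END {
      if (n == 0) { print "no answers"; bad = 1 }
      exit bad + 0
    }
  '
}

# origin_status: fetch https://HOST/ from a specific IPv4 address, bypassing DNS,
# and print only the HTTP status code.
origin_status() {
  curl -s -o /dev/null -w '%{http_code}\n' --resolve "$1:443:$2" "https://$1/"
}

# Example:
#   dig +short example.com A | all_answers_match NEW_SERVER_IP
#   origin_status example.com NEW_SERVER_IP
```

Because origin_status does not depend on DNS, it also works before cutover, as long as a certificate is already installed on the new server.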

If you manage your domain and DNS in one place, cutovers are easier under pressure. Keep your domains organized with HostMyCode Domains so you’re not hunting across three dashboards during an outage.

Step 9: Post-restore validation (a short checklist that catches 90% of issues)

Do these checks before you call the incident “done.” They catch most problems that show up later.

Web and TLS checks

Status codes: homepage returns 200, login page returns 200/302 as expected.
TLS: certificate matches the domain, no chain errors.
Redirects: HTTP → HTTPS works once, not in a loop.
Static assets: CSS/JS loads with correct MIME types and no 404s.

Useful commands:

curl -I https://example.com
curl -I https://example.com/wp-login.php
sudo tail -n 100 /var/log/nginx/example.com.error.log
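For the TLS item, it can help to check how long the served certificate remains valid, particularly if you restored it from a backup rather than issuing a fresh one. This is a rough sketch using openssl's -checkend option; the function names and the 14-day threshold in the example are arbitrary choices, not part of the original runbook.

```shell
# cert_valid_for_days: succeed if the PEM certificate in file $1 will still be
# valid $2 days from now.
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# served_cert: print (as PEM) the certificate that server IP $2 presents
# for hostname $1, using SNI.
served_cert() {
  openssl s_client -connect "$2:443" -servername "$1" </dev/null 2>/dev/null | openssl x509
}

# Example:
#   served_cert example.com NEW_SERVER_IP > /tmp/served.pem
#   cert_valid_for_days /tmp/served.pem 14 || echo "certificate expires within 14 days"
```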

Application sanity checks

Log into admin and save a setting (confirms DB write access).
Upload an image (confirms filesystem permissions).
Trigger a password reset email (confirms outbound email path).

Resource checks (catch runaway PHP or swap thrash)

free -h
uptime
top -o %CPU
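If you would rather have a single number to watch than read free -h by eye, swap usage can be computed from /proc/meminfo. This is a minimal sketch; the function name is hypothetical and the optional file argument exists only so the logic can be tried against a saved copy.

```shell
# swap_used_percent: print swap usage as a whole-number percentage, read from a
# /proc/meminfo-format file (defaults to /proc/meminfo). Prints 0 when no swap
# is configured.
swap_used_percent() {
  awk '
    $1 == "SwapTotal:" { total = $2 }
    $1 == "SwapFree:"  { free = $2 }
    END {
      if (total > 0) printf "%d\n", (total - free) * 100 / total
      else print 0
    }
  ' "${1:-/proc/meminfo}"
}

# Example:
#   swap_used_percent
```

A value that keeps climbing during the restore suggests the instance is undersized for the workload.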

If you want alerts and a baseline dashboard after recovery, set up monitoring using the VPS monitoring tutorial.

Step 10: Email continuity (what to do if your mail used the old server)

Plenty of “website outages” turn into email outages because the same VPS handled SMTP/IMAP.

If you self-host mail on the same node, be careful with DNS and reputation-sensitive settings.

MX records: confirm they point where you expect.
SPF/DKIM/DMARC: re-check after any IP change.
Reverse DNS: if you send mail from the VPS IP, rDNS matters for deliverability.

If your transactional email starts landing in spam after the move, follow HostMyCode’s email deliverability troubleshooting tutorial.

For SPF/DKIM/DMARC, use the dedicated SPF DKIM DMARC setup guide.

Step 11: Turn your recovery into a repeatable runbook (IaC-lite approach)

You don’t need a full automation stack to be consistent. You need known-good files and a checklist you trust.

Create a private repo (or an encrypted folder) with:

nginx/ — your sites-available configs, plus notes for PHP socket paths.
systemd/ — custom services and timers.
firewall/ — UFW rules documentation, and any IP allowlists.
restore/ — exact rsync commands, archive paths, and restore order.
dns/ — current zone records (exported), TTL choices, and providers.

Then write a single file called DR-CHECKLIST.md with the steps you actually followed.

Next time, you won’t rely on memory or half-remembered terminal history.
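One way to seed that repo is a small script that copies the known-good files into the layout listed above. The sketch below is an assumption about how you might do it, not a required tool: the function name collect_dr_configs and the ~/dr-repo path are hypothetical, and it only gathers Nginx server blocks and custom systemd units, leaving firewall, restore, and DNS notes for you to fill in.

```shell
# collect_dr_configs ROOT DEST
# Copy config files from under ROOT (use / on a live server) into the
# nginx/ systemd/ firewall/ restore/ dns/ layout inside DEST.
collect_dr_configs() {
  root=${1:-/}
  dest=${2:?usage: collect_dr_configs ROOT DEST}
  mkdir -p "$dest/nginx" "$dest/systemd" "$dest/firewall" "$dest/restore" "$dest/dns"

  # Nginx server blocks
  if [ -d "$root/etc/nginx/sites-available" ]; then
    cp -a "$root/etc/nginx/sites-available/." "$dest/nginx/"
  fi

  # Custom services and timers (regular files only; skip symlinks and masks)
  for unit in "$root"/etc/systemd/system/*.service "$root"/etc/systemd/system/*.timer; do
    if [ -f "$unit" ] && [ ! -L "$unit" ]; then
      cp -a "$unit" "$dest/systemd/"
    fi
  done
  return 0
}

# On the server (then commit the result):
#   collect_dr_configs / ~/dr-repo
#   sudo ufw status numbered > ~/dr-repo/firewall/ufw-status.txt
#   crontab -l > ~/dr-repo/restore/crontab.txt
```

Keep secrets out of this repo, or encrypt it, since vhost and unit files sometimes embed credentials.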

Troubleshooting quick wins (symptom → fastest likely fix)

502 Bad Gateway: PHP-FPM not running or wrong socket path. Check /run/php/*.sock and your fastcgi_pass.
White screen in WordPress: plugin/theme fatal error. Enable WP debug briefly and check the Nginx error log.
Too many redirects: mixed HTTP/HTTPS settings. Check WordPress siteurl/home and any forced redirect rules.
Login loops: cookie/domain mismatch after temporary domain changes; clear cache and confirm correct URL settings.
Images won’t upload: filesystem ownership/permissions on wp-content/uploads.

Common log locations:

Nginx: /var/log/nginx/
PHP-FPM: /var/log/php8.x-fpm.log (varies by distro)
System: journalctl -u nginx, journalctl -u php8.x-fpm

sudo journalctl -u nginx -n 100 --no-pager
sudo journalctl -u php8.3-fpm -n 100 --no-pager

Summary: your minimum viable DR flow

If you only keep one sequence in your head, use this: provision a clean VPS → lock down SSH and the firewall → install the web stack → restore files and DB → issue SSL → cut DNS → validate the app and email.

That order prevents common detours like HSTS pain, redirect loops, and “it works on IP but not on domain” confusion.

If your business can’t tolerate long outages, choose infrastructure that supports quick rebuilds and predictable networking.

For recovery-focused hosting, start with HostMyCode VPS, and move up to HostMyCode Dedicated Servers when your workload needs isolated hardware and higher headroom.

Want a recovery plan you can actually run under pressure? Pair this runbook with infrastructure you trust. Managed VPS hosting from HostMyCode cuts the time you spend on OS maintenance and routine service upkeep, while still giving you root-level control when you need it. For fast rebuilds and predictable performance, launch on a HostMyCode VPS and keep your DR checklist somewhere you can reach it quickly.

FAQ

How fast can I realistically recover a small WordPress site on a new VPS?

If you have a recent file archive and DB dump, 30–90 minutes is realistic: provision, restore, SSL, DNS cutover, validation. If you must extract data from a failing server, plan for longer.

Should I restore from a full snapshot or rebuild from scratch?

Snapshots are fast, but they can also bring the problem back with them (bad config, compromised services). Rebuild-and-restore takes longer but gives you a cleaner baseline. For security incidents, prefer rebuild-and-restore.

What DNS TTL should I use for DR?

For most sites, 300 seconds is a good operational TTL. If you’re stable and rarely change IPs, you can raise it later. Keep the low TTL during periods of migration or infrastructure changes.

Do I need to rotate secrets after a disaster?

If the failure involved compromise or unknown access, rotate: SSH keys (or at least review authorized_keys), database passwords, WordPress salts, and any SMTP/API credentials.

What’s the biggest DR mistake on VPS hosting?

Skipping restore verification. A backup that hasn’t been restored and tested is just a file you hope works. After recovery, schedule a quarterly restore drill on a temporary VPS.