
Your “backup strategy” only becomes real the first time you restore under pressure. For a typical VPS in 2026, the most dependable setup is layered: provider snapshots for fast rollbacks, plus file-level backups for precise restores. This post walks you through VPS snapshot backup automation with a practical runbook, built-in verification, and a rollback plan that doesn’t require turning your server into a research project.
Assume a common setup: a small API and a Postgres database on a single Linux VPS. You want weekly crash-consistent disk images (snapshots) and daily encrypted file backups (excluding junk and churn) with restore tests. You also want receipts: logs, retention rules, and a repeatable way to prove the backups work.
Why snapshots and file backups solve different problems
Snapshots are a time machine for the whole disk. They’re quick to create and even quicker to roll back, which makes them perfect for “oops” moments: a bad upgrade, a broken deploy, or an accidental deletion that touched too much state.
File backups restore more slowly, but they let you be selective. They’re what you reach for when you need one config file, a single directory, or last night’s database dump without rewinding the entire VPS.
Rely on snapshots alone and you’ll eventually want file-level history. Rely on file backups alone and you’ll eventually wish you could revert the whole machine in minutes.
Prerequisites and assumptions
- A Linux VPS (Ubuntu 24.04/24.10, Debian 12/13, Rocky/AlmaLinux 9/10 all work). Examples below use Ubuntu 24.04 paths and commands.
- Root or sudo access.
- A place to store file backups (S3-compatible object storage or a backup VPS). You’ll use restic for encryption, retention, and verification.
- Your VPS provider supports snapshots (most do). The scheduling mechanism varies by provider, so the “automation” is partly provider-side and partly your own runbook.
If you’d rather not own the OS plumbing, managed VPS hosting from HostMyCode can handle patching, baseline hardening, and backup checks while you keep control of your application.
VPS snapshot backup automation: the practical blueprint
For a single-node VPS, this policy stays simple and works well:
- Weekly snapshots (e.g., Sunday 03:15) retained for 4–6 weeks.
- Daily file backups (e.g., 02:20) retained with: 14 daily, 8 weekly, 12 monthly.
- Verification weekly: restore a small selection to a staging directory and run a database integrity check or smoke test.
- Pre-snapshot hook: quick app quiesce (optional) and database checkpoint to reduce recovery time.
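The pre-snapshot hook from the policy above can be very small. A minimal sketch, assuming a Postgres database and a hypothetical background-worker unit named acme-workers (substitute your own service names):

```shell
#!/usr/bin/env bash
# pre_snapshot.sh -- run shortly before the provider snapshot fires.
# "acme-workers" is a placeholder unit name; adjust to your stack.
set -euo pipefail

# Optionally pause background workers so the snapshot captures a quieter state.
systemctl stop acme-workers || true

# Force a Postgres checkpoint so crash recovery after a rollback is faster.
sudo -u postgres psql -c "CHECKPOINT;"

# Workers can come back as soon as the snapshot has been requested.
systemctl start acme-workers || true
```

Keep the hook optional: snapshots are crash-consistent without it, the checkpoint just shortens recovery time after a rollback.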
Before you automate anything, make sure the VPS won’t quietly run out of disk space and sabotage the whole plan. If log growth is a risk on your box, the guide on VPS log rotation best practices in 2026 is a good companion piece.
Set up daily encrypted file backups with restic (with verification)
Restic is still a solid “boring” option in 2026: encryption by default, deduplication, fast listings, and retention rules that behave the way you expect. The goal here is to capture configs, application data, and (optionally) database dumps—without dragging caches and build artifacts along for the ride.
1) Install restic and supporting tools
sudo apt update
sudo apt install -y restic jq curl
Expected output: package install logs, ending with restic available on your PATH.
restic version
restic 0.16.x
2) Create credentials and a repository
This example uses S3-compatible storage. Create a restricted key that can access only a single bucket/prefix.
Create /etc/restic/backup.env (root-readable):
sudo install -m 0600 -o root -g root /dev/null /etc/restic/backup.env
sudo nano /etc/restic/backup.env
export RESTIC_REPOSITORY="s3:https://s3.example.net/hmc-backups/vps-api-01"
export RESTIC_PASSWORD="CHANGE_ME_TO_A_LONG_RANDOM_SECRET"
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_DEFAULT_REGION="us-east-1"
Initialize the repository:
sudo bash -c 'source /etc/restic/backup.env && restic init'
created restic repository ...
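Before scheduling anything, confirm the repository is reachable and the password in backup.env actually unlocks it:

```shell
# Should print the repository ID; a wrong password fails here, not at 2 a.m.
sudo bash -c 'source /etc/restic/backup.env && restic cat config'

# On a fresh repository this lists zero snapshots -- that's expected.
sudo bash -c 'source /etc/restic/backup.env && restic snapshots'
```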
3) Decide what you back up (and what you explicitly exclude)
Write down what you intend to protect, then explicitly exclude the paths that explode in size or change constantly. This keeps costs predictable and restores faster.
/etc/restic/includes.txt:
/etc
/opt/acme-api
/var/lib/postgresql
/var/lib/redis
/var/www
/etc/restic/excludes.txt:
/var/lib/postgresql/pg_wal
/opt/acme-api/tmp
/opt/acme-api/node_modules
/var/www/*/cache
/var/log/journal
Note: Excluding pg_wal is usually correct for file-level backups when you also take snapshots and/or you back up logical dumps. If you rely on physical Postgres backups for PITR, this changes. Treat this as a deliberate choice, not a default.
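Recent restic versions can preview the file set without writing anything, which is a cheap way to sanity-check the include/exclude lists before the first real run:

```shell
# Preview which files would be backed up, without touching the repository.
sudo bash -c 'source /etc/restic/backup.env && restic backup \
  --dry-run --verbose \
  --files-from /etc/restic/includes.txt \
  --exclude-file=/etc/restic/excludes.txt \
  --one-file-system | tail -n 20'
```

If node_modules or a cache directory shows up in the preview, fix the excludes now rather than paying for the storage later.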
4) Add a database-aware step (simple and reliable)
For many VPS deployments, a daily logical dump is the right trade-off. It’s portable, straightforward to test, and it avoids filesystem-consistency edge cases.
Create /usr/local/sbin/pg_dump_daily.sh:
sudo install -m 0750 -o root -g root /dev/null /usr/local/sbin/pg_dump_daily.sh
sudo nano /usr/local/sbin/pg_dump_daily.sh
#!/usr/bin/env bash
set -euo pipefail
DUMP_DIR="/var/backups/postgres"
TS="$(date -u +%Y%m%dT%H%M%SZ)"
FILE="$DUMP_DIR/acme_${TS}.sql.gz"
install -d -m 0750 -o postgres -g postgres "$DUMP_DIR"
# Adjust DB name/user as needed
sudo -u postgres pg_dump --format=plain --no-owner --no-privileges acme \
| gzip -9 > "$FILE"
# Keep local dumps for 7 days; restic will store long-term
find "$DUMP_DIR" -type f -name 'acme_*.sql.gz' -mtime +7 -delete
Run it once:
sudo /usr/local/sbin/pg_dump_daily.sh
ls -lh /var/backups/postgres | tail -n 2
-rw-r--r-- 1 root root 18M ... acme_20260413T022000Z.sql.gz
Add /var/backups/postgres to your includes if it isn’t already:
echo "/var/backups/postgres" | sudo tee -a /etc/restic/includes.txt
5) Create the restic backup script (with retention + check)
Create /usr/local/sbin/restic_backup.sh:
sudo install -m 0750 -o root -g root /dev/null /usr/local/sbin/restic_backup.sh
sudo nano /usr/local/sbin/restic_backup.sh
#!/usr/bin/env bash
set -euo pipefail
source /etc/restic/backup.env
HOST_TAG="vps-api-01"
LOG="/var/log/restic-backup.log"
exec >>"$LOG" 2>&1
echo "[$(date -Is)] starting backup"
# Optional: create DB dump first
/usr/local/sbin/pg_dump_daily.sh
restic backup \
--host "$HOST_TAG" \
--files-from /etc/restic/includes.txt \
--exclude-file=/etc/restic/excludes.txt \
--one-file-system
echo "[$(date -Is)] applying retention"
restic forget \
--host "$HOST_TAG" \
--keep-daily 14 \
--keep-weekly 8 \
--keep-monthly 12 \
--prune
echo "[$(date -Is)] running repository check (partial)"
# Quick check nightly; full check weekly via separate timer
restic check --read-data-subset=2.5%
echo "[$(date -Is)] done"
Run it manually once (first run takes longer):
sudo /usr/local/sbin/restic_backup.sh
tail -n 20 /var/log/restic-backup.log
Expected output should include a summary like:
Files: 18234 new, 0 changed, 18190 unmodified
Dirs: 2103 new, 0 changed, 2100 unmodified
Added to the repository: 1.234 GiB
processed 2.876 GiB in 2m41s
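After the first successful run, spot-check that the snapshot actually contains what you expect:

```shell
# Show the most recent snapshot's metadata (host, time, paths).
sudo bash -c 'source /etc/restic/backup.env && restic snapshots --latest 1'

# List a few entries under /etc in the latest snapshot as a quick sanity check.
sudo bash -c 'source /etc/restic/backup.env && restic ls latest /etc | head -n 10'
```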
6) Schedule it with systemd timers (not cron)
Systemd timers buy you three things cron doesn’t: better logs, better control over dependencies, and catch-up for missed runs after reboots.
Create /etc/systemd/system/restic-backup.service:
sudo nano /etc/systemd/system/restic-backup.service
[Unit]
Description=Nightly restic backup
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/local/sbin/restic_backup.sh
Nice=10
IOSchedulingClass=best-effort
IOSchedulingPriority=7
Create /etc/systemd/system/restic-backup.timer:
sudo nano /etc/systemd/system/restic-backup.timer
[Unit]
Description=Run restic backup nightly
[Timer]
OnCalendar=*-*-* 02:20:00
Persistent=true
RandomizedDelaySec=10m
[Install]
WantedBy=timers.target
Enable:
sudo systemctl daemon-reload
sudo systemctl enable --now restic-backup.timer
systemctl list-timers | grep restic
Expected output: a next-run time around 02:20 with a small randomized delay.
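When a run looks suspicious, systemd's logging is where timers pay off over cron. Two commands cover most questions:

```shell
# Last run's output (restic's own log is still in /var/log/restic-backup.log).
journalctl -u restic-backup.service -n 50 --no-pager

# Timer state: last trigger, next trigger, and whether it's enabled.
systemctl status restic-backup.timer --no-pager
```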
7) Add a weekly deep verification timer
The nightly subset check catches many problems early. A periodic full read is what flushes out the nastier surprises.
/etc/systemd/system/restic-check-weekly.service:
sudo nano /etc/systemd/system/restic-check-weekly.service
[Unit]
Description=Weekly restic full check
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
# backup.env uses `export ...` lines, which systemd's EnvironmentFile= cannot
# parse, so source it in a shell instead.
ExecStart=/bin/bash -c 'source /etc/restic/backup.env && exec /usr/bin/restic check --read-data'
/etc/systemd/system/restic-check-weekly.timer:
sudo nano /etc/systemd/system/restic-check-weekly.timer
[Unit]
Description=Run restic full check weekly
[Timer]
OnCalendar=Sun *-*-* 04:40:00
Persistent=true
[Install]
WantedBy=timers.target
sudo systemctl daemon-reload
sudo systemctl enable --now restic-check-weekly.timer
Automating VPS snapshots: what you can and can’t standardize
Snapshot APIs vary by provider, so a universal script usually turns into a half-truth. The process, though, is consistent:
- Pick a consistent schedule (weekly is a good baseline).
- Ensure the snapshot name includes server ID + timestamp (you’ll thank yourself later).
- Cap retention (4–6 snapshots is typically enough).
- Document a restore procedure that includes IP/DNS implications.
If your provider offers scheduled snapshots in the control panel, start there. If it offers an API, keep the automation small and boring: one call, one log, one place to review failures (a management host, an internal tool, or a GitHub Actions runner). If you want fewer moving parts, running workloads on a reliable VPS platform like HostMyCode VPS helps keep the routine repeatable.
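As a shape for that "small and boring" automation, here is a sketch of a single snapshot call. The endpoint, auth header, and payload are hypothetical placeholders; substitute your provider's documented API:

```shell
#!/usr/bin/env bash
# Sketch only: "api.provider.example" and the request shape are HYPOTHETICAL.
# Replace them with your provider's real snapshot API before using.
set -euo pipefail

SERVER_ID="vps-api-01"
TS="$(date -u +%Y%m%dT%H%M%SZ)"
LOG="/var/log/snapshot-automation.log"

# One call, one log line; name includes server ID + timestamp per the policy.
if curl -fsS -X POST "https://api.provider.example/v1/servers/${SERVER_ID}/snapshots" \
    -H "Authorization: Bearer ${PROVIDER_API_TOKEN:?set PROVIDER_API_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "{\"name\": \"${SERVER_ID}-${TS}\"}"; then
  echo "[$(date -Is)] snapshot ${SERVER_ID}-${TS} requested" >>"$LOG"
else
  echo "[$(date -Is)] snapshot request FAILED" >>"$LOG"
  exit 1
fi
```

Retention trimming (deleting snapshots beyond the 4–6 you keep) follows the same pattern: one list call, one delete call, one log.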
Verification: prove you can restore what you just backed up
Untested backups are optimistic fiction. You don’t need a full disaster-recovery rehearsal every week, but you do need a small test that catches the usual failures: wrong paths, missing secrets, broken dumps, and permission mistakes.
A small restore test (safe on production)
Create a staging directory, list snapshots, and restore a few critical items:
sudo mkdir -p /var/restore-tests/restic
sudo chmod 0700 /var/restore-tests/restic
sudo bash -c 'source /etc/restic/backup.env && restic snapshots --host vps-api-01 | head -n 20'
Pick the latest snapshot ID (example 3a1c9b7d), then restore:
sudo bash -c 'source /etc/restic/backup.env && restic restore 3a1c9b7d --target /var/restore-tests/restic --include /etc/ssh/sshd_config --include /var/backups/postgres'
Verify the restored files are actually there:
sudo test -f /var/restore-tests/restic/etc/ssh/sshd_config && echo OK
sudo ls -lh /var/restore-tests/restic/var/backups/postgres | tail -n 2
Expected output: OK and at least one acme_*.sql.gz dump.
Database restore smoke test (on a throwaway database)
Decompress a dump and load it into a temporary database to confirm it’s valid SQL and not silently truncated.
latest=$(ls -1 /var/restore-tests/restic/var/backups/postgres/acme_*.sql.gz | tail -n 1)
echo "$latest"
sudo -u postgres createdb acme_restore_test
zcat "$latest" | sudo -u postgres psql -d acme_restore_test > /tmp/restore_test.log 2>&1
sudo -u postgres psql -d acme_restore_test -c "SELECT count(*) FROM information_schema.tables;"
sudo -u postgres dropdb acme_restore_test
Expected output: a non-zero table count, and no SQL errors in /tmp/restore_test.log.
If you want a stronger validation path, restore a snapshot onto a separate VPS and hit your app’s health endpoint. The systemd approach in systemd socket activation for FastAPI helps here because it makes restarts and service bring-up less fragile.
Common pitfalls (the ones that quietly break restores)
- Backing up the wrong paths. Keep an explicit includes file. If you can’t explain why a directory is in scope, don’t back it up.
- Permissions drift. If your app depends on ownership, modes, or ACLs, test restoring those paths and validate with stat.
- Database dumps without consistent settings. Use --no-owner --no-privileges for portability unless you have a specific reason not to.
- Snapshot restore changes IPs. Plan for DNS TTL and reverse proxy configuration. If you front apps with a proxy, the guide on Caddy reverse proxy on a VPS helps keep routing and TLS predictable after a restore.
- Log growth blocks backups. Full disks create partial backups and broken dumps. Backups need log retention and disk monitoring to stay reliable.
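The permissions check above takes one loop. Assuming a prior restore into /var/restore-tests/restic (as in the restore test earlier), compare ownership and modes side by side:

```shell
# Compare ownership/modes between live paths and the restic restore target.
# Paths here match this post's examples; substitute your own critical paths.
for p in /etc/ssh/sshd_config /opt/acme-api; do
  echo "live:     $(stat -c '%U:%G %a %n' "$p")"
  echo "restored: $(stat -c '%U:%G %a %n' "/var/restore-tests/restic$p")"
done
```

Any mismatch means your restore runbook needs an explicit chown/chmod step, and it is much cheaper to learn that now.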
Rollback plan: how you recover without improvising
Write this down somewhere your team will actually find it. “We’ll figure it out” turns into “we’re down” with surprising speed.
Rollback option A: restore from snapshot (fastest)
- Stop writes if possible (maintenance page, disable background workers).
- Restore the latest known-good snapshot from your provider panel/API.
- Boot, then verify system health: systemctl --failed for units, df -h for disk, and your app health endpoint.
- If the IP changed, update DNS or your load balancer target.
This gets you back online quickly. You may lose data between the snapshot time and the incident, so be honest about your RPO.
Rollback option B: restore specific files with restic (safest for small mistakes)
- Identify what changed (a config, an env file, a directory).
- Restore only what you need into a temporary location first.
- Diff and replace deliberately.
sudo bash -c 'source /etc/restic/backup.env && restic restore latest --target /var/tmp/restore --include /etc/nginx/nginx.conf'
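"Diff and replace deliberately" can look like this, using the nginx config from the restore command above as the example:

```shell
# Compare the restored copy against the live file before replacing anything.
diff -u /etc/nginx/nginx.conf /var/tmp/restore/etc/nginx/nginx.conf

# If the restored version is the one you want, keep a dated copy of the
# current file, swap it in, then validate before reloading.
sudo cp -a /etc/nginx/nginx.conf "/etc/nginx/nginx.conf.bak.$(date -u +%Y%m%d)"
sudo cp -a /var/tmp/restore/etc/nginx/nginx.conf /etc/nginx/nginx.conf
sudo nginx -t && sudo systemctl reload nginx
```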
Rollback option C: rebuild VPS, then restore app + data
If the OS is compromised or the system state can’t be trusted, rebuilding is often faster than trying to clean up perfectly.
- Provision a fresh VPS.
- Apply baseline hardening (SSH keys, firewall, updates). If you want a structured checklist, keep your hardening aligned with this production hardening guide.
- Install dependencies and restore files from restic.
- Restore database from the latest verified dump.
HostMyCode can help keep cutovers controlled with migration help if you’d rather not do an ad-hoc rebuild during an incident.
Operational checklists you’ll actually use
Weekly checklist (10 minutes)
- Confirm the last restic run succeeded: systemctl status restic-backup.service or review /var/log/restic-backup.log.
- Confirm the weekly restic check ran: systemctl status restic-check-weekly.service.
- Confirm a snapshot exists in the provider panel and its naming matches policy.
- Run a small restore test (one config file + latest DB dump).
Monthly checklist (30 minutes)
- Restore into a staging VPS and run a full smoke test (login flow, API health, background jobs).
- Review retention costs and adjust keep rules.
- Rotate restic password if your policy requires it (and document the rotation process).
Next steps (so this keeps working six months from now)
- Add alerting on backup failures (email, Slack, or your monitoring system). At minimum, alert on a missing timer run in 24 hours.
- Track backup duration and size. Sudden growth usually means a new directory slipped into scope.
- If you centralize logs, make sure backup scripts don’t leak secrets in output. Consider structured logging and scrubbing.
- Document restore steps in your repo alongside infrastructure code, not in a wiki nobody opens.
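The "missing timer run in 24 hours" alert from the list above fits in a few lines of shell. A minimal sketch using restic's JSON output (wire the ALERT line to mail, Slack, or your monitoring agent):

```shell
#!/usr/bin/env bash
# Exit non-zero if the newest restic snapshot is older than 24 hours.
set -euo pipefail
source /etc/restic/backup.env

# restic emits RFC 3339 timestamps; GNU date can parse them directly.
latest_time=$(restic snapshots --latest 1 --json | jq -r '.[0].time')
latest_epoch=$(date -d "$latest_time" +%s)
age=$(( $(date +%s) - latest_epoch ))

if (( age > 86400 )); then
  echo "ALERT: last backup is ${age}s old" >&2
  exit 1
fi
echo "OK: last backup ${age}s ago"
```

Run it from its own systemd timer so a dead backup timer can't also silence the alert.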
If you’re putting layered backups on a production VPS, start with predictable infrastructure and storage performance you can count on. HostMyCode VPS works well for snapshot-based rollbacks, and managed VPS hosting can cover routine backup checks and restore validation while you focus on the application.
FAQ
How often should I take VPS snapshots in 2026?
Weekly snapshots are a strong baseline for single-node workloads. If you deploy daily or take schema changes often, consider daily snapshots with a shorter retention window (7–14 days).
Are VPS snapshots enough without restic or file backups?
Snapshots are great for full-server rollback, but they’re blunt. File backups give you point-in-time recovery for specific files and portable database dumps. Using both avoids the most common “we restored too much” problems.
Will restic back up a live Postgres data directory safely?
Not reliably unless you’re doing Postgres-aware physical backups (and you know exactly what you’re doing). For many VPS setups, use logical dumps for daily backups and rely on snapshots for fast whole-disk recovery.
What’s the simplest way to verify backups without a full staging environment?
Restore a few critical files into a temporary directory and load your latest database dump into a throwaway database. This catches missing paths, broken dumps, and permission issues quickly.
How do I keep backup jobs from impacting performance?
Run them during low-traffic hours, use systemd Nice and I/O priorities, and exclude churn-heavy paths (caches, build artifacts, WAL directories). If your workload is CPU-tight, consider a larger plan or isolate the database.
Summary
VPS snapshot backup automation works best as a layered system: snapshots for fast “rewind the server” recovery, and restic for encrypted, deduplicated, verifiable file backups. The real advantage is consistency—retention rules you follow, weekly verification you don’t skip, and a rollback plan you can execute at 3 a.m.
If you want a VPS setup where snapshots and restore testing stay predictable over time, run it on HostMyCode VPS and keep your backup routines in systemd timers with logs you can audit.