
You can’t fix what you can’t see. Linux VPS monitoring with Prometheus and Grafana is still one of the clearest ways to answer the questions that matter during an incident: “Are we CPU-starved, memory-bound, running out of disk, or dropping packets?” This walkthrough aims for a setup that stays quiet on good days and only gets loud when something needs your attention.
Assume a simple, real-world box: one Ubuntu 24.04/25.04-class VPS (systemd, no Kubernetes) running a small API and a worker. You want host metrics, a dashboard you’ll actually trust, and alerts that don’t ping you at 3 a.m. for non-events. We’ll also leave room for a couple of app probes later, without turning the server into a science project.
If this is production, size for headroom. A 2 vCPU / 4 GB instance is a sensible floor once you add Prometheus, Grafana, and your app together. HostMyCode’s VPS plans map well to this “single box, real monitoring” setup.
What you’ll build (and what you won’t)
- Prometheus scraping host metrics via node_exporter
- Grafana dashboards for CPU, memory, disk, and network
- Alertmanager with a small set of actionable alerts (email example)
- Security: firewall rules, local-only bind where possible, and basic auth on Grafana
What we’re skipping on purpose: multi-node federation, long-term storage (Thanos/Mimir), and full tracing. You can add those later without ripping this apart. If you want a vendor-neutral “logs + metrics + traces” baseline instead, see VPS monitoring with OpenTelemetry Collector on Linux (2026).
Prerequisites
- An Ubuntu server (Ubuntu 24.04 LTS or newer recommended) with sudo
- Open ports: 22/tcp for SSH; optionally 443/tcp if you’ll put Grafana behind a reverse proxy
- A domain name if you want HTTPS and a friendly URL (optional). You can manage DNS via HostMyCode domains.
- Basic familiarity with systemd and editing files in /etc
Architecture choices that reduce noise
Most “Prometheus + Grafana on a VPS” guides go off the rails in predictable ways: they ship hundreds of alerts, or they expose port 3000 to the internet with weak credentials. In 2026, that’s not an “oops.” It’s a breach waiting to happen.
- Bind Prometheus and Alertmanager to localhost and only expose Grafana (or expose nothing and tunnel over SSH).
- Start with 6–10 alerts max, tuned to your instance size and disk layout.
- Use systemd services instead of ad-hoc background processes, so restarts and logs behave consistently.
Install Prometheus, node_exporter, Grafana, and Alertmanager
Ubuntu’s repos are fine for many things, but monitoring components can lag. For a VPS you care about, use distro packages where they serve well (node_exporter, Prometheus, Alertmanager below) and the vendor’s own repo where the distro lags (Grafana), so you get timely updates and predictable file layouts.
1) Create a dedicated directory layout
sudo mkdir -p /etc/prometheus /var/lib/prometheus /etc/alertmanager
sudo chown -R prometheus:prometheus /var/lib/prometheus 2>/dev/null || true
We’ll let packages create users where appropriate; the command above is safe even if prometheus doesn’t exist yet.
2) Install node_exporter
Keep node_exporter boring: small, stable, and always running.
sudo apt update
sudo apt install -y prometheus-node-exporter
Verify it’s up and listening on 9100:
systemctl status prometheus-node-exporter --no-pager
ss -ltnp | grep 9100
You should see a LISTEN line similar to:
LISTEN 0 4096 0.0.0.0:9100 ... prometheus-node-exporter
3) Install Prometheus
sudo apt install -y prometheus
Prometheus typically listens on 9090. We’ll bind it to localhost shortly.
4) Install Grafana from Grafana’s APT repo
sudo apt install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana
Start it:
sudo systemctl enable --now grafana-server
systemctl status grafana-server --no-pager
5) Install Alertmanager
On some Ubuntu versions Alertmanager ships as prometheus-alertmanager.
sudo apt install -y prometheus-alertmanager
Check status:
systemctl status prometheus-alertmanager --no-pager
Configure Prometheus scraping (and keep it private)
We’ll scrape node_exporter and Prometheus itself, then make Prometheus local-only so it isn’t reachable from the public internet.
1) Edit /etc/prometheus/prometheus.yml
sudo cp /etc/prometheus/prometheus.yml /etc/prometheus/prometheus.yml.bak.$(date +%F)
sudo nano /etc/prometheus/prometheus.yml
Use a minimal config like this (adjust comments to taste):
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["127.0.0.1:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["127.0.0.1:9100"]
Restart Prometheus:
sudo systemctl restart prometheus
sudo journalctl -u prometheus -n 50 --no-pager
2) Bind Prometheus to 127.0.0.1
The Prometheus systemd unit usually reads extra flags from /etc/default/prometheus or a systemd drop-in. First, check what your unit is doing:
sudo systemctl cat prometheus | sed -n '1,120p'
If your Prometheus uses /etc/default/prometheus, set:
sudo nano /etc/default/prometheus
Append or adjust:
ARGS="--web.listen-address=127.0.0.1:9090"
If your unit doesn’t use /etc/default, create a drop-in instead:
sudo systemctl edit prometheus
Add:
[Service]
ExecStart=
ExecStart=/usr/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus \
    --web.listen-address=127.0.0.1:9090
Reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart prometheus
ss -ltnp | grep 9090
You want to see 127.0.0.1:9090, not 0.0.0.0:9090.
Add Alertmanager and a small, useful rule set
Alert fatigue isn’t a rite of passage. On a single VPS you mostly care about two categories: “something is down” and “something is running out.”
1) Configure Alertmanager (email example)
Edit:
sudo cp /etc/alertmanager/alertmanager.yml /etc/alertmanager/alertmanager.yml.bak.$(date +%F)
sudo nano /etc/alertmanager/alertmanager.yml
Example (use your SMTP provider; many teams use a transactional account dedicated to alerts):
global:
  smtp_smarthost: "smtp.examplemail.net:587"
  smtp_from: "alerts@yourdomain.example"
  smtp_auth_username: "alerts@yourdomain.example"
  smtp_auth_password: "REPLACE_WITH_APP_PASSWORD"

route:
  receiver: "email-oncall"
  group_by: ["alertname", "instance"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: "email-oncall"
    email_configs:
      - to: "oncall@yourdomain.example"
        send_resolved: true
Restart Alertmanager:
sudo systemctl restart prometheus-alertmanager
sudo journalctl -u prometheus-alertmanager -n 50 --no-pager
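Before relying on the restart to surface mistakes, you can lint the file first. On Ubuntu the amtool utility normally ships alongside Alertmanager (a quick sketch; the binary name or path may differ on other distros):

```shell
# Validate Alertmanager config syntax without restarting anything.
amtool check-config /etc/alertmanager/alertmanager.yml
```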
2) Create Prometheus alert rules
Create a new rules file:
sudo nano /etc/prometheus/rules-vps.yml
Paste a focused set (tune thresholds to your box):
groups:
  - name: vps.rules
    rules:
      - alert: NodeExporterDown
        expr: up{job="node"} == 0
        for: 2m
        labels:
          severity: page
        annotations:
          summary: "node_exporter is down"
          description: "Prometheus cannot scrape node_exporter on {{ $labels.instance }}"

      - alert: HighCPUUsage
        expr: (1 - avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.90
        for: 10m
        labels:
          severity: warn
        annotations:
          summary: "CPU > 90% for 10m"
          description: "Sustained CPU pressure on {{ $labels.instance }}"

      - alert: LowMemoryAvailable
        expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) < 0.10
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Memory available < 10%"
          description: "Likely OOM risk on {{ $labels.instance }}"

      - alert: DiskSpaceLowRoot
        expr: (node_filesystem_avail_bytes{mountpoint="/",fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{mountpoint="/",fstype!~"tmpfs|overlay"}) < 0.12
        for: 15m
        labels:
          severity: page
        annotations:
          summary: "Root disk free < 12%"
          description: "Clean up disk or grow volume on {{ $labels.instance }}"

      - alert: DiskWillFillSoonRoot
        expr: predict_linear(node_filesystem_free_bytes{mountpoint="/",fstype!~"tmpfs|overlay"}[6h], 24*3600) < 0
        for: 30m
        labels:
          severity: warn
        annotations:
          summary: "Root disk on track to fill within ~24h"
          description: "Write rate suggests {{ $labels.instance }} will run out of space soon"

      - alert: HighNetworkRetransmits
        expr: rate(node_netstat_Tcp_RetransSegs[5m]) > 50
        for: 10m
        labels:
          severity: warn
        annotations:
          summary: "High TCP retransmits"
          description: "Potential packet loss or congestion on {{ $labels.instance }}"
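Before wiring the rules file into Prometheus, it’s worth linting it; promtool ships with the Prometheus package (a sketch, assuming the Ubuntu package paths used throughout this guide):

```shell
# Lint the rules file: catches YAML and PromQL syntax errors up front.
promtool check rules /etc/prometheus/rules-vps.yml

# After editing prometheus.yml, this also validates any referenced rule files.
promtool check config /etc/prometheus/prometheus.yml
```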
Now wire rules + Alertmanager into Prometheus. Edit /etc/prometheus/prometheus.yml:
sudo nano /etc/prometheus/prometheus.yml
Add:
rule_files:
  - /etc/prometheus/rules-vps.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["127.0.0.1:9093"]
Restart Prometheus:
sudo systemctl restart prometheus
If you want a deeper playbook for disk pressure, this pairs well with VPS disk space troubleshooting (2026).
Grafana: add Prometheus datasource and import a sane dashboard
Grafana’s UI shifts from release to release, but the steps below stay basically the same.
1) Secure initial access
Grafana listens on port 3000 by default. If inbound access isn’t restricted yet, don’t leave 3000 open to the internet while you “get around to it.”
The quickest safe option is an SSH tunnel from your laptop:
ssh -L 3000:127.0.0.1:3000 youruser@your-vps-ip
Then open http://localhost:3000 locally. Log in and change the admin password immediately.
2) Add Prometheus datasource
- Grafana → Connections → Data sources → Add data source → Prometheus
- URL: http://127.0.0.1:9090
- Save & test
You should get “Data source is working”. If it fails, confirm Prometheus is bound to 127.0.0.1:9090 and the service is running.
3) Import a node_exporter dashboard
Dashboard IDs come and go, so don’t bake one into your runbook. Pick a maintained “Node Exporter Full”-style dashboard from Grafana’s dashboard library and confirm it’s built on the metrics you actually have:
- node_cpu_seconds_total
- node_memory_MemAvailable_bytes
- node_filesystem_avail_bytes
After importing, set the dashboard variable job to node (or whatever your scrape job is named).
Expose Grafana safely (two options)
You’ll usually land on one of these patterns:
- Private-only Grafana over SSH or a VPN (smallest attack surface).
- Public HTTPS Grafana behind Nginx/Caddy with authentication and TLS.
If you want private admin access without opening extra ports, pair this with Tailscale VPS VPN setup (2026).
Option A: keep Grafana private and close port 3000
Use UFW to only allow SSH, and bind Grafana to localhost.
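A minimal UFW baseline might look like this; it assumes SSH on the default port 22, so adjust the allow rule first if you’ve moved SSH, or you can lock yourself out:

```shell
# Deny everything inbound by default, then re-allow SSH before enabling.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw enable
sudo ufw status verbose
```

Note that port 3000 is deliberately absent: with Grafana bound to 127.0.0.1 below, nothing else needs to be open.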
Edit /etc/grafana/grafana.ini:
sudo nano /etc/grafana/grafana.ini
Set:
[server]
http_addr = 127.0.0.1
http_port = 3000
Restart:
sudo systemctl restart grafana-server
ss -ltnp | grep 3000
Expected: 127.0.0.1:3000.
Option B: publish Grafana behind Nginx with HTTPS
If you already run Nginx as a reverse proxy, give Grafana its own vhost and put it behind TLS plus basic auth. If you need a clean multi-app proxy baseline first, see Nginx reverse proxy on a VPS (2026).
Install Nginx and htpasswd tooling:
sudo apt install -y nginx apache2-utils
Create credentials:
sudo htpasswd -c /etc/nginx/.htpasswd-grafana opsadmin
Create /etc/nginx/sites-available/grafana.ops.example:
server {
    listen 80;
    server_name grafana.ops.example;

    location / {
        auth_basic "Grafana";
        auth_basic_user_file /etc/nginx/.htpasswd-grafana;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://127.0.0.1:3000;
    }
}
Enable and reload:
sudo ln -s /etc/nginx/sites-available/grafana.ops.example /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
Then obtain TLS certificates (Let’s Encrypt) using Certbot if you want public HTTPS. Keep Grafana itself bound to localhost, and only expose 80/443.
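One way to do that is Certbot’s Nginx plugin, sketched below; the domain is the example vhost name from above, so substitute your own:

```shell
# Obtain a certificate and let Certbot add TLS config to the vhost.
sudo apt install -y certbot python3-certbot-nginx
sudo certbot --nginx -d grafana.ops.example

# Confirm automatic renewal is wired up correctly.
sudo certbot renew --dry-run
```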
Verification: prove metrics, dashboards, and alerts work
Don’t stop at “the services are running.” You want to confirm the whole chain: exporter → Prometheus → Grafana → Alertmanager.
1) Verify Prometheus targets
From the VPS:
curl -s http://127.0.0.1:9090/-/healthy && echo
curl -s http://127.0.0.1:9090/api/v1/targets | head
Expected: the health endpoint returns a short “Healthy” message. The targets JSON should include the node job with "health":"up".
2) Verify a query returns data
curl -sG http://127.0.0.1:9090/api/v1/query --data-urlencode 'query=up{job="node"}'
Expected: a JSON payload with value ending in "1".
3) Verify Grafana can query Prometheus
In Grafana’s Explore view, run:
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
You should get a CPU usage time series back.
4) Trigger a test alert (safe)
Create a temporary alert that always fires, confirm delivery, then delete it. That proves the delivery path without putting artificial load on the server.
Add to /etc/prometheus/rules-vps.yml:
      - alert: AlertPipelineTest
        expr: vector(1)
        for: 1m
        labels:
          severity: warn
        annotations:
          summary: "Test alert"
          description: "If you received this, Prometheus → Alertmanager delivery works."
Restart Prometheus and watch Alertmanager logs:
sudo systemctl restart prometheus
sudo journalctl -u prometheus-alertmanager -f
After you receive the email, remove the test alert and restart Prometheus again.
Common pitfalls (and how to avoid them)
- Prometheus scraping 0.0.0.0 targets: if you bind services to localhost but mix localhost and 127.0.0.1 in configs, you’ll waste time on avoidable debugging. Use 127.0.0.1 everywhere.
- Disk alerts firing instantly on small volumes: tune to your disk reality. On a 30 GB root volume, “12% free” is ~3.6 GB. That may be fine or it may be a crisis, depending on logs and containers.
- Grafana exposed publicly with weak auth: don’t expose port 3000 directly. Put it behind HTTPS + basic auth (or SSO), and restrict access by IP/VPN if you can.
- Email alerts silently failing: SMTP auth issues often show up only in logs. Check journalctl -u prometheus-alertmanager and validate delivery with the temporary rule.
- node_exporter metrics too broad: multiple disks and mounts can make dashboards look “wrong.” Filter filesystem panels to real mountpoints and exclude tmpfs/overlay.
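To sanity-check a percent-based disk threshold against your actual volume, a quick back-of-envelope helper (the values are examples; plug in your own disk size and alert threshold):

```shell
# How many GB are still free when a percent-free alert fires?
size_gb=30     # volume size
pct_free=12    # alert threshold from the rules file
awk -v s="$size_gb" -v p="$pct_free" \
  'BEGIN { printf "alert fires with %.1f GB still free\n", s * p / 100 }'
```

If that number is smaller than one day of log and container growth, raise the threshold or shorten the `for:` duration.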
Rollback plan (clean and quick)
If you decide this isn’t the direction you want, you can remove it cleanly.
- Stop services:
sudo systemctl disable --now grafana-server prometheus prometheus-node-exporter prometheus-alertmanager
- Remove packages:
sudo apt remove --purge -y grafana prometheus prometheus-node-exporter prometheus-alertmanager
sudo apt autoremove -y
- Remove config/data directories (only if you’re sure):
sudo rm -rf /etc/prometheus /var/lib/prometheus /etc/alertmanager /var/lib/grafana
If you changed Nginx, remove the Grafana vhost file and reload Nginx.
Next steps: make it production-grade without getting complex
- Add log hygiene so metrics aren’t your only early warning. Start with VPS log rotation best practices in 2026.
- Pin a few service-level metrics: HTTP 5xx rate, request latency, queue depth. Skip the 50-panel sprawl; add the five graphs you’ll use in an incident.
- Backups matter even for monitoring. If dashboards are critical, back up /var/lib/grafana/grafana.db.
- Scale up thoughtfully: if you need months of retention, move long-term storage to a dedicated system. If you’re already planning a bigger footprint, consider a larger managed VPS hosting plan so upgrades, patching, and monitoring upkeep don’t eat your engineering time.
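The Grafana backup step can be sketched with a small hypothetical backup_db helper; the paths in the usage comment are the Ubuntu package defaults, and stopping Grafana first gives you a consistent copy of the SQLite file:

```shell
# Hypothetical helper: copy a file into a backup directory with a date suffix.
backup_db() {
  src="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir"
  cp -a "$src" "$dest_dir/$(basename "$src").$(date +%F)"
}

# On the VPS (stop Grafana first for a consistent snapshot):
#   sudo systemctl stop grafana-server
#   backup_db /var/lib/grafana/grafana.db /var/backups/grafana
#   sudo systemctl start grafana-server
```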
Summary
Linux VPS monitoring with Prometheus and Grafana stays useful when you keep it small, keep it private by default, and tune it to the one box you run. Start with node_exporter, a short alert list, and one dashboard you trust. You’ll catch disk pressure, memory exhaustion, and network trouble earlier—without turning your inbox into a firehose.
If you’re setting this up on a new server, pick a VPS with enough headroom for both the app and the monitoring stack. HostMyCode offers Affordable & Reliable Hosting with plans that fit single-instance production services and the operational tooling that comes with them.
If you want monitoring you can actually depend on, run it on a VPS with predictable performance and enough memory for Grafana to breathe. Start with a HostMyCode VPS, and if you’d rather offload routine ops, look at managed VPS hosting for patching and baseline hardening.
FAQ
Should I run Prometheus and Grafana on the same VPS I’m monitoring?
For a single VPS, yes—this is common and perfectly reasonable. If the box goes down, you lose monitoring too, but you also keep the setup simple. Once you have two or more servers, move monitoring to a separate instance.
How much RAM do Prometheus and Grafana need in 2026?
On one VPS with a 15s scrape interval and a handful of targets, Prometheus often sits around ~300–600 MB RAM. Grafana varies, but budgeting ~300–700 MB is usually safe. Leave at least 1–2 GB of headroom for your app and the OS page cache.
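To see what your own stack actually uses, one rough check is to list the largest resident-set sizes on the box (assumes a procps-style ps; RSS overstates shared memory slightly, so treat it as an upper bound):

```shell
# Top 5 processes by resident memory, converted from KB to MB.
ps -eo rss,comm --sort=-rss | awk 'NR > 1 && NR <= 6 { printf "%-20s %8.0f MB\n", $2, $1 / 1024 }'
```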
Is it safe to expose Grafana on the public internet?
It can be, but don’t expose port 3000 directly. Put Grafana behind HTTPS, add authentication (basic auth or SSO), and restrict access by IP/VPN where possible. For most teams, an SSH tunnel is the simplest secure default.
Why are my disk alerts noisy?
Noise usually comes from alerting on the wrong mountpoint or including tmpfs/overlay mounts. Filter filesystem metrics to your real volumes, then tune thresholds to your disk size and growth rate.
What’s the fastest way to test that alerts deliver?
Add a temporary always-firing rule (vector(1)), wait for delivery, then delete it. It verifies the whole pipeline without stressing the server.