
Your VPS can be “up” and still be failing users. CPU looks fine, disk has space, and yet the API drags every few minutes. That’s usually not a single-metric problem. You need signals that line up: metrics to spot trends, logs to explain what happened, and traces to show where time is actually going.
This post walks through VPS monitoring with OpenTelemetry Collector on one Linux host using a setup that’s simple enough to run, but flexible enough to grow. You’ll collect host metrics, ingest journald logs, optionally accept application traces over OTLP, and export everything in open formats. You pick the backend. The collector is just the plumbing.
What you’ll build (a concrete 2026-friendly telemetry pipeline)
Scenario: you run a small SaaS API on a Debian 13 VPS (the same approach works on Ubuntu 24.04 and Rocky/AlmaLinux 10; paths and packages vary). The API listens on 127.0.0.1:9001 behind a reverse proxy. You want:
- Metrics: CPU, memory, filesystem, network, and basic process health.
- Logs: systemd-journald logs for your app service and reverse proxy.
- Traces (optional but recommended): OTLP over gRPC/HTTP so your app can emit spans.
- Security: collector binds locally; you forward via an authenticated channel.
You’ll run the collector as a systemd service, with its configuration at /etc/otelcol-contrib/config.yaml. If you enable trace ingestion, it listens only on localhost ports 4317 (OTLP gRPC) and 4318 (OTLP HTTP).
Prerequisites (keep it boring on purpose)
- A Linux VPS with sudo (examples assume Debian 13); 1 vCPU / 1–2 GB RAM is enough for light workloads.
- Outbound HTTPS allowed (to export telemetry).
- systemd-based distro (journald available).
- If you’re already hardening: UFW/nftables rules, SSH keys, and a patched kernel. If not, start with UFW Firewall Setup for a VPS in 2026.
If you want a steady environment for an always-on agent, a HostMyCode VPS is a clean fit: predictable resources, full root access, and a straightforward path from one node to a small fleet.
VPS monitoring with OpenTelemetry Collector: why the collector is the right hinge point
You can run three separate agents (one each for logs, metrics, and traces). It’s fine at first. Then you need to redact secrets, normalize labels, drop noisy fields, or send data to two destinations during a migration. That’s when “fine” turns into a mess.
The OpenTelemetry Collector gives you one place to manage the flow:
- Collect from standard sources (host metrics, journald, OTLP).
- Process and normalize (add resource attributes like `service.name`, trim junk fields).
- Export to the backends you choose, and switch later without re-instrumenting apps.
One pipeline is also easier to debug than three unrelated agents with three different config styles.
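Conceptually, every collector config has the same shape: receivers feed processors, processors feed exporters, and `service.pipelines` wires them together per signal. A minimal sketch of that shape (the endpoint is a placeholder, not a real backend):

```yaml
# Minimal pipeline shape: one receiver, one processor, one exporter.
receivers:
  hostmetrics:
    scrapers:
      cpu: {}
processors:
  batch: {}
exporters:
  otlp:
    endpoint: "collector-backend.example.net:4317"  # placeholder
service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [batch]
      exporters: [otlp]
```

The full config later in this post is this same shape with more receivers, processors, and pipelines filled in.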
Architecture choices (and what this post assumes)
For a single VPS, you typically land on one of these:
- Collector on the same VPS exporting to a remote backend (OpenSearch/Elasticsearch, Grafana Cloud, Tempo, Loki, Prometheus remote_write, etc.).
- Collector on the VPS exporting to a second “observability” VPS you control.
This article assumes option #1: one server runs the app, and telemetry ships out.
If you’re building your own logging/metrics stack, you’ll still get value from the patterns in this Vector + OpenSearch guide. The collector can live alongside Vector, but you usually don’t want both tools doing the same shipping job.
Install OpenTelemetry Collector Contrib on Debian 13
On Debian, the cleanest route is the official OpenTelemetry Collector Contrib release artifact. Install it under /opt/otelcol-contrib and run it as a dedicated system user.
- Create a user and directories:

sudo useradd --system --home /var/lib/otelcol --shell /usr/sbin/nologin otelcol
sudo install -d -o otelcol -g otelcol /var/lib/otelcol
sudo install -d -o root -g root /etc/otelcol-contrib
- Download a current contrib build (pick the latest stable from upstream releases in 2026). The version below is a placeholder; swap it for what you’re deploying:

OTEL_VER="0.128.0"
ARCH="amd64"   # use arm64 if needed
curl -fsSL -o /tmp/otelcol-contrib.deb \
  "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VER}/otelcol-contrib_${OTEL_VER}_linux_${ARCH}.deb"
sudo dpkg -i /tmp/otelcol-contrib.deb

Expected output (varies): dpkg installs the binary and sometimes a default systemd unit. If your package ships its own service user, keep that. If not, you’ll add a unit below.
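If you script upgrades across several hosts, it helps to derive the artifact URL from the version and the machine’s architecture instead of hard-coding both. A small sketch, with the version still a placeholder you pin per deployment:

```shell
#!/bin/sh
# Derive the .deb download URL from a version and the local architecture.
# OTEL_VER is a placeholder; pin it to the release you actually deploy.
OTEL_VER="0.128.0"
ARCH="$(dpkg --print-architecture 2>/dev/null || uname -m)"
# Map uname output to the release naming when dpkg isn't available.
case "$ARCH" in
  x86_64)  ARCH="amd64" ;;
  aarch64) ARCH="arm64" ;;
esac
URL="https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v${OTEL_VER}/otelcol-contrib_${OTEL_VER}_linux_${ARCH}.deb"
echo "$URL"
```

Feed the printed URL into the curl command above, or into your configuration management tool of choice.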
Configure the collector (metrics + journald logs, with safe defaults)
This is where many guides go off the rails by trying to cover every option at once. Start small. Add receivers and processors only when you can explain why you need them.
Create /etc/otelcol-contrib/config.yaml:
sudo tee /etc/otelcol-contrib/config.yaml > /dev/null <<'YAML'
receivers:
  hostmetrics:
    collection_interval: 20s
    scrapers:
      cpu: {}
      memory: {}
      filesystem:
        include_mount_points:
          match_type: strict
          mount_points: ["/", "/var"]
      disk: {}
      network: {}
      load: {}
  journald:
    directory: /var/log/journal
    units:
      - ssh.service
      - nginx.service
      - myapi.service
    priority: info

processors:
  memory_limiter:
    check_interval: 2s
    limit_percentage: 70
    spike_limit_percentage: 20
  batch:
    timeout: 5s
    send_batch_size: 2048
  resource:
    attributes:
      - key: hostmycode.env
        value: "prod"
        action: upsert
      - key: hostmycode.role
        value: "api-vps"
        action: upsert

exporters:
  # Replace this with your real destination.
  # Option A: export to an OTLP endpoint (common for modern stacks)
  otlp:
    endpoint: "otel-gateway.example.net:4317"
    tls:
      insecure: false
    headers:
      Authorization: "Bearer REPLACE_ME"
  # Safety valve: keep a small local copy of logs for troubleshooting export outages.
  file:
    path: /var/lib/otelcol/otel-fallback.json
    rotation:
      max_megabytes: 25
      max_days: 3
      max_backups: 3

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp]
    logs:
      receivers: [journald]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp, file]
YAML
A few choices here are deliberate:
- `filesystem` mount filtering avoids scraping ephemeral mounts that inflate dashboards and alerts.
- `memory_limiter` prevents the collector from being the reason your VPS starts swapping.
- The `file` exporter leaves a short breadcrumb trail when your remote endpoint is down. Treat it as a troubleshooting aid, not a backup strategy.
Run it under systemd (with hardening options you’ll actually keep)
First, see whether the package already shipped a unit:
systemctl status otelcol-contrib --no-pager
If it exists, inspect it and adjust paths as needed. If it doesn’t, create your own:
sudo tee /etc/systemd/system/otelcol-contrib.service > /dev/null <<'UNIT'
[Unit]
Description=OpenTelemetry Collector Contrib
After=network-online.target
Wants=network-online.target
[Service]
User=otelcol
Group=otelcol
ExecStart=/usr/bin/otelcol-contrib --config=/etc/otelcol-contrib/config.yaml
Restart=on-failure
RestartSec=3
# Practical hardening; avoid breaking journald access.
NoNewPrivileges=true
PrivateTmp=true
ProtectHome=true
ProtectSystem=strict
ReadWritePaths=/var/lib/otelcol
# journald read access: the service user must read /var/log/journal
# Add otelcol to systemd-journal group below.
[Install]
WantedBy=multi-user.target
UNIT
Add the collector user to systemd-journal so the journald receiver can read logs:
sudo usermod -aG systemd-journal otelcol
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable --now otelcol-contrib
sudo systemctl status otelcol-contrib --no-pager
Expected output: active (running). If it fails, go straight to the service logs:
journalctl -u otelcol-contrib -n 80 --no-pager
Verification: prove you’re collecting (not just “running”)
Service status only tells you the process didn’t crash. You still need to confirm ingestion and export.
- Confirm the collector reads journald by forcing a log line:

sudo systemctl restart ssh
journalctl -u otelcol-contrib -n 50 --no-pager

You should see receiver activity and exporter attempts. If you configured a `file` exporter, confirm it’s writing:

sudo ls -lh /var/lib/otelcol/otel-fallback.json
sudo tail -n 2 /var/lib/otelcol/otel-fallback.json

Expected output: JSON lines with log records and journald fields.
- Confirm the host metrics pipeline is active:

journalctl -u otelcol-contrib -n 120 --no-pager | grep -E "hostmetrics|metrics" | tail -n 20

You’re looking for scrape activity roughly every 20 seconds.
- Confirm export success:

journalctl -u otelcol-contrib -n 200 --no-pager | grep -E "export|otl" | tail -n 30

Expected output: no repeating TLS or auth errors. If you see a steady stream of exporter failures, you’re collecting data but not delivering it.
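If you want to watch records flow before trusting a remote backend, contrib builds ship a `debug` exporter that prints telemetry to the collector’s own stdout (and therefore its journal). A temporary sketch; remove it once the pipeline is verified, because it gets noisy fast:

```yaml
exporters:
  debug:
    verbosity: basic   # "detailed" dumps full records; very noisy

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp, debug]
```

With this in place, `journalctl -u otelcol-contrib -f` shows records as they pass through, which separates “not collecting” from “not exporting” in seconds.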
Add trace ingestion (optional): accept OTLP locally and forward upstream
If you run a Node.js, Python, or Go service, sending traces to localhost keeps egress simple and keeps credentials out of app config. This is also where correlation starts paying rent: a latency spike becomes a trace, and the trace points you to the log line that explains it.
Edit /etc/otelcol-contrib/config.yaml and add an OTLP receiver:
sudo perl -0777 -i -pe 's/receivers:\n/receivers:\n otlp:\n protocols:\n grpc:\n endpoint: 127.0.0.1:4317\n http:\n endpoint: 127.0.0.1:4318\n\n/' /etc/otelcol-contrib/config.yaml
Then add a traces pipeline. Since service.pipelines is the last block in the config file, appending with matching indentation nests it correctly under pipelines:

sudo tee -a /etc/otelcol-contrib/config.yaml > /dev/null <<'YAML'
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, batch]
      exporters: [otlp]
YAML
Restart and confirm the sockets are listening:
sudo systemctl restart otelcol-contrib
sudo ss -ltnp | grep -E ":4317|:4318"
Expected output includes lines like:
LISTEN 0 4096 127.0.0.1:4317 ... otelcol-contrib
LISTEN 0 4096 127.0.0.1:4318 ... otelcol-contrib
If you already run a reverse proxy, keep these ports localhost-only. Don’t expose them to the internet.
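With the receiver listening, you can hand-roll a throwaway span over OTLP/HTTP to prove ingestion end to end before touching app code. The JSON below is a minimal hand-written OTLP payload (the service name, span name, IDs, and timestamps are made up); `/v1/traces` is the standard OTLP/HTTP path:

```shell
#!/bin/sh
# Minimal OTLP/HTTP smoke test against the localhost receiver.
# Trace/span IDs are arbitrary hex; timestamps are fixed placeholders.
PAYLOAD='{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"smoke-test"}}]},"scopeSpans":[{"spans":[{"traceId":"5b8aa5a2d2c872e8321cf37308d69df2","spanId":"051581bf3cb55c13","name":"smoke","kind":1,"startTimeUnixNano":"1714000000000000000","endTimeUnixNano":"1714000000250000000"}]}]}]}'

# A 2xx response with an empty JSON body means the collector accepted the span.
curl -sS -X POST http://127.0.0.1:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "collector not reachable on 127.0.0.1:4318"
```

If the span shows up in your backend tagged `smoke-test`, the whole path (local receiver, processors, exporter, credentials) is working.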
Correlation tip: add service.name and deployment labels at the collector
One of the quickest ways to make traces feel useless is letting everything show up as unknown_service. You can set this in each app, but setting it at the collector gives you consistent naming across services and environments.
Add attributes to the resource processor:
processors:
  resource:
    attributes:
      - key: service.name
        value: "myapi"
        action: upsert
      - key: deployment.environment
        value: "production"
        action: upsert
Restart after changes:
sudo systemctl restart otelcol-contrib
Operational checklist: log retention, disk pressure, and “agent eating the box”
Telemetry agents behave right up until they don’t. Keep these checks in your runbook:
- Disk usage: if you keep the fallback file exporter, make sure it can’t grow without bounds. The rotation settings above cap it.
- Journald growth: if you lean on journald, set limits in `/etc/systemd/journald.conf`. Pair that with HostMyCode’s log rotation best practices.
- CPU spikes: batch + memory limiting prevents a lot of pain, but heavy log processing (especially regex) can still get expensive.
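As a concrete starting point for those journald limits, the values below are illustrative caps, not recommendations for every workload; size them to your disk budget and run `sudo systemctl restart systemd-journald` afterwards:

```ini
# /etc/systemd/journald.conf (a drop-in under journald.conf.d/ also works)
[Journal]
SystemMaxUse=500M        # cap total persistent journal size
SystemMaxFileSize=50M    # cap individual journal files
MaxRetentionSec=14day    # drop entries older than two weeks
```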
If you run containers on the VPS, watch cgroups v2 behavior; it affects both your app and your telemetry. This deep dive on cgroups v2 performance tuning is useful context.
Common pitfalls (the ones that waste hours)
- Collector can’t read journald: the `otelcol` user isn’t in `systemd-journal`, or journald logs live in persistent storage you didn’t enable. Fix the group membership and ensure `/var/log/journal` exists.
- Export fails due to TLS: your endpoint uses a private CA. Update the system trust store or configure `ca_file` in TLS settings.
- Auth header silently wrong: some gateways expect `Authorization: Bearer ...`; others want an API key header like `X-API-Key`. Check backend docs and watch for 401/403 in collector logs.
- Noisy log volume: collecting all journald logs on a busy VPS gets expensive fast. Start with a few units (like the example) and expand on purpose.
- Exposed OTLP ports: don’t bind `0.0.0.0` unless you also enforce firewall rules and authentication. Localhost-only is the default for good reason.
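For the private-CA case, the exporter’s TLS client settings accept a CA bundle directly. The path below is a placeholder for wherever you keep your CA certificate:

```yaml
exporters:
  otlp:
    endpoint: "otel-gateway.example.net:4317"
    tls:
      insecure: false
      ca_file: /etc/otelcol-contrib/certs/private-root-ca.pem  # placeholder path
```

Make sure the file is readable by the `otelcol` user, or the exporter will fail at startup rather than at connect time.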
Rollback plan (get back to a clean server quickly)
If you need to back out, do it in this order so you don’t leave a broken service behind:
- Stop and disable the service:

sudo systemctl disable --now otelcol-contrib

- Remove the systemd unit (if you created it):

sudo rm -f /etc/systemd/system/otelcol-contrib.service
sudo systemctl daemon-reload

- Remove config and state:

sudo rm -rf /etc/otelcol-contrib
sudo rm -rf /var/lib/otelcol

- Uninstall the package (if installed via .deb):

sudo dpkg -r otelcol-contrib || true

- Remove the user (optional):

sudo userdel otelcol || true
Hardening and resilience upgrades (small changes, big stability)
Once ingestion is clean, you can make the setup tougher without changing the overall shape:
- Add a disk-backed sending queue (if your exporter supports it) so short outages don’t automatically mean dropped data.
- Ship from a bastion or gateway if policy blocks direct egress. If you want a clean access pattern across servers, read this bastion host setup for operational patterns that pair well with a central telemetry gateway.
- Backups still matter: telemetry isn’t your backup plan. If you rely on local fallback logs for debugging, keep your real backups separate. HostMyCode’s snapshot backup automation guide is a solid baseline.
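The disk-backed queue is a small config change in contrib builds: the `file_storage` extension persists the exporter’s sending queue, so a collector restart or a short backend outage doesn’t automatically drop buffered data. A sketch, assuming the exporter layout from earlier in this post:

```yaml
extensions:
  file_storage:
    # Must exist and be writable by the otelcol user; it already is
    # if you kept ReadWritePaths=/var/lib/otelcol in the unit.
    directory: /var/lib/otelcol/queue

exporters:
  otlp:
    endpoint: "otel-gateway.example.net:4317"
    sending_queue:
      enabled: true
      storage: file_storage   # back the queue with the extension above

service:
  extensions: [file_storage]
```

Create the directory first (`sudo install -d -o otelcol -g otelcol /var/lib/otelcol/queue`) and validate before restarting.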
If you want this setup to stay stable, start with infrastructure that doesn’t surprise you. A HostMyCode VPS gives you consistent performance for always-on collectors, and managed VPS hosting helps if you’d rather not spend your time on agent updates, systemd units, and patch cycles.
FAQ
Do I need OpenTelemetry if I only care about logs?
No. If logs are the only thing you need, a dedicated log shipper can be simpler. The collector earns its keep when you expect to add metrics and traces later without replacing tooling.
Should I expose OTLP ports publicly for remote apps?
On a single VPS, avoid it. Bind OTLP to 127.0.0.1 and have local apps send spans. If you must accept remote OTLP, put it behind mTLS or a private network plus strict firewall rules.
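If you do need remote OTLP, the receiver’s server-side TLS settings support requiring client certificates (mutual TLS). Certificate paths below are placeholders; pair this with strict firewall rules rather than relying on TLS alone:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /etc/otelcol-contrib/certs/server.crt
          key_file: /etc/otelcol-contrib/certs/server.key
          # Setting a client CA turns on mutual TLS: only clients
          # presenting certs signed by this CA can connect.
          client_ca_file: /etc/otelcol-contrib/certs/clients-ca.crt
```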
How much resource usage should I expect?
For host metrics plus a few journald units, usage is usually modest (tens of MB of RAM, low CPU). Log volume and heavy processing (regex, transforms, enrichment) are what change the math.
What’s the easiest way to confirm I’m not losing data?
Watch exporter error rates in collector logs, keep a small local fallback log file (as shown), and periodically verify that backend ingestion timestamps match the current time.
Next steps (keep building, don’t rewrite)
- Add alerting on “boring” signals: disk > 85%, memory pressure, and error-rate spikes.
- Instrument one endpoint in your app with OpenTelemetry SDK and confirm trace-log correlation.
- If you expect growth, move exports to a dedicated telemetry gateway node. This reduces blast radius and makes migrations easier.
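Until real alert rules exist in a backend, even a cron-driven threshold check beats nothing. A minimal sketch for the disk alert, using the 85% threshold suggested above:

```shell
#!/bin/sh
# Print an alert line when root filesystem usage crosses the threshold.
# Wire this into cron plus mail or a webhook as a stopgap.
THRESHOLD=85
USED=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
if [ "$USED" -gt "$THRESHOLD" ]; then
  STATUS="ALERT: / is at ${USED}% (threshold ${THRESHOLD}%)"
else
  STATUS="OK: / is at ${USED}% (threshold ${THRESHOLD}%)"
fi
echo "$STATUS"
```

Once your metrics backend is wired up, replace this with a proper alert on the `filesystem` scraper’s utilization metric so thresholds live in one place.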
Summary
VPS monitoring with OpenTelemetry Collector gets you out of dashboard guesswork. One agent collects metrics and logs (and traces if you want them), keeps ingestion ports private, and ships telemetry to the destination you choose. Once it’s running, changing backends doesn’t require redeploying your app.
If you’re setting this up for production workloads, start with a dependable server foundation. A HostMyCode VPS is a solid baseline, and managed VPS hosting can take patching and operational upkeep off your plate.