Observability on VyOS: Logs, Metrics, and Backups That Matter

Your router is infrastructure. It deserves the same observability as any production system. When something breaks at 3 AM, “I don’t know what happened” isn’t an acceptable answer. Logs, metrics, and configuration backups turn mysterious failures into diagnosable incidents.

This guide covers practical observability for VyOS — what to capture, where to store it, and how to use it when things go wrong.

The Logging Strategy

What to Log

Not all logs are equal. High-value logs for troubleshooting:

Log Type | Why It Matters
--- | ---
Firewall drops | Shows blocked traffic, attack attempts, misconfigurations
Interface state changes | Link up/down events, carrier changes
BGP/routing changes | Route flaps, peer state changes
Authentication | SSH login attempts, successful and failed
Configuration changes | Who changed what, when
System errors | Kernel messages, service failures

Basic Logging Setup

configure
# Enable system logging
set system syslog global facility all level 'info'
set system syslog global facility protocols level 'debug'
# Log to local file
set system syslog file messages facility all level 'notice'
set system syslog file auth facility auth level 'info'
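# Firewall log entries are emitted by the kernel, so capture facility 'kern'
set system syslog file firewall facility kern level 'info'
commit
# Persist the change across reboots
save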

Local logs disappear when the router dies. Send logs to a remote syslog server:

configure
# Remote syslog server
set system syslog host 10.0.0.100 facility all level 'info'
set system syslog host 10.0.0.100 protocol 'udp'
set system syslog host 10.0.0.100 port '514'
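# Optional second target over TCP. Note: protocol 'tcp' only changes the
# transport. These commands do not enable TLS, so the traffic is unencrypted
# and the receiver must accept plain TCP on this port
set system syslog host logs.example.com facility all level 'info'
set system syslog host logs.example.com protocol 'tcp'
set system syslog host logs.example.com port '6514'
commit
save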

Popular syslog receivers:

  • rsyslog: Standard Linux syslog daemon
  • Graylog: Full log management platform
  • Loki: Lightweight, Prometheus-style log store (ingests syslog through an agent such as Promtail or Vector)
  • Vector: Modern log aggregation

Firewall Logging

Firewall logs are crucial. Log all drops, and selectively log accepts:

configure
# Log dropped packets
set firewall ipv4 name WAN-TO-LAN default-action 'drop'
set firewall ipv4 name WAN-TO-LAN default-log
# Log specific rules
set firewall ipv4 name WAN-TO-LAN rule 100 action 'drop'
set firewall ipv4 name WAN-TO-LAN rule 100 log
set firewall ipv4 name WAN-TO-LAN rule 100 description 'Log and drop invalid'
set firewall ipv4 name WAN-TO-LAN rule 100 state 'invalid'
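# Log accepted SSH connections (for audit). A rule needs an action and a
# match; matching only new connections avoids one log line per packet
set firewall ipv4 name LAN-LOCAL rule 50 action 'accept'
set firewall ipv4 name LAN-LOCAL rule 50 protocol 'tcp'
set firewall ipv4 name LAN-LOCAL rule 50 destination port '22'
set firewall ipv4 name LAN-LOCAL rule 50 state 'new'
set firewall ipv4 name LAN-LOCAL rule 50 log
set firewall ipv4 name LAN-LOCAL rule 50 description 'Log SSH access'
commit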

Warning: Don't log every accepted packet. Logging on high-traffic rules can overwhelm storage and CPU.

Reading Logs

# Recent logs
show log
# Filtered logs
show log | match firewall
show log | match -i error
# Tail logs in real-time
monitor log
# Specific log file
show log file firewall
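
When a drop rule suddenly becomes noisy, it helps to know which sources are responsible. The following is a minimal sketch that counts dropped packets per source address. It assumes that drop entries carry a rule prefix ending in `-D]` and a `SRC=` field, which should be checked against a few real lines from your router; the function name `top_drop_sources` is hypothetical.

```shell
#!/bin/bash
# top_drop_sources: read firewall log lines on stdin and print the most
# frequent source addresses of dropped packets, one "count address" per line.
# Assumption: drop entries contain "-D]" in the rule prefix and a SRC= field.
top_drop_sources() {
    local limit="${1:-10}"
    grep -F -e '-D]' \
        | grep -o 'SRC=[^ ]*' \
        | cut -d= -f2 \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "$limit" \
        | awk '{ print $1, $2 }'
}

# Usage with a saved log excerpt (hypothetical file name):
#   top_drop_sources 5 < firewall-excerpt.log
```

Working from a saved excerpt rather than the live log keeps the analysis off the router's CPU, which matters most during the incidents where this is useful.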

Metrics and Monitoring

SNMP for Traditional Monitoring

If you have Zabbix, PRTG, LibreNMS, or similar:

configure
# SNMP v2c (simple but less secure; avoid the default 'public' community)
set service snmp community monitoring authorization 'ro'
set service snmp community monitoring network '10.0.0.0/24'
set service snmp listen-address 10.0.0.1
# SNMP v3 (recommended). Enter passwords with 'plaintext-password';
# 'encrypted-password' expects an already-hashed key
set service snmp v3 user monitor auth plaintext-password 'authpassword'
set service snmp v3 user monitor auth type 'sha'
set service snmp v3 user monitor privacy plaintext-password 'privpassword'
set service snmp v3 user monitor privacy type 'aes'
set service snmp v3 user monitor group 'monitor-group'
set service snmp v3 group monitor-group mode 'ro'
set service snmp v3 group monitor-group view 'monitor-view'
# The OID is written without a leading dot
set service snmp v3 view monitor-view oid '1'
commit
save

Prometheus/Node Exporter

For modern monitoring stacks:

If your VyOS release does not provide a built-in Prometheus exporter, you can:

  1. Run node_exporter in a container
  2. Use the Prometheus SNMP exporter against the SNMP service configured above
  3. Export custom metrics with a script, for example /config/scripts/metrics.sh

A simple metrics approach:

#!/bin/bash
# /config/scripts/metrics.sh - run via cron or serve over HTTP
echo "# HELP vyos_interface_rx_bytes Interface received bytes"
echo "# TYPE vyos_interface_rx_bytes counter"
for iface in eth0 eth1 eth2; do
    stats="/sys/class/net/$iface/statistics/rx_bytes"
    # Skip interfaces that do not exist instead of reporting a misleading 0
    [ -r "$stats" ] || continue
    rx=$(cat "$stats")
    echo "vyos_interface_rx_bytes{interface=\"$iface\"} $rx"
done
# Firewall drop counters are not emitted here. Current VyOS releases use
# nftables, so they would be parsed from `sudo nft list ruleset`, not iptables.
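
If the script runs from cron, its output has to land somewhere a scraper can read it. One common pattern is node_exporter's textfile collector, which reads `*.prom` files from a directory. The sketch below writes the metrics atomically so a scrape never sees a half-written file. The function name `write_metrics` and the output path are assumptions, not VyOS defaults.

```shell
#!/bin/bash
# write_metrics: copy stdin to the target file via a temporary file in the
# same directory, then rename it into place. A rename within one filesystem
# is atomic, so readers see either the old file or the complete new one.
write_metrics() {
    local target="$1"
    local tmp
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if cat > "$tmp" && [ -s "$tmp" ]; then
        chmod 644 "$tmp"
        mv -f "$tmp" "$target"
    else
        # Empty or failed output: keep the previous metrics file untouched.
        rm -f "$tmp"
        return 1
    fi
}

# Usage (hypothetical path; point node_exporter's
# --collector.textfile.directory at the same directory):
#   /config/scripts/metrics.sh | write_metrics /var/lib/node_exporter/vyos.prom
```

Refusing to replace the file with empty output means a broken metrics script shows up as stale data rather than as counters that silently vanish.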

Health Checks

Monitor critical functions:

# Interface status
show interfaces
# Routing table
show ip route
# Firewall counters
show firewall
# VPN status
show vpn ipsec sa
show interfaces wireguard
# System resources
show system memory
show system cpu
show system storage

Automate these checks and alert on anomalies.
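
One way to automate part of this is a small script that turns raw numbers into OK, WARNING, or CRITICAL states. The sketch below reads memory figures in `/proc/meminfo` format and classifies a usage percentage. The function names and the 80/95 thresholds (borrowed from the threshold table later in this guide) are illustrative assumptions.

```shell
#!/bin/bash
# mem_used_percent: print used memory as an integer percentage, computed as
# (MemTotal - MemAvailable) / MemTotal from a /proc/meminfo-style file.
mem_used_percent() {
    local file="${1:-/proc/meminfo}"
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END {
             if (total <= 0) exit 1
             printf "%d\n", (total - avail) * 100 / total
         }' "$file"
}

# classify: compare a value against warning and critical thresholds.
classify() {
    local value="$1" warn="$2" crit="$3"
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# Usage: report memory state using the assumed 80/95 thresholds.
#   used=$(mem_used_percent) && echo "memory ${used}% $(classify "$used" 80 95)"
```

Reading `/proc/meminfo` directly keeps the check independent of the output format of `show system memory`, which may differ between releases.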

Configuration Backup

Manual Backup

# Show configuration as set commands
show configuration commands
# Save to file (create the backup directory first)
mkdir -p /config/backup
show configuration commands > /config/backup/config-$(date +%Y%m%d).txt
# Compare configuration files
diff /config/backup/config-old.txt /config/backup/config-new.txt

Automated Backup Script

/config/scripts/backup-config.sh
#!/bin/bash
# Stop on the first error so a failed dump is not treated as a good backup
set -euo pipefail
BACKUP_DIR="/config/backup"
DATE=$(date +%Y%m%d-%H%M)
KEEP_DAYS=30
mkdir -p "$BACKUP_DIR"
# Create backup
/opt/vyatta/bin/vyatta-op-cmd-wrapper show configuration commands > "$BACKUP_DIR/vyos-config-$DATE.txt"
# Clean old backups
find "$BACKUP_DIR" -name "vyos-config-*.txt" -mtime "+$KEEP_DAYS" -delete
# Optional: copy to remote server
# scp "$BACKUP_DIR/vyos-config-$DATE.txt" backup@server:/backups/vyos/

Schedule it:

# The script must be executable
chmod +x /config/scripts/backup-config.sh
configure
set system task-scheduler task backup-config executable path '/config/scripts/backup-config.sh'
set system task-scheduler task backup-config interval '1d'
commit
save
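
A daily job produces many identical files when the configuration rarely changes. As a hedged variation, the sketch below keeps a new backup only when its content differs from the most recent one. The function name `store_if_changed` is hypothetical, and it assumes backups sort chronologically by filename, as the `vyos-config-YYYYMMDD-HHMM.txt` pattern above does.

```shell
#!/bin/bash
# store_if_changed: move a freshly generated config dump into the backup
# directory unless it is byte-identical to the newest existing backup.
# Prints "stored" or "unchanged".
store_if_changed() {
    local new_file="$1" backup_dir="$2" name="$3"
    local latest
    # Newest backup by filename; empty if the directory has none yet.
    latest=$(ls -1 "$backup_dir"/vyos-config-*.txt 2>/dev/null | sort | tail -n 1)
    if [ -n "$latest" ] && cmp -s "$new_file" "$latest"; then
        rm -f "$new_file"
        echo "unchanged"
    else
        mv "$new_file" "$backup_dir/$name"
        echo "stored"
    fi
}

# Usage inside a backup script (paths follow the script above):
#   tmp=$(mktemp)
#   /opt/vyatta/bin/vyatta-op-cmd-wrapper show configuration commands > "$tmp"
#   store_if_changed "$tmp" /config/backup "vyos-config-$(date +%Y%m%d-%H%M).txt"
```

This interacts with age-based cleanup: if the configuration stays unchanged for longer than the retention period, the only copy would eventually be deleted, so the cleanup step should always spare the newest file.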

Remote Backup

Send backups off-device:

#!/bin/bash
# Backup to remote server via SCP
set -euo pipefail
REMOTE_USER="backup"
REMOTE_HOST="10.0.0.100"
REMOTE_PATH="/backups/vyos"
CONFIG_FILE="/tmp/vyos-config-$(date +%Y%m%d).txt"
# Remove the temporary file on exit, whether or not the transfer succeeded
trap 'rm -f "$CONFIG_FILE"' EXIT
# Generate config
/opt/vyatta/bin/vyatta-op-cmd-wrapper show configuration commands > "$CONFIG_FILE"
# Send to remote (a failed transfer now makes the script exit non-zero)
scp -i /config/auth/backup_key "$CONFIG_FILE" "${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}/"
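
A single `scp` attempt fails whenever the backup server or the path to it is briefly down, which is exactly when a router tends to have problems. A minimal retry wrapper, sketched below, makes the transfer more tolerant. The function name `retry` and the attempt and delay values are assumptions to tune.

```shell
#!/bin/bash
# retry: run a command up to <attempts> times, sleeping <delay> seconds
# after the first failure and doubling the delay after each later one.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local n=1
    while true; do
        "$@" && return 0
        if [ "$n" -ge "$attempts" ]; then
            echo "retry: giving up after $n attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))
        n=$((n + 1))
    done
}

# Usage with the variables from the script above:
#   retry 3 10 scp -i /config/auth/backup_key "$CONFIG_FILE" \
#       "${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}/"
```

With three attempts and a 10-second starting delay, the wrapper waits 10 and then 20 seconds, which rides out short interruptions without holding the scheduled task open for long.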

Git-Based Configuration Management

For an infrastructure-as-code approach:

/config/scripts/git-backup.sh
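#!/bin/bash
# Caution: /config holds keys and secrets; use a private repo and a .gitignore
cd /config || exit 1
git add -A
# Commit and push only when something actually changed
if ! git diff --cached --quiet; then
    git commit -m "Config backup $(date +%Y%m%d-%H%M)"
    git push origin main
fi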

Initialize git in /config:

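cd /config
# Name the initial branch 'main' to match the push in the script (git 2.28+)
git init -b main
# Keep private keys and certificates out of the repository
echo 'auth/' >> .gitignore
git remote add origin git@github.com:yourorg/vyos-config.git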

This gives you:

  • Full version history
  • Diff between any versions
  • Blame to see when each line changed (and who changed it, if people commit under their own names rather than through the script)
  • Rollback to any point

Configuration Diff

Always diff before committing changes:

configure
# Make some changes
set interfaces ethernet eth0 description 'NEW-WAN'
# See what would change
compare
# Discard if wrong
discard
# Or commit if correct
commit

For historical comparison:

# Compare running config with saved boot config (in configure mode)
configure
compare saved
exit
# Compare two backup files
diff /config/backup/config-old.txt /config/backup/config-new.txt
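
Plain `diff` output on set-command dumps can be hard to scan when lines were merely reordered. Because each line of `show configuration commands` output is an independent command, comparing the files as sorted sets gives a cleaner summary. The sketch below is one way to do it; the function name `config_changes` is hypothetical.

```shell
#!/bin/bash
# config_changes: print commands that exist only in the new file prefixed
# with "+" and commands that exist only in the old file prefixed with "-".
# Ordering differences between the two files are ignored.
config_changes() {
    local old="$1" new="$2"
    local old_sorted new_sorted
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || { rm -f "$old_sorted"; return 1; }
    sort -u "$old" > "$old_sorted"
    sort -u "$new" > "$new_sorted"
    # comm -13: lines unique to the second file; comm -23: unique to the first.
    comm -13 "$old_sorted" "$new_sorted" | sed 's/^/+ /'
    comm -23 "$old_sorted" "$new_sorted" | sed 's/^/- /'
    rm -f "$old_sorted" "$new_sorted"
}

# Usage:
#   config_changes /config/backup/config-old.txt /config/backup/config-new.txt
```

A changed value appears as one removed and one added line, which makes the output easy to paste into a change log or an alert message.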

Alerting

Simple Email Alerts

/config/scripts/alert.sh
#!/bin/bash
# Assumes a working `mail` command and an SMTP relay reachable from the router
SUBJECT="$1"
MESSAGE="$2"
RECIPIENT="admin@example.com"
echo "$MESSAGE" | mail -s "$SUBJECT" "$RECIPIENT"

Integrate with monitoring:

/config/scripts/wan-monitor.sh
#!/bin/bash
if ! ping -c 3 -W 5 8.8.8.8 > /dev/null 2>&1; then
    /config/scripts/alert.sh "WAN DOWN" "Primary WAN unreachable at $(date)"
fi
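
When scheduled at a fixed interval, this check sends a new alert for every run while the WAN stays down, and it never reports recovery. A common refinement is to remember the last state and alert only on transitions. The sketch below shows the idea; the function name `state_changed` and the state-file path are assumptions.

```shell
#!/bin/bash
# state_changed: record the current state ("up" or "down") in a state file
# and succeed (exit 0) only when it differs from the previously stored one.
# A missing state file is treated as a previous state of "up".
state_changed() {
    local state_file="$1" current="$2"
    local previous="up"
    [ -f "$state_file" ] && previous=$(cat "$state_file")
    echo "$current" > "$state_file"
    [ "$current" != "$previous" ]
}

# Usage in a monitor script (hypothetical state-file path):
#   if ping -c 3 -W 5 8.8.8.8 > /dev/null 2>&1; then state=up; else state=down; fi
#   if state_changed /run/wan-monitor.state "$state"; then
#       /config/scripts/alert.sh "WAN $state" "Primary WAN is $state at $(date)"
#   fi
```

Treating a missing state file as "up" means a reboot with a healthy WAN stays silent, while a reboot into an outage still produces one alert.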

Webhook Alerts (Slack, Discord, PagerDuty)

/config/scripts/webhook-alert.sh
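#!/bin/bash
WEBHOOK_URL="https://hooks.slack.com/services/xxx"
MESSAGE="$1"
# Escape backslashes and double quotes so the payload stays valid JSON
# (newlines in the message are not handled)
ESCAPED=$(printf '%s' "$MESSAGE" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
# -f makes curl exit non-zero on an HTTP error response
curl -fsS -X POST -H 'Content-type: application/json' \
    --data "{\"text\":\"$ESCAPED\"}" \
    "$WEBHOOK_URL"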

What to Monitor

Essential metrics for router health:

Metric | Warning Threshold | Critical Threshold
--- | --- | ---
CPU usage | > 70% | > 90%
Memory usage | > 80% | > 95%
Interface errors | > 0.1% | > 1%
Firewall drops/sec | Depends on baseline | Sudden 10x increase
BGP peer state | Any change | Down
VPN tunnel state | Flapping | Down
Disk usage | > 80% | > 95%
Config changes | Any | Unexpected
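
The interface error thresholds are percentages, but the kernel exposes raw counters, so the rate has to be computed. As a sketch, the function below derives the receive error percentage from the counters under `/sys/class/net/<iface>/statistics`. The function name and the optional base-directory parameter (there to make it testable) are assumptions. It uses totals since boot, whereas a production check would normally compare deltas between two samples.

```shell
#!/bin/bash
# rx_error_percent: print rx_errors as a percentage of rx_packets for one
# interface, with three decimal places. Counters are cumulative since boot.
rx_error_percent() {
    local iface="$1" base="${2:-/sys/class/net}"
    local stats="$base/$iface/statistics"
    local errors packets
    errors=$(cat "$stats/rx_errors") || return 1
    packets=$(cat "$stats/rx_packets") || return 1
    awk -v e="$errors" -v p="$packets" 'BEGIN {
        if (p == 0) { print "0.000"; exit }
        printf "%.3f\n", e * 100 / p
    }'
}

# Usage: flag an interface above the 0.1% warning threshold from the table.
#   pct=$(rx_error_percent eth0)
#   awk -v v="$pct" 'BEGIN { exit !(v > 0.1) }' && echo "eth0 rx errors: ${pct}%"
```

The comparison is done in awk because the shell's own arithmetic is integer-only and cannot compare values such as 0.1.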

Disaster Recovery Checklist

When everything fails, you need:

  1. Configuration backup (tested restore)
  2. Firmware/image backup (same VyOS version)
  3. Documented procedure (how to restore)
  4. Out-of-band access (console, IPMI, if available)

Test Your Backups

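# Periodically test restore on a test instance.
# Note: `load` reads config.boot-style files, as written by `save <path>` in
# configure mode. Backups made with `show configuration commands` are lists
# of set commands; replay those by pasting them into a configure session.
# The file name below is an example of a backup written with `save <path>`.
configure
load /config/backup/vyos-config-latest.boot
compare
# Review changes
commit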

A backup you’ve never tested restoring is not a backup.
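
Between full restore tests, a cheap sanity check catches the most common failure: a backup job that has silently been writing empty or truncated files. The sketch below checks that a set-command backup is non-empty, consists only of `set` lines, and contains a few expected sections. The function name and the required-section list are assumptions to adapt.

```shell
#!/bin/bash
# validate_backup: succeed only if the file is non-empty, every non-blank
# line starts with "set ", and each required top-level section appears.
validate_backup() {
    local file="$1"
    shift
    [ -s "$file" ] || { echo "empty or missing: $file" >&2; return 1; }
    # grep -v prints lines matching neither pattern; any hit is an error.
    if grep -v -e '^set ' -e '^[[:space:]]*$' "$file" > /dev/null; then
        echo "unexpected non-set line in $file" >&2
        return 1
    fi
    local section
    for section in "$@"; do
        grep -q "^set $section " "$file" \
            || { echo "missing section: $section" >&2; return 1; }
    done
    return 0
}

# Usage: require the interfaces, system and firewall sections.
#   validate_backup /config/backup/vyos-config-latest.txt interfaces system firewall
```

This does not replace a real restore test, since it cannot tell whether the commands would commit cleanly on the target VyOS version.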

Complete Observability Setup

configure
# === Syslog ===
set system syslog global facility all level 'info'
set system syslog host 10.0.0.100 facility all level 'info'
set system syslog host 10.0.0.100 protocol 'udp'
set system syslog file messages facility all level 'notice'
# === SNMP ===
set service snmp community monitoring authorization 'ro'
set service snmp community monitoring network '10.0.0.0/24'
set service snmp listen-address 10.0.0.1
set service snmp location 'Network Closet'
set service snmp contact 'admin@example.com'
# === Firewall Logging ===
set firewall ipv4 name WAN-TO-LAN default-log
set firewall ipv4 name WAN-LOCAL default-log
# === Scheduled Tasks ===
set system task-scheduler task backup-config executable path '/config/scripts/backup-config.sh'
set system task-scheduler task backup-config interval '1d'
set system task-scheduler task wan-monitor executable path '/config/scripts/wan-monitor.sh'
set system task-scheduler task wan-monitor interval '5m'
# Apply the changes and persist them across reboots
commit
save

The Lesson

A router without observability is like running production without monitoring — you’ll only know something’s wrong when users complain, and you’ll have no data to diagnose it.

The minimum viable observability:

  1. Remote syslog: Logs survive device failure
  2. Firewall logging: See what’s being blocked
  3. Configuration backups: Automated, tested, off-device
  4. Health monitoring: Alert before users notice

Everything else builds on this foundation. Start simple, add complexity as needed. The goal isn’t comprehensive monitoring — it’s having the data you need when things break.