Your router is infrastructure. It deserves the same observability as any production system. When something breaks at 3 AM, “I don’t know what happened” isn’t an acceptable answer. Logs, metrics, and configuration backups turn mysterious failures into diagnosable incidents.
This guide covers practical observability for VyOS — what to capture, where to store it, and how to use it when things go wrong.
The Logging Strategy
What to Log
Not all logs are equal. High-value logs for troubleshooting:
| Log Type | Why It Matters |
|---|---|
| Firewall drops | Shows blocked traffic, attack attempts, misconfigurations |
| Interface state changes | Link up/down events, carrier changes |
| BGP/routing changes | Route flaps, peer state changes |
| Authentication | SSH login attempts, successful and failed |
| Configuration changes | Who changed what, when |
| System errors | Kernel messages, service failures |
Basic Logging Setup
    configure

    # Enable system logging
    set system syslog global facility all level 'info'
    set system syslog global facility protocols level 'debug'

    # Log to local file
    set system syslog file messages facility all level 'notice'
    set system syslog file auth facility auth level 'info'
    set system syslog file firewall facility all level 'debug'

    commit

Remote Logging (Recommended)
Local logs disappear when the router dies. Send logs to a remote syslog server:
    configure

    # Remote syslog server
    set system syslog host 10.0.0.100 facility all level 'info'
    set system syslog host 10.0.0.100 protocol 'udp'
    set system syslog host 10.0.0.100 port '514'

    # TCP to port 6514, the conventional syslog-over-TLS port (e.g. rsyslog with TLS).
    # These two commands only select the transport and port; they do not enable TLS by themselves.
    set system syslog host logs.example.com protocol 'tcp'
    set system syslog host logs.example.com port '6514'

    commit

Popular syslog receivers:
- rsyslog: Standard Linux syslog daemon
- Graylog: Full log management platform
- Loki: Lightweight, Prometheus-style logs
- Vector: Modern log aggregation
Firewall Logging
Firewall logs are crucial. Log all drops, and selectively log accepts:
    configure

    # Log dropped packets
    set firewall ipv4 name WAN-TO-LAN default-action 'drop'
    set firewall ipv4 name WAN-TO-LAN default-log

    # Log specific rules
    set firewall ipv4 name WAN-TO-LAN rule 100 action 'drop'
    set firewall ipv4 name WAN-TO-LAN rule 100 log
    set firewall ipv4 name WAN-TO-LAN rule 100 description 'Log and drop invalid'
    set firewall ipv4 name WAN-TO-LAN rule 100 state 'invalid'

    # Log successful SSH (for audit)
    set firewall ipv4 name LAN-LOCAL rule 50 log
    set firewall ipv4 name LAN-LOCAL rule 50 description 'Log SSH access'

    commit

Warning: Don’t enable logging on every accept rule. Logging on high-traffic rules can overwhelm storage and CPU.
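To see which rules are actually producing the log volume, one option is to count log lines per rule prefix. The following is a minimal sketch, not a VyOS feature: `top_log_prefixes` is a hypothetical helper, and it assumes each firewall log line carries a bracketed prefix ending in `-A`, `-D`, or `-R`. Check the prefix format on your release before relying on it.

```shell
#!/bin/bash
# top_log_prefixes [N]: read log lines on stdin and print the N most frequent
# bracketed rule prefixes with their counts, busiest first.
# Assumption: prefixes look like [NAME-RULE-D], ending in -A, -D, or -R.
top_log_prefixes() {
    local n="${1:-5}"
    grep -o '\[[^] ]*-[ADR]\]' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "$n" \
        | awk '{print $2, $1}'
}
```

For example, `top_log_prefixes 10 < /var/log/messages` (the log path is an assumption) lists the ten noisiest prefixes, which is a reasonable starting point for deciding where to turn logging off.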
Reading Logs
    # Recent logs
    show log

    # Filtered logs
    show log | match firewall
    show log | match -i error

    # Tail logs in real time
    monitor log

    # Specific log file
    show log file firewall

Metrics and Monitoring
SNMP for Traditional Monitoring
If you have Zabbix, PRTG, LibreNMS, or similar:
    configure

    # SNMP v2c (simple but less secure)
    set service snmp community public authorization 'ro'
    set service snmp community public network '10.0.0.0/24'
    set service snmp listen-address 10.0.0.1

    # SNMP v3 (recommended)
    # plaintext-password takes the passphrase itself; encrypted-password expects an already-hashed key
    set service snmp v3 user monitor auth plaintext-password 'authpassword'
    set service snmp v3 user monitor auth type 'sha'
    set service snmp v3 user monitor privacy plaintext-password 'privpassword'
    set service snmp v3 user monitor privacy type 'aes'
    set service snmp v3 user monitor group 'monitor-group'

    set service snmp v3 group monitor-group mode 'ro'
    set service snmp v3 group monitor-group view 'monitor-view'
    set service snmp v3 view monitor-view oid '.1'

    commit

Prometheus/Node Exporter
For modern monitoring stacks:
    # If your VyOS release has no native Prometheus exporter, you can:
    # 1. Install node_exporter via container
    # 2. Use SNMP exporter with Prometheus
    # 3. Script custom metrics export

    # Example: expose metrics via simple script
    # Create /config/scripts/metrics.sh

A simple metrics approach:

    #!/bin/bash
    # /config/scripts/metrics.sh - run via cron or http server

    echo "# HELP vyos_interface_rx_bytes Interface received bytes"
    echo "# TYPE vyos_interface_rx_bytes counter"
    for iface in eth0 eth1 eth2; do
        rx=$(cat "/sys/class/net/$iface/statistics/rx_bytes" 2>/dev/null || echo 0)
        echo "vyos_interface_rx_bytes{interface=\"$iface\"} $rx"
    done

    echo "# HELP vyos_firewall_dropped Firewall dropped packets"
    echo "# TYPE vyos_firewall_dropped counter"
    # Parse from the firewall counters: iptables -L -v -n on iptables-based releases,
    # nft list ruleset on releases that use nftables

Health Checks
Monitor critical functions:
    # Interface status
    show interfaces

    # Routing table
    show ip route

    # Firewall counters
    show firewall

    # VPN status
    show vpn ipsec sa
    show wireguard peers

    # System resources
    show system memory
    show system cpu
    show system storage

Automate these checks and alert on anomalies.
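One way to automate checks like these from a script is to read the same data from the Linux kernel interfaces underneath VyOS. The following is a minimal sketch under that assumption; `mem_used_percent` and `iface_state` are hypothetical helper names, and the optional path arguments exist only so the functions can be pointed at test data.

```shell
#!/bin/bash
# mem_used_percent [FILE]: integer percentage of memory in use, computed from
# the MemTotal and MemAvailable fields of a /proc/meminfo-style file.
mem_used_percent() {
    local file="${1:-/proc/meminfo}"
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END {
             if (total > 0) printf "%d\n", (total - avail) * 100 / total
             else exit 1
         }' "$file"
}

# iface_state IFACE [SYSFS_ROOT]: print the kernel operstate of an interface
# ("up", "down", ...), or "missing" if the interface does not exist.
iface_state() {
    local iface="$1" root="${2:-/sys/class/net}"
    cat "$root/$iface/operstate" 2>/dev/null || echo "missing"
}
```

A scheduled script could call `iface_state eth0` and `mem_used_percent` every few minutes and raise an alert when the state is not `up` or the percentage crosses a threshold.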
Configuration Backup
Manual Backup
    # Show configuration (can be piped to file)
    show configuration commands

    # Save to file
    show configuration commands > /config/backup/config-$(date +%Y%m%d).txt

    # Compare configuration files
    diff /config/backup/config-old.txt /config/backup/config-new.txt

Automated Backup Script
    #!/bin/bash
    BACKUP_DIR="/config/backup"
    DATE=$(date +%Y%m%d-%H%M)
    KEEP_DAYS=30

    # Make sure the backup directory exists
    mkdir -p "$BACKUP_DIR"

    # Create backup
    /opt/vyatta/bin/vyatta-op-cmd-wrapper show configuration commands > "$BACKUP_DIR/vyos-config-$DATE.txt"

    # Clean old backups
    find "$BACKUP_DIR" -name "vyos-config-*.txt" -mtime +$KEEP_DAYS -delete

    # Optional: copy to remote server
    # scp "$BACKUP_DIR/vyos-config-$DATE.txt" backup@server:/backups/vyos/

Schedule it:

    set system task-scheduler task backup-config executable path '/config/scripts/backup-config.sh'
    set system task-scheduler task backup-config interval '1d'

Remote Backup
Send backups off-device:
    #!/bin/bash
    # Backup to remote server via SCP

    REMOTE_USER="backup"
    REMOTE_HOST="10.0.0.100"
    REMOTE_PATH="/backups/vyos"

    CONFIG_FILE="/tmp/vyos-config-$(date +%Y%m%d).txt"

    # Generate config
    /opt/vyatta/bin/vyatta-op-cmd-wrapper show configuration commands > "$CONFIG_FILE"

    # Send to remote
    scp -i /config/auth/backup_key "$CONFIG_FILE" "${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}/"

    # Cleanup
    rm "$CONFIG_FILE"

Git-Based Configuration Management
For an infrastructure-as-code approach:

    #!/bin/bash
    cd /config || exit 1
    git add -A
    git commit -m "Config backup $(date +%Y%m%d-%H%M)"
    git push origin main

Initialize git in /config:

    cd /config
    git init
    # /config/auth holds private keys; exclude it before the first commit
    echo "auth/" >> .gitignore
    git remote add origin git@github.com:yourorg/vyos-config.git

This gives you:
- Full version history
- Diff between any versions
- Blame to see who changed what
- Rollback to any point
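A scheduled `git commit` exits with an error when nothing has changed, which adds noise to the job's output. One way around that is to commit only when the staging area differs from the last commit. The following is a minimal sketch; `commit_if_changed` is a hypothetical helper name, and it assumes git is installed on the router and the repository is already initialized.

```shell
#!/bin/bash
# commit_if_changed REPO_DIR: stage everything in REPO_DIR and commit only
# when the staged content differs from the last commit.
# Prints "committed" or "unchanged".
commit_if_changed() {
    local repo="$1"
    git -C "$repo" add -A || return 1
    # `git diff --cached --quiet` exits 0 when nothing is staged
    if git -C "$repo" diff --cached --quiet; then
        echo "unchanged"
        return 0
    fi
    git -C "$repo" commit -q -m "Config backup $(date +%Y%m%d-%H%M)" || return 1
    echo "committed"
}
```

A backup job could then push only when the function prints `committed`, so the history contains one commit per real configuration change.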
Configuration Diff
Always diff before committing changes:
    configure

    # Make some changes
    set interfaces ethernet eth0 description 'NEW-WAN'

    # See what would change
    compare

    # Discard if wrong
    discard

    # Or commit if correct
    commit

For historical comparison:

    # Compare running config with saved boot config (in configure mode)
    configure
    compare saved
    exit

    # Compare two backup files
    diff /config/backup/config-old.txt /config/backup/config-new.txt

Alerting
Simple Email Alerts
    #!/bin/bash
    # /config/scripts/alert.sh
    SUBJECT="$1"
    MESSAGE="$2"
    RECIPIENT="admin@example.com"

    echo "$MESSAGE" | mail -s "$SUBJECT" "$RECIPIENT"

Integrate with monitoring:

    #!/bin/bash
    if ! ping -c 3 -W 5 8.8.8.8 > /dev/null 2>&1; then
        /config/scripts/alert.sh "WAN DOWN" "Primary WAN unreachable at $(date)"
    fi

Webhook Alerts (Slack, Discord, PagerDuty)

    #!/bin/bash
    WEBHOOK_URL="https://hooks.slack.com/services/xxx"
    MESSAGE="$1"

    # Escape backslashes and double quotes so the payload stays valid JSON
    ESCAPED=${MESSAGE//\\/\\\\}
    ESCAPED=${ESCAPED//\"/\\\"}

    curl -X POST -H 'Content-type: application/json' \
        --data "{\"text\":\"$ESCAPED\"}" \
        "$WEBHOOK_URL"

What to Monitor
Essential metrics for router health:
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| CPU usage | > 70% | > 90% |
| Memory usage | > 80% | > 95% |
| Interface errors | > 0.1% | > 1% |
| Firewall drops/sec | Depends on baseline | Sudden 10x increase |
| BGP peer state | Any change | Down |
| VPN tunnel state | Flapping | Down |
| Disk usage | > 80% | > 95% |
| Config changes | Any | Unexpected |
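The percentage thresholds in this table map directly onto a small classification helper. The following is a minimal sketch: `classify` and `disk_used_percent` are hypothetical helper names, readings are assumed to be whole-number percentages, and "greater than" is taken strictly, as in the table.

```shell
#!/bin/bash
# classify VALUE WARN CRIT: map an integer reading to OK, WARNING, or CRITICAL.
# A reading must be strictly greater than a threshold to trigger it.
classify() {
    local value="$1" warn="$2" crit="$3"
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# disk_used_percent [MOUNT]: integer usage of a filesystem, read from the
# capacity column of POSIX-format df output.
disk_used_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}
```

For disk usage, `classify "$(disk_used_percent /)" 80 95` yields the level, and a scheduled script could pass anything other than `OK` to `/config/scripts/alert.sh`.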
Disaster Recovery Checklist
When everything fails, you need:
- Configuration backup (tested restore)
- Firmware/image backup (same VyOS version)
- Documented procedure (how to restore)
- Out-of-band access (console, IPMI, if available)
Test Your Backups
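    # Periodically test restore on a test instance.
    # Note: load expects the brace-style config format (as in /config/config.boot).
    # Backups made with show configuration commands are lists of set commands;
    # replay those in configure mode (for example by pasting them) instead of using load.
    configure
    load /config/backup/vyos-config-latest.txt
    compare
    # Review changes
    commit

A backup you’ve never tested restoring is not a backup.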
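A cheap sanity check between full restore tests is to confirm that each backup file is non-empty and contains only `set` commands, which is what `show configuration commands` prints. The following is a minimal sketch; `validate_backup` is a hypothetical helper name, and passing this check does not replace an actual restore test.

```shell
#!/bin/bash
# validate_backup FILE: succeed only if FILE is non-empty and every non-blank,
# non-comment line is a `set` command.
validate_backup() {
    local file="$1"
    [ -s "$file" ] || { echo "empty or missing: $file" >&2; return 1; }
    # Drop blank lines and comments, then look for anything that is not a set command
    if grep -v -E '^[[:space:]]*(#.*)?$' "$file" | grep -q -v -E '^set '; then
        echo "unexpected line in: $file" >&2
        return 1
    fi
    # Require at least one set command
    grep -q -E '^set ' "$file"
}
```

Running this right after the backup is written catches the common failure where the file contains an error message or nothing at all instead of a configuration.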
Complete Observability Setup
    # === Syslog ===
    set system syslog global facility all level 'info'
    set system syslog host 10.0.0.100 facility all level 'info'
    set system syslog host 10.0.0.100 protocol 'udp'
    set system syslog file messages facility all level 'notice'

    # === SNMP ===
    set service snmp community monitoring authorization 'ro'
    set service snmp community monitoring network '10.0.0.0/24'
    set service snmp listen-address 10.0.0.1
    set service snmp location 'Network Closet'
    set service snmp contact 'admin@example.com'

    # === Firewall Logging ===
    set firewall ipv4 name WAN-TO-LAN default-log
    set firewall ipv4 name WAN-LOCAL default-log

    # === Scheduled Tasks ===
    set system task-scheduler task backup-config executable path '/config/scripts/backup-config.sh'
    set system task-scheduler task backup-config interval '1d'
    set system task-scheduler task wan-monitor executable path '/config/scripts/wan-monitor.sh'
    set system task-scheduler task wan-monitor interval '5m'

The Lesson
A router without observability is like running production without monitoring — you’ll only know something’s wrong when users complain, and you’ll have no data to diagnose it.
The minimum viable observability:
- Remote syslog: Logs survive device failure
- Firewall logging: See what’s being blocked
- Configuration backups: Automated, tested, off-device
- Health monitoring: Alert before users notice
Everything else builds on this foundation. Start simple, add complexity as needed. The goal isn’t comprehensive monitoring — it’s having the data you need when things break.