Proxmox SDN: Zones, VNets, and EVPN in the Hypervisor

The traditional Proxmox network is a pile of vmbrN bridges defined per node, VLAN tags typed into each VM’s NIC, and a prayer that node 3 was configured the same as node 1. It works at three nodes and falls apart at thirty. SDN replaces it with a model you define once at the datacenter level and Proxmox pushes to every node — and at the high end, it gives you EVPN overlays so VMs can live on routed segments that span the whole cluster.

This is the piece of Proxmox most people never turn on, and it is the one that makes the cluster’s networking actually manageable.

The Three Objects

SDN is built from a small hierarchy, all defined cluster-wide under Datacenter → SDN:

  • Zone — the transport mechanism. How traffic moves between nodes: plain bridge (Simple), VLAN, QinQ, VXLAN, or EVPN.
  • VNet — a virtual network inside a zone. This is what a VM actually attaches to, replacing the manual bridge+tag.
  • Subnet — IP configuration attached to a VNet: CIDR, gateway, optional DHCP/IPAM.

You attach VMs to VNets. The zone decides how that VNet is realized across nodes. Put the VNet behind a zone of a different type and the same abstraction is carried by VLANs or VXLAN instead; the VM's NIC only names the VNet, so the VMs are not touched.

Zone Types, Shortest Useful Description

Zone    What it does                           When
Simple  Local bridge + IPAM, no cross-node L2  Isolated per-node nets
VLAN    Tags VNets onto an existing trunk      You already run VLANs
QinQ    Stacked VLAN tags (S-VLAN/C-VLAN)      Service-provider style isolation
VXLAN   L2 over L3 between nodes, no routing   Stretch L2 without a router
EVPN    VXLAN + BGP control plane + routing    Routed overlays, the real SDN

A VLAN Zone (the common starting point)

If you already have a VLAN trunk to each node, this replaces per-VM tagging. Define the zone over the trunk bridge, then VNets carry tags:

# /etc/pve/sdn/zones.cfg
vlan: vlanzone
    bridge vmbr0
    ipam pve

# /etc/pve/sdn/vnets.cfg
vnet: web
    zone vlanzone
    tag 20

vnet: db
    zone vlanzone
    tag 30

VMs attach to VNet web instead of “vmbr0 with tag 20 typed in by hand.” Add a node to the cluster and the VNets are already defined for it — no per-node bridge surgery.

Nothing is live until you apply, which generates the actual interface config on every node:

# After editing in the GUI or the cfg files
pvesh set /cluster/sdn # apply pending SDN config cluster-wide
# or from the GUI: Datacenter -> SDN -> Apply
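
The same zone and VNets can also be created through the API with pvesh rather than by editing the files. The sketch below walks the whole sequence (define, apply, attach a VM) as a dry run: by default it only prints the commands. The DRY_RUN switch, the helper names, and VM ID 100 are illustrative assumptions, not part of Proxmox.

```shell
#!/usr/bin/env bash
# Dry-run sketch: define the VLAN zone and VNets, apply, attach a VM.
# DRY_RUN, VMID and the function names are illustrative.
# Set DRY_RUN=0 only on a real Proxmox node.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"      # print the command instead of executing it
  else
    "$@"
  fi
}

define_vlan_zone() {
  run pvesh create /cluster/sdn/zones --type vlan --zone vlanzone --bridge vmbr0 --ipam pve
  run pvesh create /cluster/sdn/vnets --vnet web --zone vlanzone --tag 20
  run pvesh create /cluster/sdn/vnets --vnet db --zone vlanzone --tag 30
}

apply_sdn() {
  run pvesh set /cluster/sdn    # nothing is live until this runs
}

attach_vm() {
  # $1 = VM ID, $2 = VNet ID. The VNet is the NIC's bridge; no tag is given.
  # Note: this replaces net0 wholesale, so a new MAC is generated.
  run qm set "$1" --net0 "virtio,bridge=$2"
}

define_vlan_zone
apply_sdn
attach_vm "${VMID:-100}" web
```

Printing first and executing second keeps the sequence reviewable before anything changes on the cluster.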

The EVPN Zone: Routed Overlays

This is where Proxmox SDN earns the name. An EVPN zone runs a BGP EVPN control plane (via FRR, which Proxmox manages for you) so VNets become VXLAN segments with a distributed anycast gateway — a VM keeps its gateway after migrating to any node.

First a controller defines the BGP fabric — the Proxmox nodes as leaves talking to your spines/route-reflectors:

# /etc/pve/sdn/controllers.cfg
# peers: your route reflectors / spines
evpn: evpnctl
    asn 65000
    peers 10.0.0.1,10.0.0.2

Then the EVPN zone references it, with a VRF VXLAN tag for the L3 routing:

# /etc/pve/sdn/zones.cfg
# exitnodes: the nodes that route overlay <-> outside
evpn: evpnzone
    controller evpnctl
    vrf-vxlan 10000
    mtu 1450
    ipam pve
    exitnodes node1,node2

VNets in the zone get their own VXLAN tag and a subnet with a gateway, and that gateway is anycast across every node. VNet IDs are limited to 8 alphanumeric characters, so the tenant's web network gets a short name:

# /etc/pve/sdn/vnets.cfg
vnet: taweb
    zone evpnzone
    tag 10010

# /etc/pve/sdn/subnets.cfg
subnet: evpnzone-10.50.10.0-24
    vnet taweb
    gateway 10.50.10.1
    snat 0
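
The subnet section ID in the example above is the zone ID followed by the CIDR with the slash turned into a dash. A minimal sketch of that derivation, useful when generating subnets.cfg entries from a list of prefixes (the helper name is illustrative):

```shell
#!/usr/bin/env bash
# Derive a subnets.cfg section ID from a zone ID and a CIDR.
# subnet_id is an illustrative helper, not a Proxmox command.

subnet_id() {
  local zone="$1" cidr="$2"
  # 10.50.10.0/24 -> 10.50.10.0-24, then prefix with the zone ID
  printf '%s-%s\n' "$zone" "${cidr//\//-}"
}

subnet_id evpnzone 10.50.10.0/24
```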

Apply, and Proxmox configures FRR on every node, brings up the VXLAN interfaces, and advertises the VM routes via EVPN Type-2/Type-5 — the same machinery as a hardware fabric, running inside the hypervisors. Exit nodes are where the overlay meets the physical network; traffic leaving the tenant subnet routes out through them.

Verifying

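# Per-zone status on this node: available, pending, or error
pvesh get /nodes/$(hostname)/sdn/zones
# Any VNet changes still waiting for an apply?
pvesh get /cluster/sdn/vnets --pending 1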
# The generated interfaces on a node
ip -br link | grep -E 'vnet|vxlan|vrf'
# EVPN control plane — Proxmox runs FRR, so use vtysh
vtysh -c "show bgp l2vpn evpn summary"
vtysh -c "show evpn vni"
vtysh -c "show evpn mac vni all"

The most common failure is the underlay: the EVPN peers must be reachable from every node’s loopback before EVPN comes up. If show bgp l2vpn evpn summary shows sessions stuck in Active/Connect, the BGP underlay to your spines is the problem, not SDN.

When the Overlay Goes Dark: Troubleshooting EVPN

EVPN failures are almost never the SDN config — they are the underlay or the FRR state underneath it. Work bottom-up.

First, is the BGP EVPN session actually up? A node that cannot reach its peers’ loopbacks shows sessions stuck before Established:

vtysh -c "show bgp l2vpn evpn summary"
# State/PfxRcd of "Active" or "Connect" = no session.
# A number (received prefixes) = the session is up and exchanging routes.
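
On a cluster with many peers, that rule can be applied mechanically. The filter below is a rough sketch that assumes the classic FRR summary layout, with the neighbor address in column 1 and State/PfxRcd in column 10; check the header on your FRR version before relying on it. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Print EVPN peers whose session is not Established.
# Assumption: neighbor IPv4 address in column 1, State/PfxRcd in column 10.

down_peers() {
  # Keep only rows that start with an IPv4 address, then flag any row whose
  # State/PfxRcd field is a state name instead of a prefix count.
  awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $10 !~ /^[0-9]+$/ { print $1, $10 }'
}

# On a node:
#   vtysh -c "show bgp l2vpn evpn summary" | down_peers
```

Empty output means every listed peer is exchanging routes.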

If sessions are up but a VM cannot reach another node’s VM on the same VNet, check that the VNI is learned and the remote MAC is present:

# Is the VNI up, and is it type L2 with the right VXLAN id?
vtysh -c "show evpn vni detail"
# Is the remote VM's MAC learned via EVPN (Type-2) on this VNI?
vtysh -c "show evpn mac vni 10010"
# Remote MACs should show a remote VTEP IP as their next-hop.

A MAC that never appears remotely means Type-2 routes are not propagating, most often because of a route-target mismatch. MTU is the other usual suspect, but it looks different: MACs are learned and pings work, yet large flows hang, because full-size VXLAN-encapsulated frames are silently dropped. The mtu 1450 in the zone is the inner MTU; the physical underlay must carry at least 50 bytes more for the encapsulation (outer Ethernet, IPv4, UDP, and VXLAN headers together, not the VXLAN header alone). If your underlay links are 1500, the overlay must stay at 1450 or lower, or you get silent path-MTU blackholes.
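
The arithmetic is worth having on hand when testing. This sketch assumes an IPv4 underlay, where the 50 bytes break down as outer Ethernet (14) + IPv4 (20) + UDP (8) + VXLAN (8); the function names are illustrative.

```shell
#!/usr/bin/env bash
# VXLAN MTU arithmetic for an IPv4 underlay.
VXLAN_OVERHEAD=50   # outer Ethernet 14 + IPv4 20 + UDP 8 + VXLAN 8
ICMP_OVERHEAD=28    # IPv4 header 20 + ICMP header 8

overlay_mtu()  { echo $(( $1 - VXLAN_OVERHEAD )); }   # largest safe inner MTU
ping_payload() { echo $(( $1 - ICMP_OVERHEAD )); }    # ICMP payload that fills an MTU

overlay_mtu 1500      # prints 1450
ping_payload 1450     # prints 1422

# From inside a VM, a don't-fragment ping sized to the overlay MTU:
#   ping -M do -s 1422 <remote VM on the same VNet>
```

If the full-size ping fails while a default-size ping succeeds, the underlay is not carrying the encapsulated frame.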

For routed traffic leaving the tenant subnet, confirm the Type-5 (prefix) routes and that the exit nodes are advertising them:

vtysh -c "show bgp l2vpn evpn route type prefix"
ip -br link | grep -E 'vrf|vxlan' # the VRF and per-VNI vxlan devices exist

A VLAN-Aware Migration, Without the Outage

Moving an existing hand-built VLAN setup onto SDN is the common real-world job, and the trap is renumbering everything at once. Do it VNet by VNet. The SDN VLAN zone tags onto the same trunk your vmbrN already uses, so a VNet with tag 20 is electrically identical to a VM port set to “vmbr0, VLAN 20” — the wire does not know the difference.

# /etc/pve/sdn/zones.cfg
vlan: trunk
    bridge vmbr0

# /etc/pve/sdn/vnets.cfg (mirror an existing in-use VLAN first)
vnet: legacy20
    zone trunk
    tag 20

Apply, then move VMs one at a time from the manual vmbr0/tag 20 NIC to the legacy20 VNet. (Zone and VNet IDs are capped at 8 alphanumeric characters, hence the short names.) Each VM keeps its IP and gateway; you are only changing how the tag is administered, not the L2 domain. Validate connectivity per VM, and roll back a single NIC if anything is off, rather than reverting the whole cluster.

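# Confirm the zone is applied and healthy on this node before moving VMs
pvesh get /nodes/$(hostname)/sdn/zones
# Each VNet appears as a bridge named after its ID
ip -br link show type bridge
# Per-VM firewall links, present only for running VMs with the NIC firewall enabled
ip -br link | grep fwln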
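
The per-VM move itself is a one-field edit of the NIC definition. A minimal sketch, assuming the NIC is net0 and was configured with bridge= and tag= options; the function name and the VMID and VNET placeholders are illustrative:

```shell
#!/usr/bin/env bash
# Rewrite a VM's net0 value so it points at a VNet instead of bridge+tag.

rewrite_net0() {
  local net0="$1" vnet="$2"
  # Swap the bridge and drop the manual tag; model, MAC and other options stay.
  printf '%s\n' "$net0" |
    sed -E -e "s/(^|,)bridge=[^,]*/\1bridge=${vnet}/" -e 's/,tag=[0-9]+//'
}

# On a node, for one VM:
#   old="$(qm config "$VMID" | sed -n 's/^net0: //p')"
#   qm set "$VMID" --net0 "$(rewrite_net0 "$old" "$VNET")"
# Rollback for that single NIC:
#   qm set "$VMID" --net0 "$old"
```

Keeping the old value around is what makes the single-NIC rollback cheap, and preserving the MAC means the guest sees the same interface before and after.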

Where to Stop

Not everyone needs EVPN. The decision is simple if you are honest about what you need:

  • A few nodes, flat networks → Simple or VLAN zone. SDN still helps by making the config cluster-wide and declarative.
  • Stretch one or two L2 segments across nodes without a router → VXLAN.
  • Multi-tenant, routed overlays, gateway-follows-the-VM, integration with a hardware EVPN fabric → EVPN.

The trap is reaching for EVPN because it is the most powerful, then operating a BGP fabric you did not need. Use the simplest zone that solves the problem. But whichever zone you pick, define it in SDN rather than hand-editing /etc/network/interfaces per node — the whole point is that the cluster’s network is one declarative model, applied everywhere, instead of thirty bridges you hope still match.