At an internet exchange, every member could peer with every other member directly. With 200 members that is 199 BGP sessions each (19,900 across the exchange) and a configuration nightmare. The route server solves it: each member peers once, with the route server, and gets routes from everyone else. It is the piece of infrastructure that makes an IXP scale.
The subtle part is that a route server is not a normal BGP router. It must not insert itself into the path, must not change next-hops, and must keep each member’s routes logically separate. BIRD handles all of this, but only if you configure it deliberately.
What Makes a Route Server Different
A transit router puts its own AS in the path and rewrites the next-hop to itself. A route server does neither:
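- Transparent AS_PATH: the route server’s AS does not appear in the path, so routes are not made one hop longer just because they passed through the RS.
- Next-hop unchanged: the next-hop stays the advertising member’s IP on the peering LAN, so traffic flows directly member-to-member, not through the RS.
- Per-client RIB: best-path selection is computed per member, so member A can be sent a different route than member B for the same prefix (because the announcing members’ export policies treat A and B differently).
BIRD’s rs client option turns on the first two. Per-client selection is not part of rs client: with a single shared table BIRD exports only the best route for each prefix, so it also needs either the secondary channel option or a separate table per member.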
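For the per-client part, one option is BIRD’s secondary channel option, which lets a session fall back to the next-best route when that client’s export filter rejects the best one. A minimal sketch, assuming a single shared IPv4 table; the table name rs4 and the placeholder filters are illustrative and not part of the configuration in this article, and the BIRD documentation notes that secondary requires a sorted table:

```
# Hypothetical shared table; "sorted" keeps routes ordered so that
# "secondary" can walk from the best route to the next candidates.
ipv4 table rs4 sorted;

protocol bgp member_64500 {
    local 192.0.2.10 as 65500;
    neighbor 192.0.2.1 as 64500;
    rs client;
    ipv4 {
        table rs4;
        secondary;      # try the next route if the best one is rejected on export
        import all;     # placeholder; real sessions need import filtering
        export all;     # placeholder for the per-member export policy
    };
}
```

The alternative is one table per member joined by pipes, which gives the same result at a higher memory cost.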
Base Configuration
BIRD 2 unifies IPv4/IPv6 into one daemon. Define the router id and the RPKI-backed ROA tables first:

    router id 192.0.2.10;

    roa4 table r4;
    roa6 table r6;

    protocol rpki validator {
        roa4 { table r4; };
        roa6 { table r6; };
        remote "127.0.0.1" port 3323;   # local RPKI validator over RTR
        retry keep 30;
    }

The Filtering Functions
Every route from a member is filtered before it enters the table and before it leaves to another member. The filters do four jobs: drop bogons, enforce a sane prefix length, check RPKI, and honor member control communities.

    function is_bogon_v4() {
        return net ~ [
            0.0.0.0/8+, 10.0.0.0/8+, 100.64.0.0/10+, 127.0.0.0/8+,
            169.254.0.0/16+, 172.16.0.0/12+, 192.0.2.0/24+,
            192.168.0.0/16+, 198.18.0.0/15+, 224.0.0.0/3+
        ];
    }

    function reasonable_len_v4() {
        return net.len >= 8 && net.len <= 24;
    }

    function rpki_ok() {
        case roa_check(r4, net, bgp_path.last_nonaggregated) {
            ROA_VALID:   return true;
            ROA_UNKNOWN: return true;    # accept unknown, reject only invalid
            ROA_INVALID: return false;
        }
    }

Rejecting RPKI-invalid and accepting unknown is the standard posture: you drop demonstrably wrong origins without blackholing the large share of space not yet covered by ROAs.
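The same checks carry over to IPv6, using the r6 table defined earlier. A sketch under assumptions: the function names, the short bogon list, and the /16 to /48 length bounds are illustrative choices rather than part of this setup, so adjust them to your exchange’s policy:

```
function is_bogon_v6() {
    return net ~ [
        ::/8+,              # unspecified, loopback, IPv4-mapped
        2001:db8::/32+,     # documentation
        fc00::/7+,          # unique local addresses
        fe80::/10+,         # link-local
        ff00::/8+           # multicast
    ];
}

function reasonable_len_v6() {
    return net.len >= 16 && net.len <= 48;   # assumed bounds
}

function rpki_ok_v6() {
    # same posture as rpki_ok(): reject only ROA_INVALID
    if roa_check(r6, net, bgp_path.last_nonaggregated) = ROA_INVALID then return false;
    return true;
}
```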
Member Control Communities
Members expect to steer their announcements: “send this prefix to everyone except AS64502,” or “do not announce this at all.” This is implemented with control communities, here BGP large communities keyed on the route server’s AS: rs-as:0:peer-as means do not export to that AS, rs-as:0:0 means export to nobody, and rs-as:1:peer-as means export only to the ASes listed that way.
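From the member’s side, steering is a matter of attaching the right large community on export toward the route server. A sketch, assuming the member also runs BIRD, the route server is AS65500 at 192.0.2.10, and the rs-as:0:peer-as form means “do not export to that AS”; the protocol name and the prefix are illustrative:

```
protocol bgp to_route_server {
    local 192.0.2.1 as 64500;
    neighbor 192.0.2.10 as 65500;
    ipv4 {
        import all;     # placeholder; filter what you accept in practice
        export filter {
            if net = 198.51.100.0/24 then {
                # ask the route server not to send this prefix to AS64502
                bgp_large_community.add((65500, 0, 64502));
                accept;
            }
            reject;
        };
    };
}
```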
On the route server, the export side checks those communities:

    define RS_ASN = 65500;

    function honor_control(int peer_as) {
        # 0:0 = announce to none
        if (65500, 0, 0) ~ bgp_large_community then return false;
        # 0:peer = do not announce to this peer
        if (65500, 0, peer_as) ~ bgp_large_community then return false;
        # if any selective-announce community is set, export only to listed peers
        # (wildcards are only valid inside a set, so the list is matched against a set)
        if bgp_large_community ~ [(65500, 1, *)] then {
            if (65500, 1, peer_as) !~ bgp_large_community then return false;
        }
        return true;
    }

A Member Session
Each member is an rs client. The import filter validates; the export filter applies control communities for that member’s AS:

    protocol bgp member_64500 {
        local 192.0.2.10 as RS_ASN;
        neighbor 192.0.2.1 as 64500;
        rs client;
        ipv4 {
            import keep filtered;   # retain rejected routes for "show route filtered"
            import filter {
                if is_bogon_v4() then reject;
                if !reasonable_len_v4() then reject;
                if !rpki_ok() then reject;
                # strip any control communities the member shouldn't set inbound
                accept;
            };
            export filter {
                if !honor_control(64500) then reject;
                accept;
            };
            import limit 200000 action restart;
        };
    }

import limit is not optional. A member that fat-fingers a redistribute and leaks the full table should hit a ceiling and have its session restarted, not flood every other member. import keep filtered is what makes show route filtered useful: without it BIRD discards rejected routes and the command shows nothing.
Verification

    # Sessions and prefix counts per member
    birdc show protocols

    # RPKI session to the validator established? (the protocol is named "validator")
    birdc show protocols all validator

    # What did a specific member send, and did it pass filters?
    birdc show route protocol member_64500
    birdc show route filtered protocol member_64500

    # Confirm AS_PATH does NOT contain the RS ASN on a received route
    birdc show route 198.51.100.0/24 all
    # bgp_path should be just the origin member's AS, no 65500

That last check is the one that proves the route server is transparent, so run it where the route has already left the route server: on a member’s router or a looking glass fed by a client session. The route server’s own table holds the path as it was received, before any export-time prepend. If 65500 shows up in the AS_PATH on the client side, rs client is missing somewhere and you have turned your route server into an accidental transit hop.
Operational Drills
| Test | Expected |
|---|---|
| Member announces a bogon | Rejected, visible in show route filtered |
| Member announces RPKI-invalid origin | Rejected |
| Member tags 0:0 on a prefix | Prefix announced to nobody |
| Member exceeds import limit | Session restarts, others unaffected |
| Validator (Routinator) restarts | RPKI session re-establishes, ROAs reload |
Next-Hop and the Third-Party Trap
The rule that a route server never rewrites the next-hop has a sharp edge: BIRD’s gateway resolution can still recursively resolve a next-hop that is not directly reachable on the peering LAN (gateway recursive is the default on multihop sessions), and a misbehaving member can advertise a next-hop pointing at another member’s IP. That is a third-party next-hop, and on a shared IXP fabric it lets one member silently push traffic for its prefixes through someone else’s port.
Pin the next-hop check explicitly rather than trusting defaults:

    protocol bgp member_64500 {
        local 192.0.2.10 as RS_ASN;
        neighbor 192.0.2.1 as 64500;
        rs client;
        ipv4 {
            next hop keep;          # preserve the member's next-hop, don't rewrite to self
            import keep filtered;   # retain rejected routes for "show route filtered"
            import filter {
                # reject any route whose next-hop is not the peer's own address
                if from != bgp_next_hop then reject;
                if is_bogon_v4() then reject;
                if !reasonable_len_v4() then reject;
                if !rpki_ok() then reject;
                accept;
            };
            export filter {
                if !honor_control(64500) then reject;
                accept;
            };
            import limit 200000 action restart;
        };
    }

if from != bgp_next_hop then reject is the guard: from is the session’s peer address, bgp_next_hop is what the route claims. If a member announces a prefix with someone else’s next-hop, the route is dropped before it can poison the table. On IXPs this single line closes a real attack surface.
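With hundreds of members, repeating that block by hand is where the guard gets forgotten. One way to keep it uniform is a shared import function plus a BGP template. A sketch, assuming a second member AS64501 at 192.0.2.2 (hypothetical) and relying on a protocol’s channel block adding to the one inherited from its template; the names member_import_v4 and rs_member are illustrative:

```
function member_import_v4() {
    if from != bgp_next_hop then return false;   # third-party next-hop guard
    if is_bogon_v4() then return false;
    if !reasonable_len_v4() then return false;
    if !rpki_ok() then return false;
    return true;
}

template bgp rs_member {
    local 192.0.2.10 as RS_ASN;
    rs client;
    ipv4 {
        next hop keep;
        import where member_import_v4();
        import limit 200000 action restart;
    };
}

protocol bgp member_64501 from rs_member {
    neighbor 192.0.2.2 as 64501;
    ipv4 {
        export where honor_control(64501);   # only the per-member part differs
    };
}
```

Before rolling this out, birdc show protocols all member_64501 is a quick way to confirm the inherited options took effect.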
Scaling: Sessions, RAM, and Split Daemons
A 500-member exchange means 500 sessions and potentially several million paths once you count every member’s full prefix set times the per-client export computation. Two operational facts decide whether BIRD copes.
First, run IPv4 and IPv6 in separate BIRD instances, not one daemon with both channels. BIRD 2 is single-threaded per process; splitting v4 and v6 gives you two cores’ worth of convergence and means an IPv6 reconfigure does not stall v4 sessions.

    # Separate config + control socket per address family
    bird -c /etc/bird/bird4.conf -s /run/bird/bird4.ctl
    bird -c /etc/bird/bird6.conf -s /run/bird/bird6.ctl

    birdc -s /run/bird/bird4.ctl show memory
    # BGP attributes / route tables dominate; watch "Total" against box RAM

Second, setting interpret communities off and keeping per-client filters lean matters at scale, because every export filter runs per prefix per client. Watch reconfigure time, because that is when a route server hurts:

    birdc -s /run/bird/bird4.ctl configure
    # "Reconfigured" should return in well under a second
    # If it takes seconds, your filters are doing too much per route; precompute with prefix sets

    birdc -s /run/bird/bird4.ctl show protocols | grep -c BGP   # session count sanity

A route server that takes ten seconds to reconfigure is a route server you become afraid to touch during business hours, which is how stale filters accumulate. Keep the per-route work minimal and add members in batches.
The Ticket You Will Get Most Often
“My prefix is not showing up” lands in the queue daily, and 90% of the time the answer is in show route filtered — but the other 10% is the export side, and members never think to check there. A prefix can pass every import filter, sit healthy in the route server’s table, and still not reach a given member because that member’s export filter dropped it. The common cause is a control community the announcing member set without realizing its reach.
Trace it from both ends:

    # Did it pass import? (the announcing member's session)
    birdc show route filtered protocol member_64500
    # absent here = it was accepted on import, look at export next

    # Is it in the master table at all?
    birdc show route 198.51.100.0/24 all
    # check the large communities attached: (65500, 0, X) or (65500, 1, X)?

    # Would it be exported toward the complaining member's session?
    birdc show route 198.51.100.0/24 export member_64502
    # empty = honor_control() rejected it for AS64502 specifically

show route <prefix> export <protocol> is the command that ends the argument: it runs the prefix through that member’s export filter and shows exactly what they would receive. If it is empty while the master table has the route, the announcing member tagged a 0:peer-as or selective-announce community that excludes the complainant. Point them at their own community, not at your route server.
What I Run Beside It
A route server is not a replacement for members building proper bilateral sessions for their most important traffic; it is the easy 95%. Pair BIRD with a looking glass and per-member traffic stats, and publish your filtering policy so members know exactly why a prefix was dropped. The single biggest source of IXP support tickets is “why isn’t my prefix showing up,” and birdc show route filtered usually answers it in one line.