RPKI origin validation is only as good as the validator behind it. Configure invalid reject on a router pointed at a validator that crashed last week, and you are now dropping prefixes based on stale data — or worse, treating everything as unknown and validating nothing. The router config is the easy half. The half that needs care is running the validator and rolling out enforcement without cutting off reachability.
The Pieces

    RPKI repositories (RIRs)
            │  rsync / RRDP
            ▼
      [ Routinator ]    validates, builds VRPs
            │  RTR (TCP 3323)
            ▼
      [ your routers ]  roa_check on every BGP path

Routinator fetches and cryptographically validates ROAs from the five RIR trust anchors, producing a set of Validated ROA Payloads (VRPs). Routers pull those over the RTR protocol and check each BGP route’s origin against them.
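To make the per-route check concrete, here is a minimal Python sketch of the origin validation decision in the style of RFC 6811. The `VRP` tuple and `validate_origin` function are illustrative names, not part of Routinator or any router. The sketch also ignores edge cases such as AS 0 ROAs and AS_SET origins.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """Hypothetical in-memory form of one Validated ROA Payload."""
    prefix: str      # e.g. "192.0.2.0/24"
    max_length: int  # longest prefix length the ROA authorizes
    asn: int         # authorized origin AS


def validate_origin(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Return "valid", "invalid" or "unknown" for a single route."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # A VRP covers the route if it is in the same address family and
        # the route is equal to or more specific than the VRP prefix.
        if vrp_net.version != route.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # A covering VRP matches if the origin AS agrees and the route
        # is not longer than max_length.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    # Covered but never matched: invalid. Not covered at all: unknown.
    return "invalid" if covered else "unknown"


vrps = [VRP("192.0.2.0/24", 24, 64500), VRP("2001:db8::/32", 48, 64500)]
print(validate_origin("192.0.2.0/24", 64500, vrps))     # valid
print(validate_origin("192.0.2.0/25", 64500, vrps))     # invalid (too specific)
print(validate_origin("192.0.2.0/24", 64501, vrps))     # invalid (wrong origin)
print(validate_origin("198.51.100.0/24", 64500, vrps))  # unknown
```

The second case is the max-length trap: the origin is right, but the announcement is more specific than the ROA allows, so the route is invalid rather than unknown.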
Running Routinator
Initialize the trust anchor locators and accept the ARIN TAL terms, then start the server with RTR enabled:

    routinator init --accept-arin-rpa
    routinator server \
      --rtr 0.0.0.0:3323 \
      --http 127.0.0.1:8323 \
      --refresh 600

--refresh 600 re-validates every 10 minutes. The HTTP endpoint gives you metrics and a status page; keep it bound to localhost or behind auth.
Before pointing routers at it, confirm it actually built VRPs:

    # Count of valid ROA payloads: should be hundreds of thousands
    routinator vrps | wc -l

    # Check a known prefix/origin
    routinator validate --asn AS15169 --prefix 8.8.8.0/24
    # -> Valid

If vrps returns nothing, the initial fetch has not completed or rsync/RRDP is blocked outbound. Do not connect routers to an empty validator: every route would be unknown and you would learn nothing.
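The "is the VRP set plausibly complete" check can be automated before routers are connected. The sketch below counts VRPs per address family from CSV text. It assumes the CSV carries an `IP Prefix` column, as Routinator's default `vrps` output is believed to. The thresholds and function names are placeholders to adapt, not recommended values.

```python
import csv
import io


def count_vrps(csv_text: str) -> dict:
    """Count IPv4 and IPv6 rows in CSV text that has an 'IP Prefix' column."""
    counts = {"ipv4": 0, "ipv6": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # IPv6 prefixes contain ':'; IPv4 prefixes never do.
        family = "ipv6" if ":" in row["IP Prefix"] else "ipv4"
        counts[family] += 1
    return counts


def safe_to_connect_routers(counts: dict,
                            min_ipv4: int = 100_000,
                            min_ipv6: int = 10_000) -> bool:
    """Hypothetical gate: refuse to go live on an empty or tiny VRP set."""
    return counts["ipv4"] >= min_ipv4 and counts["ipv6"] >= min_ipv6


# Tiny sample in the assumed CSV layout (header line plus two VRPs).
sample = """ASN,IP Prefix,Max Length,Trust Anchor
AS64500,192.0.2.0/24,24,ripe
AS64500,2001:db8::/32,48,ripe
"""
counts = count_vrps(sample)
print(counts)                           # {'ipv4': 1, 'ipv6': 1}
print(safe_to_connect_routers(counts))  # False
```

In practice the CSV text would come from capturing the output of `routinator vrps`; a gate like this is most useful in the deployment script that adds the validator to the routers' RTR configuration.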
Run two validators on separate hosts. A router with a single RTR source that goes away falls back to treating routes as unknown, silently disabling validation. Two sources means one can fail without changing forwarding.
Feeding the Routers
BIRD 2 — an RPKI protocol populating ROA tables, then roa_check in the filter:

    roa4 table r4;
    roa6 table r6;

    protocol rpki validator1 {
      roa4 { table r4; };
      roa6 { table r6; };
      remote "10.0.0.5" port 3323;
      retry keep 90;
      refresh keep 300;
    }

    filter bgp_in {
      if net.type = NET_IP4 && roa_check(r4, net, bgp_path.last_nonaggregated) = ROA_INVALID then reject;
      if net.type = NET_IP6 && roa_check(r6, net, bgp_path.last_nonaggregated) = ROA_INVALID then reject;
      accept;
    }

FRR (bgpd):

    rpki
     rpki cache tcp 10.0.0.5 3323 preference 1
     rpki cache tcp 10.0.0.6 3323 preference 2
     rpki polling_period 300
    !
    ! drop invalids, accept the rest
    route-map RPKI-IN deny 10
     match rpki invalid
    route-map RPKI-IN permit 20

Cisco IOS-XR:

    router bgp 64500
     rpki server 10.0.0.5
      transport tcp port 3323
      refresh-time 300
     !
     address-family ipv4 unicast
      bgp origin-as validation enable
      bgp bestpath origin-as use validity

Verify the RTR Session and Data

    # BIRD
    birdc show protocols all validator1
    # State: Established; should report a count of imported ROAs

    # FRR
    vtysh -c "show rpki cache-connection"
    vtysh -c "show rpki prefix 8.8.8.0/24"

    # IOS-XR
    show bgp rpki summary
    show bgp ipv4 unicast origin-as validity

On IOS-XR confirm the session state and the validity breakdown it learned:

    # RTR session and downloaded record count
    show bgp rpki summary
    # Session State: ESTAB; ROAs IPv4: 412033

    # Distribution of validity states across the table
    show bgp ipv4 unicast origin-as validity

A healthy validator feeds hundreds of thousands of VRPs. If show bgp rpki summary reports an established session but only a handful of ROAs, the validator is still building its initial set. Wait for it to finish before enforcing anything.
Rolling Out “Invalid Reject” Safely
Do not flip reject on a production edge in one change. Stage it:
- Observe. Enable validation but take no action; just mark routes. Count how many of your received prefixes are invalid and which are yours.

      # FRR: see what would be dropped before dropping it
      vtysh -c "show bgp ipv4 unicast rpki invalid" | head

- Fix your own. The most common shock is finding your own prefixes marked invalid because a ROA has the wrong max-length or a stale origin AS. Fix ROAs before you start rejecting, or peers running RPKI will drop you.
- De-pref, then drop. First set invalids to local-pref 0 (still reachable if no alternative), watch for a week, then move to outright reject.
- Roll per-neighbor. Apply reject to one upstream, confirm reachability, then expand.
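The staged rollout can be modeled as a small decision table, which helps when reviewing what each stage does to an invalid route. This is a sketch of the policy logic only, not router configuration; `Stage`, `Decision`, the `rollout` mapping, and the neighbor names are all hypothetical.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Stage(Enum):
    OBSERVE = "observe"  # mark only, take no action
    DEPREF = "depref"    # accept invalids with local-pref 0
    REJECT = "reject"    # drop invalids


class Decision(NamedTuple):
    accept: bool
    local_pref: Optional[int]  # None means "leave local-pref unchanged"


def decide(stage: Stage, validity: str) -> Decision:
    """Map a rollout stage and a validity state to a route decision."""
    # Valid and unknown routes are never touched at any stage.
    if validity != "invalid":
        return Decision(True, None)
    if stage is Stage.OBSERVE:
        return Decision(True, None)
    if stage is Stage.DEPREF:
        return Decision(True, 0)
    return Decision(False, None)


# Hypothetical per-neighbor rollout state: one upstream already rejects,
# another is still in the de-pref stage.
rollout = {"upstream-a": Stage.REJECT, "upstream-b": Stage.DEPREF}


def decide_for_neighbor(neighbor: str, validity: str) -> Decision:
    # Neighbors not yet listed stay in observe mode.
    return decide(rollout.get(neighbor, Stage.OBSERVE), validity)


print(decide_for_neighbor("upstream-a", "invalid"))
print(decide_for_neighbor("upstream-b", "invalid"))
print(decide_for_neighbor("peer-c", "invalid"))
```

Defaulting unlisted neighbors to observe keeps the model fail-open: a session that nobody has staged yet never drops routes by accident.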
Validator Comparison and Placement
Routinator is not the only validator, and the choice affects how you run the RTR layer.
| Validator | Language | Built-in RTR server | Notes |
|---|---|---|---|
| Routinator | Rust | Yes | Single binary, low memory, easy to operate |
| OpenBGPD rpki-client | C | No (pairs with stayrtr) | Validation only; feeds JSON to a separate RTR daemon |
| FORT | C | Yes | Heavier, more moving parts |
A common production layout decouples validation from RTR: run rpki-client from cron to produce a validated VRP JSON file, then serve it with stayrtr. That lets several stayrtr instances fan out one validation result without each re-fetching the global RPKI tree:

    # rpki-client writes a file named "json" into the OUTPUT dir (positional arg)
    rpki-client -j /var/lib/rpki-client

    # stayrtr serves that file over RTR to the routers
    stayrtr -cache /var/lib/rpki-client/json -bind :3323

With Routinator the same fan-out is achieved by pointing all routers at two Routinator hosts directly, which is simpler and what most networks should start with. Place the validators close to the routers (same management network or region) so an RTR session reset re-syncs the full VRP set quickly; a cold sync of several hundred thousand VRPs over a congested link delays convergence.
Drill: Stale Validator, Healthy RTR
This is the scenario worth rehearsing in a lab before it bites you in production. Block outbound rsync/RRDP on the validator host while leaving RTR up, then watch what the router does:

    # On the validator host, simulate repository unreachability
    iptables -A OUTPUT -p tcp --dport 443 -j DROP
    iptables -A OUTPUT -p tcp --dport 873 -j DROP

    # The RTR session stays Established: the router sees no problem
    vtysh -c "show rpki cache-connection"
    # connect status: connected

    # But the data freezes. Confirm on the validator:
    curl -s http://127.0.0.1:8323/metrics | grep -E 'routinator_last_update_(start|done|duration)'
    # routinator_last_update_done keeps climbing: it is seconds since the last successful update

A ROA issued after the freeze is absent from the frozen VRP set. If the prefix was previously unknown and the new ROA would make it valid, nothing breaks: the route stays unknown and is still accepted. The danger lies in changes that involve the invalid state. A route that was invalid and has since been fixed by a corrected ROA stays invalid on your routers, so a route-map keyed on rpki invalid keeps dropping it. In the other direction, a route that should have gone from valid to invalid (its ROA revoked or replaced upstream) keeps being accepted because your validator never learned of the change. Either way the lesson holds: alert on update age, not just session state.

    # Restore
    iptables -D OUTPUT -p tcp --dport 443 -j DROP
    iptables -D OUTPUT -p tcp --dport 873 -j DROP

The Failure Mode to Rehearse
The dangerous state is not “validator down”: routers fall back to unknown and keep forwarding. The dangerous state is a validator serving stale or partial data while looking healthy. Routinator’s metrics expose how long ago the last update finished:

    curl -s http://127.0.0.1:8323/metrics | grep routinator_last_update

Alert when that age grows well past the refresh interval. A validator that has not refreshed in hours but still answers RTR will happily tell your routers a freshly-issued ROA does not exist, and you will drop a legitimate prefix with no obvious cause.
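A staleness alert can be a few lines of glue around that metrics output. The sketch below parses Prometheus-style text and flags the validator as stale when `routinator_last_update_done` exceeds a few refresh intervals. It assumes the metric is an unlabeled gauge holding seconds since the last finished update. The function names and the `max_missed` threshold are illustrative.

```python
from typing import Optional

METRIC = "routinator_last_update_done"


def last_update_age(metrics_text: str) -> Optional[float]:
    """Return the metric value in seconds, or None if it is not present."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        parts = line.split()
        if len(parts) >= 2 and parts[0] == METRIC:
            return float(parts[1])
    return None


def is_stale(metrics_text: str, refresh_seconds: int = 600, max_missed: int = 3) -> bool:
    """Stale if the metric is missing or older than max_missed refresh cycles."""
    age = last_update_age(metrics_text)
    # A missing metric is treated as stale so the alert fails safe.
    return age is None or age > refresh_seconds * max_missed


fresh = "# TYPE routinator_last_update_done gauge\nroutinator_last_update_done 42\n"
frozen = "# TYPE routinator_last_update_done gauge\nroutinator_last_update_done 7200\n"
print(is_stale(fresh))   # False
print(is_stale(frozen))  # True
print(is_stale(""))      # True
```

Feeding it the body of the `/metrics` response on a schedule, and paging when it returns true, covers the case where the RTR session looks healthy but the data behind it has frozen.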
Origin validation is one of the highest-value, lowest-glory things you can run. It quietly stops a class of route hijacks — but only if the data behind it is fresh, redundant, and monitored. The router commands are five minutes of work; the validator is the part you actually operate.