Linux nftables Deep Dive: Modern Stateful Firewalls

If you are still writing iptables rules on a modern Linux host, you are using a compatibility shim — on most distributions iptables is now a translation layer over the nftables kernel subsystem. You may as well write nftables directly, and once you see how sets and maps collapse a hundred iptables lines into five, you will not go back.

nftables is not “iptables with new syntax.” It is a different model: one framework for IPv4, IPv6, ARP, and bridge; rules that match multiple things at once; and native data structures the old tools never had.

The Model: Tables, Chains, Hooks

  • A table is a namespace, bound to a family (inet covers IPv4+IPv6 together — use it).
  • A chain holds rules. A base chain attaches to a netfilter hook with a priority; a regular chain is just a jump target.
  • The hook and priority decide when the chain runs relative to routing and other subsystems.

A minimal host firewall:

    table inet filter {
        chain input {
            type filter hook input priority 0; policy drop;
            ct state established,related accept
            ct state invalid drop
            iif lo accept
            ip protocol icmp accept
            ip6 nexthdr icmpv6 accept
            tcp dport { 22, 80, 443 } accept
            counter comment "dropped by default policy"
        }
        chain forward { type filter hook forward priority 0; policy drop; }
        chain output { type filter hook output priority 0; policy accept; }
    }

Two things to notice immediately: inet handles v4 and v6 in one place, and tcp dport { 22, 80, 443 } is an anonymous set: one rule, three ports, matched with a single set lookup rather than three rules evaluated in sequence.

Stateful by Connection Tracking

ct state established,related accept at the top is the workhorse. The conntrack subsystem tracks flows, so you allow new connections explicitly and let replies through automatically. ct state invalid drop discards packets that conntrack can neither attach to an existing flow nor accept as the valid start of a new one: malformed or out-of-window junk.

You can match far more than state — the original direction of a flow, the conntrack mark, NAT status:

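    ct direction original counter   # count packets travelling in the flow's original direction
    ct status dnat accept           # accept things you DNAT'd
    ct mark 0x1 accept              # accept flows carrying conntrack mark 0x1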
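
As a sketch, the two state rules from the minimal firewall can also be folded into one verdict-map lookup on the conntrack state (verdict maps are covered under Sets and Maps). Packets in state new or untracked match no key and simply fall through to the rules below. The port list is the one from the earlier example; treat the rest as illustrative.

```nft
# Sketch: one conntrack-state lookup instead of two separate rules.
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        # established/related -> accept, invalid -> drop, anything else falls through
        ct state vmap { established : accept, related : accept, invalid : drop }
        iif lo accept
        tcp dport { 22, 80, 443 } accept
    }
}
```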

Sets and Maps: The Real Upgrade

This is where nftables leaves iptables behind. A named set is a first-class object you can update without touching rules:

    table inet filter {
        set blocklist {
            type ipv4_addr
            flags interval
            elements = { 192.0.2.0/24, 198.51.100.7 }
        }
        set allowed_ssh {
            type ipv4_addr
            flags interval      # required because the set holds a prefix
            elements = { 10.0.0.0/24, 203.0.113.10 }
        }
        chain input {
            type filter hook input priority 0; policy drop;
            ct state established,related accept
            ip saddr @blocklist drop
            tcp dport 22 ip saddr @allowed_ssh accept
        }
    }

Update the blocklist at runtime without reloading the ruleset — exactly what a fail2ban-style daemon wants:

    nft add element inet filter blocklist { 203.0.113.66 }
    nft delete element inet filter blocklist { 198.51.100.7 }
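
Sets are typed, so an ipv4_addr set never matches IPv6 traffic even inside an inet table. A minimal sketch of the usual workaround is a second set per address family; the name blocklist6 and the documentation prefix in it are placeholders, not part of the original ruleset.

```nft
# Sketch: one set per address family inside a single inet table.
table inet filter {
    set blocklist  { type ipv4_addr; flags interval; }
    set blocklist6 { type ipv6_addr; flags interval; elements = { 2001:db8:bad::/48 } }

    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        ip  saddr @blocklist  drop    # matches IPv4 packets only
        ip6 saddr @blocklist6 drop    # matches IPv6 packets only
    }
}
```

Runtime updates work the same way for either set; only the element type differs.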

A map goes further — it associates keys with values, replacing whole chains of conditional logic. Port-based dispatch in one verdict map:

    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        tcp dport vmap { 22 : accept, 80 : accept, 443 : accept, 3306 : drop }
    }

NAT with a map — destination depends on the incoming port, no rule-per-service:

    table inet nat {
        chain prerouting {
            type nat hook prerouting priority dstnat;
            # In an inet table, name the address family when the target comes from a map
            dnat ip to tcp dport map { 8080 : 10.0.0.10, 8443 : 10.0.0.11 }
        }
    }
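
The same forwarding table can live in a named map, which makes it updatable at runtime just like a named set. The sketch below assumes an IPv4-only ip table and a hypothetical map name, port_forward.

```nft
# Sketch: port-forward table held in a named map (name is illustrative).
table ip nat {
    map port_forward {
        type inet_service : ipv4_addr
        elements = { 8080 : 10.0.0.10, 8443 : 10.0.0.11 }
    }
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        dnat to tcp dport map @port_forward
    }
}
# Add a forward later without touching the rule (run from a shell):
#   nft add element ip nat port_forward { 9090 : 10.0.0.12 }
```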

Atomic Reloads

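A classic iptables footgun: flush the rules, then load new ones, and in the gap between the two the firewall is wide open (or fully closed). nftables loads a whole ruleset file in a single atomic transaction: it either all applies or none does. To replace the running ruleset rather than append to it, begin the file with flush ruleset so the flush and the load commit together: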

    nft -f /etc/nftables.conf      # atomic: no open window, no half-applied state
    nft -c -f /etc/nftables.conf   # -c = check syntax only, change nothing

Always run -c in your config pipeline before applying. A syntax error caught by -c is a failed CI job; the same error during a flush-and-reload could leave the host unreachable.
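
A file meant to be reloaded repeatedly has to clear the old state inside the same transaction, otherwise each nft -f appends a second copy of the rules. Putting flush ruleset on the first line is the simplest way, but it also wipes tables that other tools may have created. The sketch below replaces only its own table instead; the shebang path and the rules themselves are assumptions for illustration.

```nft
#!/usr/sbin/nft -f
# Sketch of a re-runnable ruleset file that replaces only its own table.
# The shebang path is an assumption; adjust it to where nft lives on your system.

# Create the table if it is missing, then delete it, so the block below always
# starts from empty. Everything in this file commits as one transaction.
add table inet filter
delete table inet filter

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        tcp dport 22 accept
    }
}
```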

Inspecting What It Does

    # Full ruleset with rule handles (needed to delete specific rules)
    nft -a list ruleset
    # Just one table
    nft list table inet filter
    # Watch counters to see what's actually matching
    nft list chain inet filter input
    # Live trace of packets through the ruleset — the killer debug tool
    nft monitor trace

nft monitor trace (paired with a meta nftrace set 1 rule for the traffic you care about) shows a packet’s path through every chain and rule. It is the single best reason to use nftables natively: the TRACE target in legacy iptables writes terse lines to the kernel log and is much harder to follow.

Priorities and Hook Ordering

The number after priority decides ordering when multiple base chains attach to the same hook. Lower runs first. nftables ships named aliases for the values netfilter has always used, and they matter the moment you mix filtering with NAT or do policy routing:

    Alias     Value   Hook context
    raw       -300    before conntrack
    mangle    -150    packet mangling
    dstnat    -100    prerouting DNAT
    filter       0    the default filtering point
    srcnat     100    postrouting SNAT

A base chain at priority raw runs before connection tracking is established — that is where you put notrack rules for high-volume traffic you never want conntrack to spend memory on:

    table inet raw {
        chain prerouting {
            type filter hook prerouting priority raw;
            udp dport 53 notrack
        }
    }
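
Untracked packets never become established, so a notrack rule usually needs two companions: one for the reply direction, and an explicit accept in the filter chain. The sketch below assumes the host is a DNS server answering on UDP 53. Locally generated replies pass through the output hook rather than prerouting, so they get their own raw-priority chain.

```nft
# Sketch: keep DNS service traffic out of conntrack in both directions.
table inet raw {
    chain prerouting {
        type filter hook prerouting priority raw; policy accept;
        udp dport 53 notrack     # incoming queries
    }
    chain output {
        type filter hook output priority raw; policy accept;
        udp sport 53 notrack     # our replies, which would otherwise create new entries
    }
}
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        udp dport 53 accept      # untracked queries never match the rule above
    }
}
```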

Two base chains on the same hook with the same priority run in an undefined order relative to each other, so give them distinct numbers. NAT is also decided only on the first packet of a flow: conntrack records the translation and applies it to every later packet. That is why the ct state established,related accept shortcut at the top of a filter chain does not interfere with DNAT. Established packets are translated by conntrack and never traverse the nat chain in the first place.

Verifying and Troubleshooting

Counters tell you whether a rule fires; the trace tells you why a packet ended up where it did. Add a named counter you can read by name:

    table inet filter {
        counter ssh_accepts { }
        chain input {
            type filter hook input priority 0; policy drop;
            ct state established,related accept
            # @allowed_ssh is the named set defined earlier
            tcp dport 22 ip saddr @allowed_ssh counter name ssh_accepts accept
        }
    }

Read it back by name:

    nft list counter inet filter ssh_accepts
    # table inet filter {
    #     counter ssh_accepts {
    #         packets 142 bytes 9376
    #     }
    # }

For a flow that is being dropped and you cannot see where, mark it for tracing and watch:

    # insert (not add) puts the rule at the top of the chain, ahead of any accept or drop
    nft insert rule inet filter input ip saddr 203.0.113.5 meta nftrace set 1
    nft monitor trace
    # trace id 3a1f inet filter input packet: iif "eth0" ip saddr 203.0.113.5 ...
    # trace id 3a1f inet filter input rule ct state invalid drop (verdict drop)

The trace names the exact rule and verdict. A common gotcha it exposes: a packet hitting ct state invalid drop because the host saw the reply but never the SYN — asymmetric routing, not a firewall mistake. Another: rules ordered so that a broad accept shadows a later drop you expected to match. The trace shows the first terminal verdict and stops, which is precisely the rule you need to move.
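
To trace without editing the production chains at all, one option is a throwaway table whose only job is to set the trace flag early. This is a sketch: the table name trace_debug and the priority -350 (just ahead of raw) are illustrative choices. The flag stays on the packet, so later chains on later hooks show up in nft monitor trace as well.

```nft
# Sketch: disposable tracing table. Load it with nft -f, and remove it with
#   nft delete table inet trace_debug
table inet trace_debug {
    chain prerouting {
        # -350 runs before the raw (-300) chains and before conntrack
        type filter hook prerouting priority -350; policy accept;
        ip saddr 203.0.113.5 meta nftrace set 1
    }
}
```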

When the ruleset misbehaves after an edit, dump it with handles and delete surgically rather than reloading the whole file:

    nft -a list chain inet filter input
    # tcp dport 22 ip saddr @allowed_ssh accept # handle 7
    nft delete rule inet filter input handle 7

Rate Limiting Without a Sidecar Daemon

nftables can do the fail2ban job itself with a dynamic set whose elements expire, plus a limit rate expression attached to each element. A set keyed on source address with a per-element timeout becomes a self-cleaning table of per-source rate limiters:

    table inet filter {
        set flood {
            type ipv4_addr
            flags dynamic, timeout
            timeout 1m
        }
        chain input {
            type filter hook input priority 0; policy drop;
            ct state established,related accept
            # Per-source limiter in @flood: new SSH conns beyond 10/minute are dropped
            tcp dport 22 ct state new \
                add @flood { ip saddr limit rate over 10/minute } drop
            tcp dport 22 ct state new accept
        }
    }

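add @flood { ip saddr ... } creates an element for each source on its first new connection, with the set’s 1-minute timeout and its own rate limiter. The attached limit rate over 10/minute matches only while that source is over the rate, and only then does evaluation continue to drop; otherwise the rule stops matching and the packet falls through to the accept below. No external daemon, no reload: the kernel ages entries out on its own. Inspect the tracked sources with their remaining timers: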

    nft list set inet filter flood
    # set flood {
    #     type ipv4_addr
    #     elements = { 203.0.113.66 expires 47s }
    # }

This is the pattern that makes named sets the headline feature: the data plane mutates its own state at packet rate, and the ruleset stays static.

Migrating Without a Big Bang

You do not have to convert everything at once. iptables-translate converts individual rules so you can learn the mapping:

    iptables-translate -A INPUT -p tcp --dport 22 -j ACCEPT
    # nft add rule ip filter INPUT tcp dport 22 counter accept

Translate your existing ruleset, read it, then rewrite it properly using sets and maps — the mechanical translation works but misses the whole point. The value of nftables is not 1:1 rule parity; it is that a firewall built around named sets and verdict maps is shorter, faster, and actually readable six months later when you have to change it.
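
To make the difference concrete, here is a hypothetical before and after. The three commented rules are what a rule-by-rule translation of three iptables ACCEPT lines would look like; the table below them is one possible rewrite, with an illustrative set name (public_tcp).

```nft
# Mechanical translation, for comparison: three rules evaluated in order
#   tcp dport 22 counter accept
#   tcp dport 80 counter accept
#   tcp dport 443 counter accept

# Sketch of a rewrite around a named set: one rule, and the port list is data.
table inet filter {
    set public_tcp {
        type inet_service
        elements = { 22, 80, 443 }
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        tcp dport @public_tcp counter accept
    }
}
```

Opening or closing a port then becomes an element update on public_tcp rather than a rule edit.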