Linux nftables Deep Dive: Modern Stateful Firewalls

May 1, 2026 · 9 min read

If you are still writing iptables rules on a modern Linux host, you are using a compatibility shim — on most distributions iptables is now a translation layer over the nftables kernel subsystem. You may as well write nftables directly, and once you see how sets and maps collapse a hundred iptables lines into five, you will not go back.

nftables is not “iptables with new syntax.” It is a different model: one framework for IPv4, IPv6, ARP, and bridge; rules that match multiple things at once; and native data structures the old tools never had.

The Model: Tables, Chains, Hooks

A table is a namespace, bound to a family (inet covers IPv4+IPv6 together — use it).
A chain holds rules. A base chain attaches to a netfilter hook with a priority; a regular chain is just a jump target.
The hook and priority decide when the chain runs relative to routing and other subsystems.
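
To make the base/regular split concrete, here is a small sketch. The table name demo and the chain name ssh_checks are made up for illustration, and the policy is accept so that loading it cannot lock you out.

```nft
table inet demo {
    # Regular chain: no hook, so it only runs when something jumps to it.
    chain ssh_checks {
        ip saddr 10.0.0.0/24 accept
        drop
    }

    # Base chain: attached to the input hook at priority 0.
    chain input {
        type filter hook input priority 0; policy accept;
        tcp dport 22 jump ssh_checks
    }
}
```

Only SSH traffic pays the cost of walking ssh_checks; everything else falls through the base chain to its policy.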

A minimal host firewall:

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        ct state established,related accept
        ct state invalid drop
        iif lo accept
        ip protocol icmp accept
        meta l4proto ipv6-icmp accept   # also matches when extension headers precede ICMPv6

        tcp dport { 22, 80, 443 } accept
        counter comment "dropped by default policy"
    }
    chain forward { type filter hook forward priority 0; policy drop; }
    chain output  { type filter hook output  priority 0; policy accept; }
}

Two things to notice immediately: inet handles v4 and v6 in one place, and tcp dport { 22, 80, 443 } is an anonymous set. One rule covers three ports with a single set lookup instead of three sequential rules.
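
Anonymous sets are not limited to single ports. A quick sketch, assuming a throwaway table called demo_anon whose rules only count and never filter:

```nft
table inet demo_anon {
    chain input {
        type filter hook input priority 0; policy accept;

        # Single ports and a port range in one anonymous set
        tcp dport { 22, 80, 443, 8000-8100 } counter

        # Prefixes work the same way for addresses
        ip saddr { 10.0.0.0/8, 192.168.0.0/16 } counter

        # Both ICMP flavours in one rule
        meta l4proto { icmp, ipv6-icmp } counter
    }
}
```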

Stateful by Connection Tracking

ct state established,related accept at the top is the workhorse. The conntrack subsystem tracks flows, so you allow new connections explicitly and let replies through automatically. ct state invalid drop discards packets that do not belong to any tracked flow — malformed or out-of-window junk.

You can match far more than state — the original direction of a flow, the conntrack mark, NAT status:

ct status dnat accept                       # accept things you DNAT'd
ct mark 0x1 accept
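
As a rough sketch of the other conntrack keys (the table name demo_ct is hypothetical, and these rules only count or mark, they do not filter):

```nft
table inet demo_ct {
    chain input {
        type filter hook input priority 0; policy accept;

        # Packets travelling in the direction that opened the connection
        ct direction original counter

        # Tag new SSH flows; the mark is stored on the conntrack entry
        tcp dport 22 ct state new ct mark set 0x1

        # Every later packet of a tagged flow matches here
        ct mark 0x1 counter
    }
}
```

Because the mark lives on the flow rather than on a single packet, you classify once on the first packet and match cheaply afterwards.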

Sets and Maps: The Real Upgrade

This is where nftables leaves iptables behind. A named set is a first-class object you can update without touching rules:

table inet filter {
    set blocklist {
        type ipv4_addr
        flags interval
        elements = { 192.0.2.0/24, 198.51.100.7 }
    }
    set allowed_ssh {
        type ipv4_addr
        flags interval          # required because the set holds a prefix
        elements = { 10.0.0.0/24, 203.0.113.10 }
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        ip saddr @blocklist drop
        tcp dport 22 ip saddr @allowed_ssh accept
    }
}
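
One caveat with the example above: a set of type ipv4_addr only ever matches IPv4 traffic, even inside an inet table. A minimal sketch of an IPv6 companion, where the table name demo_sets and set name blocklist6 are assumptions for illustration:

```nft
table inet demo_sets {
    set blocklist6 {
        type ipv6_addr
        flags interval
        elements = { 2001:db8::/32 }
    }
    chain input {
        type filter hook input priority 0; policy accept;
        ip6 saddr @blocklist6 drop
    }
}
```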

Update the blocklist at runtime without reloading the ruleset — exactly what a fail2ban-style daemon wants:

nft add element inet filter blocklist { 203.0.113.66 }
nft delete element inet filter blocklist { 198.51.100.7 }
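
When the whole list changes at once, the same commands can go into a file and be applied as one transaction. A sketch, assuming a file you would name something like blocklist-update.nft and load with nft -f:

```nft
# Replace the contents of the blocklist set in a single transaction.
# Packets never see the set empty in between.
flush set inet filter blocklist
add element inet filter blocklist { 192.0.2.0/24, 203.0.113.66 }
```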

A map goes further — it associates keys with values, replacing whole chains of conditional logic. Port-based dispatch in one verdict map:

chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    tcp dport vmap { 22 : accept, 80 : accept, 443 : accept, 3306 : drop }
}
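
The verdict map can also be a named object, which gives it the same runtime-update property as a named set. A sketch with hypothetical names (demo_vmap, port_policy):

```nft
table inet demo_vmap {
    map port_policy {
        type inet_service : verdict
        elements = { 22 : accept, 80 : accept, 443 : accept, 3306 : drop }
    }
    chain input {
        type filter hook input priority 0; policy accept;
        ct state established,related accept
        tcp dport vmap @port_policy
    }
}
```

Opening another port then becomes an element update (add element inet demo_vmap port_policy { 8443 : accept }) rather than a rule edit.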

NAT with a map — destination depends on the incoming port, no rule-per-service:

table inet nat {
    chain prerouting {
        type nat hook prerouting priority dstnat;
        # In the inet family, dnat must state the address family explicitly
        dnat ip to tcp dport map { 8080 : 10.0.0.10, 8443 : 10.0.0.11 }
    }
}
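
The same idea works with a named map, so backends can be added without touching the rule. This sketch uses an ip-family table to keep it short; demo_nat and port_to_backend are made-up names:

```nft
table ip demo_nat {
    map port_to_backend {
        type inet_service : ipv4_addr
        elements = { 8080 : 10.0.0.10, 8443 : 10.0.0.11 }
    }
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        dnat to tcp dport map @port_to_backend
    }
}
```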

Atomic Reloads

A classic iptables footgun: flush the rules, then load new ones, and in the gap between the two the firewall is wide open (or fully closed). nftables loads a whole ruleset file in a single atomic transaction — it either all applies or none does:

nft -f /etc/nftables.conf      # atomic: no open window, no half-applied state
nft -c -f /etc/nftables.conf   # -c = check: validate the file, change nothing

Always run -c in your config pipeline before applying. A syntax error caught by -c is a failed CI job; the same error during a flush-and-reload could leave the host unreachable.
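
One detail worth spelling out: a file loaded with nft -f is added to whatever is already loaded unless the file clears the old state itself. Putting flush ruleset at the top makes the clear and the load part of the same transaction. A skeleton, assuming nft lives at /usr/sbin/nft on your distribution:

```nft
#!/usr/sbin/nft -f
# /etc/nftables.conf skeleton

# Clear everything, then define the new ruleset; both happen atomically.
flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        tcp dport 22 accept
    }
}
```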

Inspecting What It Does

# Full ruleset with rule handles (needed to delete specific rules)
nft -a list ruleset

# Just one table
nft list table inet filter

# Watch counters to see what's actually matching
nft list chain inet filter input

# Live trace of packets through the ruleset — the killer debug tool
nft monitor trace

nft monitor trace (paired with a meta nftrace set 1 rule for the traffic you care about) shows a packet’s path through every chain and rule. It is the single best reason to use nftables natively: iptables has a TRACE target, but it is nowhere near as convenient to work with.
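
One way to set the nftrace flag without touching your production chains is a throwaway table hooked ahead of everything else. A sketch, where the table name trace_debug, the priority -350, and the source address are all just illustrative choices:

```nft
table inet trace_debug {
    chain prerouting {
        # -350 sorts before raw (-300), so the flag is set before any other chain runs
        type filter hook prerouting priority -350; policy accept;
        ip saddr 203.0.113.5 meta nftrace set 1
    }
}
```

When you are done, delete table inet trace_debug removes the tracing without leaving anything behind in the real ruleset.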

Priorities and Hook Ordering

The number after priority decides ordering when multiple base chains attach to the same hook. Lower runs first. nftables ships named aliases for the values netfilter has always used, and they matter the moment you mix filtering with NAT or do policy routing:

Alias	Value	Hook context
`raw`	-300	before conntrack
`mangle`	-150	packet mangling
`dstnat`	-100	prerouting DNAT
`filter`	0	the default filtering point
`srcnat`	100	postrouting SNAT

A base chain at priority raw runs before connection tracking is established — that is where you put notrack rules for high-volume traffic you never want conntrack to spend memory on:

table inet raw {
    chain prerouting {
        type filter hook prerouting priority raw;
        udp dport 53 notrack
    }
}

Two base chains on the same hook with the same priority run in an undefined order, so give them distinct numbers. NAT chains are only consulted for the first packet of a flow; conntrack stores the translation and applies it to every later packet. That is why DNAT keeps working when the filter chain short-circuits on ct state established,related accept: established packets never need the nat chain in the first place. The flip side is that traffic you notrack cannot be NATed, because there is no conntrack entry to hold the mapping.
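
Ordering across base chains has one more consequence that is easy to miss: accept only ends evaluation in the current base chain, while drop is final. A sketch with hypothetical names and an arbitrary port:

```nft
table inet demo_prio {
    chain first {
        type filter hook input priority -10; policy accept;
        tcp dport 8080 counter accept    # ends this chain only
    }
    chain second {
        type filter hook input priority 10; policy accept;
        tcp dport 8080 counter drop      # still evaluated, and drop wins
    }
}
```

With both chains loaded, traffic to port 8080 is dropped even though the earlier chain accepted it.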

Verifying and Troubleshooting

Counters tell you whether a rule fires; the trace tells you why a packet ended up where it did. Add a named counter you can read by name:

table inet filter {
    counter ssh_accepts { }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        tcp dport 22 ip saddr @allowed_ssh counter name ssh_accepts accept
    }
}

nft list counter inet filter ssh_accepts
# counter ssh_accepts {
#     packets 142 bytes 9376
# }

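For a flow that is being dropped and you cannot see where, mark it for tracing and watch. Use insert rather than add so the trace rule lands at the top of the chain, ahead of any rule that might drop the packet first:

nft insert rule inet filter input ip saddr 203.0.113.5 meta nftrace set 1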
nft monitor trace
# trace id 3a1f inet filter input packet: iif "eth0" ip saddr 203.0.113.5 ...
# trace id 3a1f inet filter input rule ct state invalid drop (verdict drop)

The trace names the exact rule and verdict. A common gotcha it exposes: a packet hitting ct state invalid drop because the host saw the reply but never the SYN — asymmetric routing, not a firewall mistake. Another: rules ordered so that a broad accept shadows a later drop you expected to match. The trace shows the first terminal verdict and stops, which is precisely the rule you need to move.

When the ruleset misbehaves after an edit, dump it with handles and delete surgically rather than reloading the whole file:

nft -a list chain inet filter input
#   tcp dport 22 ip saddr @allowed_ssh accept # handle 7
nft delete rule inet filter input handle 7
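
Deleting is not the only thing a handle is good for. A sketch in nft script syntax, assuming handle 7 is still the rule from the listing above (prefix each line with nft to run it from a shell; the log prefix is an arbitrary label):

```nft
# Swap the rule in place, this time with a counter
replace rule inet filter input handle 7 tcp dport 22 ip saddr @allowed_ssh counter accept

# Or put a new rule directly in front of handle 7
insert rule inet filter input handle 7 tcp dport 22 ip saddr 203.0.113.10 log prefix "ssh-debug: "
```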

Rate Limiting Without a Sidecar Daemon

nftables can do much of the fail2ban job itself with a dynamic set whose elements expire, plus the limit statement. A set keyed on source address with a per-element timeout becomes a self-cleaning table of per-source rate limiters:

table inet filter {
    set flood {
        type ipv4_addr
        flags dynamic, timeout
        timeout 1m
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept

        # Track each source in @flood; drop once it exceeds 10 new SSH conns/min
        tcp dport 22 ct state new \
            add @flood { ip saddr limit rate over 10/minute } drop

        tcp dport 22 ct state new accept
    }
}

add @flood { ip saddr ... } creates an element for each source on its first new connection and attaches a private rate limiter to it. The limit rate over 10/minute part matches only while that source is above the rate, so only then does the rule go on to drop; sources under the limit fall through to the accept rule below. No external daemon, no reload: the kernel ages entries out after the set’s 1-minute timeout. Inspect the tracked sources with their remaining timers:

nft list set inet filter flood
# set flood {
#     type ipv4_addr
#     elements = { 203.0.113.66 expires 47s }
# }

This is the pattern that makes named sets the headline feature: the data plane mutates its own state at packet rate, and the ruleset stays static.
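
If you want a real ban list, where an offender stays blocked for a while after tripping the limit, a second set can hold it. The sketch below follows the two-set pattern I remember from the set statement examples in nft(8); the names ssh_rate and banned, the sizes, and the timeouts are illustrative assumptions, so check it against your nft version before relying on it.

```nft
table inet demo_limit {
    # Per-source rate limiters, short-lived
    set ssh_rate {
        type ipv4_addr
        size 65536
        flags dynamic, timeout
        timeout 1m
    }
    # Sources that tripped the limit, banned for longer
    set banned {
        type ipv4_addr
        size 65536
        flags dynamic, timeout
        timeout 10m
    }
    chain input {
        type filter hook input priority 0; policy accept;
        ip saddr @banned drop

        # Over the rate: record the source in @banned, then drop
        tcp dport 22 ct state new \
            add @ssh_rate { ip saddr limit rate over 10/minute } \
            add @banned { ip saddr } drop
    }
}
```

The two sets need separate timeouts, which is why the limiter and the ban list cannot share one set.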

Migrating Without a Big Bang

You do not have to convert everything at once. iptables-translate converts individual rules so you can learn the mapping:

iptables-translate -A INPUT -p tcp --dport 22 -j ACCEPT
# nft add rule ip filter INPUT tcp dport 22 counter accept

Translate your existing ruleset, read it, then rewrite it properly using sets and maps — the mechanical translation works but misses the whole point. The value of nftables is not 1:1 rule parity; it is that a firewall built around named sets and verdict maps is shorter, faster, and actually readable six months later when you have to change it.
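
To illustrate the gap, here is what a rule-by-rule translation of three iptables lines would look like next to a hand rewrite. The first three lines assume the translator emits the same shape as the single example above; the last line targets the inet filter table used throughout this article.

```nft
# Mechanical translation: one rule per port, IPv4 only
add rule ip filter INPUT tcp dport 22 counter accept
add rule ip filter INPUT tcp dport 80 counter accept
add rule ip filter INPUT tcp dport 443 counter accept

# Hand rewrite: one rule, one set, IPv4 and IPv6 together
add rule inet filter input tcp dport { 22, 80, 443 } counter accept
```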