Kubernetes networking has a dirty secret: on a default cluster, every Service is implemented by kube-proxy writing iptables rules, and on a busy cluster that becomes thousands of sequential rules the kernel walks per packet. It works, but it does not scale gracefully, and it gives you almost no visibility into what is actually talking to what.
Cilium throws that out and builds the dataplane on eBPF. The practical results: Service load balancing without the iptables tax, network policy that follows workloads instead of IP addresses, and flow logs that finally answer “who is connecting to this pod.”
Why eBPF Changes the Model
kube-proxy thinks in IPs and iptables rules. Pods churn, IPs recycle, and rules pile up. Cilium attaches eBPF programs at the socket and driver level and thinks in identities — a label-derived identity assigned to each workload. Policy is expressed against identities, so a rule like “frontend may talk to backend” keeps holding as pods are rescheduled and IPs change underneath.
That is the conceptual leap: stop writing policy about where a workload is (IP) and write it about what it is (labels/identity).
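To make the IP-versus-identity distinction concrete, here is a minimal Python sketch. It is a toy model, not Cilium's implementation: the `Pod` class, the `identity` helper, and the IP addresses are all hypothetical, and an "identity" here is simply the set of a pod's labels rather than a cluster-allocated number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    ip: str
    labels: frozenset  # e.g. frozenset({"app=frontend"})


def identity(pod: Pod) -> frozenset:
    """Hypothetical stand-in for a label-derived identity."""
    return pod.labels


# Policy keyed on IPs: tied to where the workloads happen to be right now.
ip_policy = {("10.0.1.5", "10.0.2.9")}

# Policy keyed on identities: "frontend may talk to backend".
identity_policy = {(frozenset({"app=frontend"}), frozenset({"app=backend"}))}


def allowed_by_ip(src: Pod, dst: Pod) -> bool:
    return (src.ip, dst.ip) in ip_policy


def allowed_by_identity(src: Pod, dst: Pod) -> bool:
    return (identity(src), identity(dst)) in identity_policy


frontend = Pod("10.0.1.5", frozenset({"app=frontend"}))
backend = Pod("10.0.2.9", frozenset({"app=backend"}))
# The backend pod is rescheduled: new IP, same labels.
backend_rescheduled = Pod("10.0.7.31", frozenset({"app=backend"}))

print(allowed_by_ip(frontend, backend))                    # True
print(allowed_by_ip(frontend, backend_rescheduled))        # False
print(allowed_by_identity(frontend, backend_rescheduled))  # True
```

The IP-keyed rule silently stops matching after the reschedule, while the identity-keyed rule keeps holding because the labels did not change.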
Installing as the CNI
Cilium is a CNI plugin; install it on a cluster with no other CNI (or migrate carefully). Via Helm, turning on the two features that matter most — kube-proxy replacement and Hubble:
```shell
helm install cilium cilium/cilium --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.0.0.1 \
  --set k8sServicePort=6443 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
```

kubeProxyReplacement=true means Cilium handles Service load balancing in eBPF and you can remove kube-proxy entirely. Check status:

```shell
cilium status
cilium status --verbose | grep -i kubeproxy   # should report "True"
```

Network Policy That Follows Workloads
Standard Kubernetes NetworkPolicy is L3/L4 and namespace/label scoped. CiliumNetworkPolicy extends it to L7 (HTTP, DNS, Kafka) and richer selectors. A baseline L3/L4 policy — backend accepts only from frontend, only on 8080:
```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: shop
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
```

The L7 layer is where Cilium pulls ahead. Restrict not just the port but the HTTP methods and paths:
```yaml
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/.*"
              - method: "POST"
                path: "/api/v1/orders"
```

Now frontend can GET /api/v1/* and POST only to orders, enforced on the node by Cilium's built-in proxy with no sidecar. This is “service mesh” L7 policy without running Envoy next to every pod.
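As a rough illustration of how those two HTTP rules behave, the Python sketch below evaluates requests against them. It is not Cilium's matcher: it assumes the path regex must match the whole path (anchored) and treats the method as an exact string, and the function name `http_allowed` is hypothetical.

```python
import re

# The two rules from the policy above, as (method, path regex) pairs.
HTTP_RULES = [
    ("GET", r"/api/v1/.*"),
    ("POST", r"/api/v1/orders"),
]


def http_allowed(method: str, path: str) -> bool:
    """Return True if any rule matches both the method and the whole path.

    Assumption: the path regex is anchored, i.e. it must match the full path.
    """
    return any(
        method == rule_method and re.fullmatch(rule_path, path) is not None
        for rule_method, rule_path in HTTP_RULES
    )


print(http_allowed("GET", "/api/v1/products/42"))  # True
print(http_allowed("POST", "/api/v1/orders"))      # True
print(http_allowed("POST", "/api/v1/users"))       # False
print(http_allowed("DELETE", "/api/v1/orders"))    # False
```

Anything that matches no rule is denied, which is why an L7 policy written too narrowly shows up as rejected requests rather than dropped packets.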
DNS-Aware Egress
A perennial problem: locking down egress to external services that live behind changing IPs. Cilium policies can match DNS names, and it learns the IPs by snooping the pod’s own DNS lookups:
```yaml
  egress:
    - toFQDNs:
        - matchName: "api.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*"   # DNS rule so Cilium can observe the answers
```

You must allow DNS to kube-dns, with a DNS rule on that port, for toFQDNs to work: Cilium watches those lookups to populate the allowed IP set. Forgetting the DNS rule is the most common “my FQDN policy does nothing” bug.
Hubble: Finally, Visibility
The piece that justifies the migration on its own. Hubble taps the eBPF dataplane and shows real flows, including which policy verdict each got:
```shell
# Live flows for a namespace
hubble observe --namespace shop

# Only DROPPED flows: what is policy blocking right now?
hubble observe --verdict DROPPED --namespace shop

# Flows to a specific workload
hubble observe --to-label app=backend
```

hubble observe --verdict DROPPED is how you debug a policy that is too tight: you see the exact source, destination, and port being denied, instead of guessing why a pod cannot reach a service. The Hubble UI renders the same data as a live service map.
Default-Deny, the Safe Way
Policies are additive and default-allow until any policy selects an endpoint, at which point that endpoint is default-deny for the covered direction. The trap: apply a default-deny before you have allow rules and you cut traffic instantly. Stage it:
- Run with Hubble and no enforcement; learn the real flows from hubble observe.
- Write allow policies matching those flows.
- Apply a namespace default-deny once the allows are in place.
```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata: { name: default-deny, namespace: shop }
spec:
  endpointSelector: {}
  ingress:
    - {}   # a single empty rule: selects the endpoints, allows nothing
  egress:
    - {}
```

That empty-rule policy denies everything in the namespace. Apply it last, never first.
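The "default-allow until selected, then default-deny with additive allows" behaviour can be sketched in a few lines of Python. This is a deliberately simplified model under stated assumptions: ingress only, label matching only, no ports, and the `Policy` class and helper names are hypothetical rather than anything from Cilium's code.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical, heavily simplified policy: ingress only, labels only."""
    selector: dict                                  # {} selects every endpoint
    allow_from: list = field(default_factory=list)  # label dicts allowed as sources


def matches(selector: dict, labels: dict) -> bool:
    """A selector matches when every key/value it names is present."""
    return all(labels.get(k) == v for k, v in selector.items())


def ingress_allowed(policies: list, src: dict, dst: dict) -> bool:
    selecting = [p for p in policies if matches(p.selector, dst)]
    if not selecting:
        return True  # no policy selects dst: still default-allow
    # Selected endpoints are default-deny; allows from all policies add up.
    return any(matches(a, src) for p in selecting for a in p.allow_from)


frontend = {"app": "frontend"}
backend = {"app": "backend"}
batch = {"app": "batch"}

default_deny = Policy(selector={})
allow_frontend = Policy(selector={"app": "backend"}, allow_from=[{"app": "frontend"}])

print(ingress_allowed([], frontend, backend))                              # True
print(ingress_allowed([default_deny], frontend, backend))                  # False
print(ingress_allowed([allow_frontend, default_deny], frontend, backend))  # True
print(ingress_allowed([allow_frontend, default_deny], batch, backend))     # False
```

The second line of output is the trap: default-deny on its own cuts frontend off from backend, while the third shows the same default-deny being harmless once the allow is already in place.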
Debugging When Hubble Isn’t Enough
Hubble answers “what flowed and what verdict.” When a policy should allow traffic that is still dropped, you drop to the endpoint and identity layer. Every pod is a Cilium endpoint with a numeric identity derived from its labels; policy is compiled against those identities, so a stale or wrong identity is a frequent root cause:
```shell
# List endpoints with their identity and enforcement state
cilium endpoint list
# ENDPOINT   IDENTITY   LABELS            INGRESS   EGRESS
# 1423       12influx   k8s:app=backend   Enabled   Disabled

# What policy is actually loaded for one endpoint
cilium endpoint get 1423

# Identity to labels mapping: confirm two pods you expect to share identity do
cilium identity list
```

If two backend pods landed on different identities, their labels differ; when the difference is a typo in the app label, a selector such as app=backend only matches one of them and the policy silently skips the other. When the verdict itself is the mystery, cilium monitor is the dataplane trace. It shows policy decisions per packet with the identities involved:
```shell
# Drops only, with the L3/L4 detail and the policy verdict
cilium monitor --type drop
# xx drop (Policy denied) flow 0x0 to endpoint 1423, identity 53713->12influx: \
#   10.0.2.7:51234 -> 10.0.4.9:8080 tcp SYN
```

identity 53713->12influx is the actual evaluation: source identity 53713 to destination 12influx was denied. Cross-reference 53713 with cilium identity list and you usually find the caller is not the workload you assumed (a sidecar, an init container, or traffic arriving from outside the cluster with the world identity).
The toFQDNs Gotcha and How It Actually Resolves
FQDN policy is the most common source of “it works for an hour then breaks.” The mechanism: Cilium proxies the pod’s DNS request, records the returned IPs, and inserts them into the allowed set with a TTL. Two failure modes follow directly from that.
First, if the application caches DNS longer than Cilium’s recorded TTL, the pod keeps using an IP Cilium has since expired, and traffic drops. Cilium honors a minimum TTL you can raise, and it is worth pinning DNS settings explicitly rather than trusting the upstream record:
```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: egress-stripe
  namespace: shop
spec:
  endpointSelector:
    matchLabels:
      app: backend
  egress:
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*.stripe.com"
    - toFQDNs:
        - matchPattern: "*.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

The rules.dns.matchPattern on the port 53 rule is the part people omit. Without it Cilium forwards DNS but does not learn the answers, so toFQDNs never populates. The egress to kube-dns must itself carry the DNS-matching rule for the FQDN intercept to fire. Confirm what Cilium actually learned:
```shell
cilium fqdn cache list | grep stripe
# 10.0.3.4 default/backend *.stripe.com api.stripe.com 34.x.x.x TTL 86s
```

If that cache is empty while the pod resolves the name fine, your DNS visibility rule is wrong, not the FQDN rule.
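The first failure mode, an application caching an IP longer than the recorded TTL, can be reproduced with a toy Python model of a TTL-bound allow set. This is a sketch under assumptions, not Cilium's cache: the `FqdnCache` class, its `min_ttl` parameter, and the IP address are hypothetical, and the real agent has additional behaviour this model ignores.

```python
class FqdnCache:
    """Toy model of a DNS-learned allow set with per-entry expiry."""

    def __init__(self, min_ttl: int = 0):
        self.min_ttl = min_ttl   # floor applied to every recorded TTL (seconds)
        self._expiry = {}        # ip -> absolute expiry time (seconds)

    def observe(self, ip: str, ttl: int, now: int) -> None:
        """Record an IP seen in a DNS answer at time `now`."""
        self._expiry[ip] = now + max(ttl, self.min_ttl)

    def allowed(self, ip: str, now: int) -> bool:
        """An IP is allowed only while its entry has not expired."""
        return self._expiry.get(ip, 0) > now


cache_default = FqdnCache(min_ttl=0)
cache_raised = FqdnCache(min_ttl=3600)
for cache in (cache_default, cache_raised):
    # The DNS answer carries a 60 s TTL and is seen at t=0.
    cache.observe("34.0.0.10", ttl=60, now=0)

# The application cached the IP itself and reconnects at t=120 s.
print(cache_default.allowed("34.0.0.10", now=120))  # False
print(cache_raised.allowed("34.0.0.10", now=120))   # True
```

With no floor, the entry is gone by the time the application reuses the address; raising the minimum TTL keeps it in the allowed set long enough to cover the application's own caching.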
Second failure mode: a CDN behind the FQDN returns dozens of rotating IPs with short TTLs. The allowed set churns, and a connection opened to an IP that has since aged out gets cut mid-flight. For high-cardinality endpoints, prefer a wildcard matchPattern over per-host matchName so any IP the domain hands back is covered, and confirm the policy compiled by checking it landed in the dataplane rather than just the API server:
```shell
cilium policy get
# verify the toFQDNs selector appears in the loaded policy revision

# The revision should increment after applying; if it doesn't, the policy was rejected
cilium endpoint get 1423 -o jsonpath='{[0].status.policy.realized.policy-revision}'
```

A policy that validates against the CRD schema but references a selector matching no identity loads silently and does nothing. The tell is the revision incrementing without the expected Enabled enforcement on the endpoint.
What You Get and What It Costs
The wins are real: Service load balancing that scales past iptables, identity-based and L7 policy without sidecars, DNS-aware egress, and Hubble visibility. The cost is operational depth — Cilium is a bigger, more opinionated component than a basic CNI, and the eBPF dataplane means debugging sometimes drops to cilium monitor and BPF map dumps rather than reading iptables.
For a small cluster running a few apps, a simple CNI is less to operate. For a cluster where you need real network policy, Service scale, and the ability to answer “what talked to what,” Cilium is the current default for good reason — and Hubble alone will change how you debug the cluster.