ARP and Neighbor Discovery: Troubleshooting Layer 2 Problems

Routing is correct. Firewall allows traffic. Ping fails. You spend an hour checking Layer 3 and 4. The problem is Layer 2.

ARP (IPv4) and Neighbor Discovery (IPv6) map IP addresses to MAC addresses. When this mapping fails, packets can’t be delivered — even though routing looks perfect.

Layer 2 problems look like Layer 3 failures. Always check ARP.

Understanding ARP

Host A (192.168.1.1) wants to send a packet to 192.168.1.100:
1. Check ARP cache: "Do I know the MAC for 192.168.1.100?"
2. If no: Send ARP request (broadcast)
   "Who has 192.168.1.100? Tell 192.168.1.1"
3. 192.168.1.100 replies (unicast)
   "192.168.1.100 is at aa:bb:cc:dd:ee:ff"
4. Cache entry created, packet sent

If ARP fails, the IP packet can’t be sent. It looks like a routing problem, but the real failure is MAC resolution.
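
The four steps can be sketched as a toy model in shell. This is only an illustration, not how the kernel implements ARP: the cache is a temp file, and `toy_network_reply` is a hypothetical stand-in for the network that knows a single host.

```shell
# Toy model of the four steps above. Nothing here touches the real ARP cache.
# The cache is a temp file of "IP MAC" lines.
TOY_CACHE=$(mktemp)

# Hypothetical stand-in for the network: prints the MAC of the host that
# owns the address, or nothing if no host answers the broadcast.
toy_network_reply() {
    case "$1" in
        192.168.1.100) echo "aa:bb:cc:dd:ee:ff" ;;
    esac
}

toy_resolve() {
    target="$1"
    # Step 1: check the cache
    mac=$(awk -v ip="$target" '$1 == ip { print $2; exit }' "$TOY_CACHE")
    if [ -n "$mac" ]; then
        echo "cache hit: $mac"
        return 0
    fi
    # Steps 2-3: broadcast a request and wait for a unicast reply
    mac=$(toy_network_reply "$target")
    if [ -z "$mac" ]; then
        echo "unresolved: packet cannot be sent"
        return 1
    fi
    # Step 4: cache the answer, then the packet can go out
    echo "$target $mac" >> "$TOY_CACHE"
    echo "resolved: $mac"
}
```

The first call for 192.168.1.100 resolves and caches, the second is a cache hit, and any other address stays unresolved, which is the case that shows up as (incomplete) or FAILED on a real system.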

Viewing ARP Table

VyOS Commands

Terminal window
# Show ARP table
show arp
# Output:
# IP Address HW Address Flags Interface
# 192.168.1.100 aa:bb:cc:dd:ee:ff C eth1
# 192.168.1.101 bb:cc:dd:ee:ff:00 C eth1
# 192.168.1.102 (incomplete) eth1 ← Problem!
# C = Complete (resolved)
# (incomplete) = ARP request sent, no reply

Detailed ARP Info

Terminal window
# Using ip command
ip neigh show
# Output:
# 192.168.1.100 dev eth1 lladdr aa:bb:cc:dd:ee:ff REACHABLE
# 192.168.1.101 dev eth1 lladdr bb:cc:dd:ee:ff:00 STALE
# 192.168.1.102 dev eth1 FAILED
# States:
# REACHABLE - Recently verified
# STALE - Still usable, but not verified recently; re-checked on next use
# DELAY - Verification pending
# PROBE - Actively verifying
# INCOMPLETE - Request sent, no reply yet
# FAILED - ARP resolution failed
# PERMANENT - Static entry
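
On a busy segment it helps to see how many entries sit in each state. A minimal sketch, assuming the state is the last field of every `ip neigh show` line (true for the output shown above); the function name `neigh_state_summary` is made up for this example.

```shell
# Count neighbor entries per state.
# Reads `ip neigh show` output on stdin; the state is the last field.
neigh_state_summary() {
    awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}

# Usage on a live system:
# ip neigh show dev eth1 | neigh_state_summary
```

A large number of FAILED entries usually points at a Layer 2 problem (or a scan hitting unused addresses), not at routing.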

Filter by Interface

Terminal window
# ARP entries for specific interface
ip neigh show dev eth1
# ARP entries for specific IP
ip neigh show 192.168.1.100

ARP Problems and Solutions

Problem 1: Incomplete ARP Entry

Terminal window
# Symptom:
show arp
# 192.168.1.100 (incomplete) eth1
# Causes:
# - Target host is down
# - Target host has wrong IP
# - Target host on different VLAN
# - Network issue between hosts
# Debug:
# 1. Capture ARP traffic
sudo tcpdump -i eth1 arp
# 2. See if requests go out, responses come back
# 08:30:01 ARP, Request who-has 192.168.1.100 tell 192.168.1.1
# (no reply = host unreachable at Layer 2)
# 3. Verify VLAN tagging
show interfaces ethernet eth1
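
Reading a long capture by eye is error-prone. The sketch below lists targets that were asked for but never answered. It assumes tcpdump's usual text format (`who-has <ip> tell ...` for requests, `<ip> is-at <mac>` for replies); `unanswered_arp` is a name invented here.

```shell
# List IPs that appear in ARP requests but never in a reply.
# Reads tcpdump text output on stdin.
unanswered_arp() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "who-has") asked[$(i + 1)] = 1     # request target
                if ($i == "is-at")   answered[$(i - 1)] = 1  # replying IP
            }
        }
        END {
            for (ip in asked)
                if (!(ip in answered)) print ip
        }
    ' | sort
}

# Usage on a live system (stops after 50 ARP packets):
# sudo tcpdump -n -c 50 -i eth1 arp 2>/dev/null | unanswered_arp
```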

Problem 2: Wrong MAC Address

Terminal window
# Symptom: Traffic goes to wrong host
# Check ARP for expected IP
show arp | grep 192.168.1.100
# If MAC doesn't match expected host:
# - Duplicate IP (two hosts same IP)
# - IP moved to different host
# - ARP spoofing attack
# Clear entry and let it re-resolve
ip neigh del 192.168.1.100 dev eth1
ping 192.168.1.100
show arp | grep 192.168.1.100

Problem 3: Stale ARP Entries

Terminal window
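# Symptom: Intermittent connectivity after an IP moves to a different host
# The old MAC is still cached, so traffic goes to the wrong place
# (STALE by itself is normal; it only hurts when the cached MAC is outdated)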
ip neigh show
# 192.168.1.100 dev eth1 lladdr aa:bb:cc:dd:ee:ff STALE
# Flush stale entry
ip neigh flush 192.168.1.100
# Or flush all on interface
ip neigh flush dev eth1

Problem 4: ARP Table Full

Terminal window
# Symptom: New hosts can't connect
# Check table size
cat /proc/sys/net/ipv4/neigh/default/gc_thresh3
# Default: 1024
# If many hosts, increase:
configure
set system sysctl parameter net.ipv4.neigh.default.gc_thresh3 value 4096
commit
# Or via sysctl directly:
sysctl -w net.ipv4.neigh.default.gc_thresh3=4096
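
Before raising the limit, check how close the table actually is to it. A rough sketch: `neigh_usage_pct` is a hypothetical helper, and counting lines of `ip -4 neigh show` is only an approximation of what the kernel counts against gc_thresh3.

```shell
# Percentage of the hard limit (gc_thresh3) currently in use.
# Arguments: number of entries, limit. Integer arithmetic, rounded down.
neigh_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Usage on a live system:
# entries=$(ip -4 neigh show | wc -l)
# limit=$(cat /proc/sys/net/ipv4/neigh/default/gc_thresh3)
# echo "ARP table at $(neigh_usage_pct "$entries" "$limit")% of gc_thresh3"
```

If the value stays well below 100, a full table is probably not why new hosts fail to connect.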

Static ARP Entries

For critical hosts, a static ARP entry pins the IP-to-MAC mapping on this router, so spoofed ARP replies cannot overwrite it:

Terminal window
configure
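# Add static ARP entry
# (syntax varies by VyOS release; newer versions use:
#  set protocols static arp interface eth1 address 192.168.1.100 mac aa:bb:cc:dd:ee:ff
#  check the documentation for your version)
set protocols static arp 192.168.1.100 hwaddr aa:bb:cc:dd:ee:ff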
commit
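
With more than a handful of hosts, typing these by hand invites typos. A small sketch that turns an inventory of "IP MAC" lines into the `set` commands used above. The inventory format and the name `gen_static_arp` are assumptions for this example, and the output uses the `hwaddr` syntax from the text, which may need adjusting for your VyOS version.

```shell
# Turn "IP MAC" lines on stdin into VyOS static ARP commands.
# Blank lines and lines starting with # are skipped; MACs are lowercased.
gen_static_arp() {
    awk '
        /^[ \t]*(#|$)/ { next }
        NF >= 2 { print "set protocols static arp " $1 " hwaddr " tolower($2) }
    '
}

# Usage (hosts.txt is a hypothetical inventory file):
# gen_static_arp < hosts.txt
```

Review the generated commands before pasting them into configuration mode. A wrong static entry breaks connectivity just as reliably as a spoofed one.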

When to Use Static ARP

  • Critical servers (DNS, gateway)
  • Security-sensitive hosts
  • Environments with ARP spoofing risk
  • Fixed infrastructure (won’t change MAC)

Proxy ARP

With proxy ARP, the router answers ARP requests on behalf of hosts on other networks:

Terminal window
# Check if proxy ARP is enabled
cat /proc/sys/net/ipv4/conf/eth1/proxy_arp
# Enable proxy ARP on interface
configure
set interfaces ethernet eth1 ip enable-proxy-arp
commit
# Use case: When hosts on different subnets share same VLAN
# Router answers ARP for remote subnet, forwards traffic
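
Proxy ARP left on by accident is easy to miss when checking one interface at a time. A sketch that walks every interface's `proxy_arp` file; the function name and the optional directory argument (there to make it testable) are inventions for this example.

```shell
# Print the interfaces that have proxy ARP enabled.
# Optional argument: base directory (defaults to the live sysctl tree).
proxy_arp_enabled() {
    base="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$base"/*/proxy_arp; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "1" ]; then
            basename "$(dirname "$f")"
        fi
    done
}

# Usage on a live system:
# proxy_arp_enabled
```

On a real system the list can include the pseudo-interfaces `all` and `default`, which set behaviour for every interface and for newly created ones.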

Proxy ARP Risks

  • Breaks subnet boundaries
  • Can cause routing confusion
  • Security implications (answers for others)
  • Usually sign of network design problem

IPv6 Neighbor Discovery

IPv6 uses ICMPv6 Neighbor Discovery instead of ARP:

View Neighbor Table

Terminal window
# Show IPv6 neighbors
ip -6 neigh show
# Output:
# fe80::1 dev eth1 lladdr aa:bb:cc:dd:ee:ff REACHABLE
# 2001:db8::100 dev eth1 lladdr bb:cc:dd:ee:ff:00 STALE

Neighbor Discovery Types

NDP Message Types:
- Neighbor Solicitation (NS): "Who has this IPv6?"
- Neighbor Advertisement (NA): "I have this IPv6"
- Router Solicitation (RS): "Are there any routers?"
- Router Advertisement (RA): "I'm a router, here's the prefix"

Debug ND Issues

Terminal window
# Capture NDP traffic
sudo tcpdump -i eth1 icmp6
# Filter for specific types
sudo tcpdump -i eth1 'icmp6 and ip6[40] == 135' # Neighbor Solicitation
sudo tcpdump -i eth1 'icmp6 and ip6[40] == 136' # Neighbor Advertisement
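
The type numbers are easy to mix up. A small helper that builds the filter from the message name, as a sketch: types 134 to 136 are the ones used in this article, 133 for Router Solicitation comes from the ICMPv6 type assignments, and the `ip6[40]` offset assumes the packet carries no IPv6 extension headers. `ndp_filter` is a made-up name.

```shell
# Build a tcpdump filter for one NDP message type.
# Argument: RS, RA, NS or NA.
ndp_filter() {
    case "$1" in
        RS) type=133 ;;   # Router Solicitation
        RA) type=134 ;;   # Router Advertisement
        NS) type=135 ;;   # Neighbor Solicitation
        NA) type=136 ;;   # Neighbor Advertisement
        *)  echo "unknown NDP message: $1" >&2; return 1 ;;
    esac
    echo "icmp6 and ip6[40] == $type"
}

# Usage on a live system:
# sudo tcpdump -i eth1 "$(ndp_filter RA)"
```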

IPv6 ND Problems

Terminal window
# Problem: Duplicate Address Detection fails
# Host won't configure IPv6 address
# Check for duplicate (a Neighbor Advertisement answering the host's DAD probe means the address is already in use):
sudo tcpdump -i eth1 'icmp6 and ip6[40] == 136'
# Problem: No router advertisements
# Hosts can't find gateway
# Check RA on interface:
sudo tcpdump -i eth1 'icmp6 and ip6[40] == 134'
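# VyOS sends RAs only if configured (configuration mode; syntax varies by release,
# newer versions use: set service router-advert interface eth1 prefix 2001:db8::/64)
set interfaces ethernet eth1 ipv6 router-advert prefix 2001:db8::/64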

Duplicate IP Detection

Detecting Duplicates

Terminal window
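# Check whether anyone already holds an IP (DAD mode: probes without claiming the address)
arping -D -c 3 -I eth1 192.168.1.100
# With iputils arping: exit status 0 = no reply (address free), non-zero = some host answered
# To find a duplicate, send normal requests and look for replies from more than one MAC
arping -c 3 -I eth1 192.168.1.100
# Or capture ARP and look for different MAC sources
sudo tcpdump -n -e -i eth1 arp and host 192.168.1.100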
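
Spotting two MACs for one IP in a scrolling capture is tedious. The sketch below does it mechanically. It assumes tcpdump's `<ip> is-at <mac>` reply format, and `find_duplicate_ips` is a name invented here.

```shell
# Report IPs that were claimed by more than one MAC.
# Reads tcpdump ARP output on stdin.
find_duplicate_ips() {
    awk '
        {
            for (i = 2; i < NF; i++) {
                if ($i != "is-at") continue
                ip = $(i - 1)
                mac = tolower($(i + 1))
                sub(/,$/, "", mac)              # drop trailing comma
                if ((ip, mac) in seen) continue # same MAC again, ignore
                seen[ip, mac] = 1
                if (ip in macs) macs[ip] = macs[ip] " " mac
                else            macs[ip] = mac
                count[ip]++
            }
        }
        END {
            for (ip in count)
                if (count[ip] > 1) print ip ": " macs[ip]
        }
    ' | sort
}

# Usage on a live system (stops after 200 ARP packets):
# sudo tcpdump -n -c 200 -i eth1 arp 2>/dev/null | find_duplicate_ips
```

An IP that legitimately moved during the capture (failover, VRRP) shows up here too, so treat a hit as a lead, not proof.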

Gratuitous ARP

Terminal window
# Send gratuitous ARP (announce IP)
arping -A -I eth1 192.168.1.1 -c 1
# Use after IP address change or failover
# Updates ARP caches of hosts on the same broadcast domain

Common Scenarios

Scenario 1: New Server Not Reachable

Terminal window
# Server configured, can't reach from router
ping 192.168.1.100
# PING 192.168.1.100: 56 data bytes
# (no response)
show arp | grep 192.168.1.100
# 192.168.1.100 (incomplete)
# ARP not resolving:
# - Server on wrong VLAN?
# - Server IP configured wrong?
# - Server interface down?
# From server side:
# ip addr show
# Check if IP is on correct interface

Scenario 2: Traffic Goes to Wrong Host

Terminal window
# Application connecting to wrong server
show arp | grep 10.0.0.50
# 10.0.0.50 aa:bb:cc:dd:ee:ff C eth1
# But expected MAC was bb:cc:dd:ee:ff:00
# Duplicate IP! Two hosts have 10.0.0.50
# Solution:
# 1. Find both hosts
# 2. Remove duplicate IP from wrong host
# 3. Flush ARP
ip neigh flush 10.0.0.50

Scenario 3: Connectivity Works Then Fails

Terminal window
# Works initially, fails after some time
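# Check how long an entry stays REACHABLE
cat /proc/sys/net/ipv4/neigh/default/base_reachable_time_ms
# 30000 (30 seconds base; the kernel randomizes the actual value between 0.5x and 1.5x)
# Entry then goes STALE and is re-verified on next use
# If re-verification fails → FAILED, connectivity lost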
# Debug:
watch -n 1 'ip neigh show 192.168.1.100'
# Watch state transition
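
`watch` overwrites the screen, so the moment of the transition is easy to miss. A minimal sketch that prints a line only when the state changes, assuming the state is the last field of `ip neigh show` output; `state_changes` is a made-up name.

```shell
# Pass through only the lines whose last field (the neighbor state)
# differs from the previous line's. Reads polled `ip neigh show` output.
state_changes() {
    awk 'NF && $NF != prev { print; prev = $NF }'
}

# Usage on a live system (poll once per second, Ctrl-C to stop):
# while sleep 1; do ip neigh show 192.168.1.100; done | state_changes
```

A REACHABLE → STALE → DELAY → PROBE → FAILED sequence means the host stopped answering. A sequence that returns to REACHABLE is normal cache refresh.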

Scenario 4: After Failover, Old IP Unreachable

Terminal window
# Failover happened, but clients still sending to old MAC
# Need gratuitous ARP from new server:
arping -A -I eth1 192.168.1.100 -c 3
# Or clear ARP cache on clients/routers:
ip neigh flush 192.168.1.100

Monitoring ARP

Watch ARP Table

Terminal window
# Continuous monitoring
watch -n 2 'ip neigh show dev eth1'

Log ARP Changes

Terminal window
# Linux doesn't log ARP by default
# Use arpwatch for monitoring:
apt install arpwatch
arpwatch -i eth1 -f /var/lib/arpwatch/eth1.dat
# Logs to syslog:
# new station 192.168.1.100 aa:bb:cc:dd:ee:ff
# changed ethernet address 192.168.1.100 old:mac new:mac

Best Practices

1. Static ARP for Critical Infrastructure

Terminal window
# Gateway, DNS, critical servers
set protocols static arp 192.168.1.1 hwaddr aa:bb:cc:dd:ee:ff

2. Monitor for Duplicates

Terminal window
# Regular scan: probe every address; replies from two different MACs for one IP mean a duplicate
for ip in $(seq 1 254); do
    arping -c 2 -I eth1 192.168.1.$ip 2>/dev/null
done

3. Clear ARP During Troubleshooting

Terminal window
# When changing IPs or after failover
ip neigh flush dev eth1

4. Check ARP First

Terminal window
# Before deep Layer 3 debugging
show arp | grep <problem-ip>

The Lesson

Layer 2 problems look like Layer 3 failures. Always check ARP.

When ping fails:

  1. Is there an ARP entry?
  2. Is it complete or incomplete?
  3. Is the MAC address correct?
  4. Is the entry REACHABLE, or is it FAILED (or STALE with an outdated MAC)?
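
The checklist can be sketched as one function that reads `ip neigh show` output and answers the four questions in order. The name `arp_triage`, its arguments, and the wording of the verdicts are all assumptions for this example.

```shell
# Walk the checklist for one IP.
# Arguments: IP, optional expected MAC. Reads `ip neigh show` output on stdin.
arp_triage() {
    awk -v ip="$1" -v want="${2:-}" '
        $1 == ip {
            found = 1
            state = $NF
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") mac = tolower($(i + 1))
            exit                      # first matching entry is enough
        }
        END {
            want = tolower(want)
            if (!found)                         print "no entry"
            else if (mac == "")                 print "unresolved (" state ")"
            else if (want != "" && mac != want) print "wrong MAC " mac
            else                                print "ok (" state ")"
        }
    '
}

# Usage on a live system:
# ip neigh show | arp_triage 192.168.1.100 aa:bb:cc:dd:ee:ff
```

Anything other than "ok" means the problem is below Layer 3, and routing tables can wait.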

Layer 2 issues cause:

  • Intermittent connectivity (stale entries)
  • Wrong destination (wrong MAC)
  • Complete failure (no entry)
  • Slow performance (ARP delays)

ARP is simple but foundational. When it breaks, nothing above it works. Check it first, not last.