Simulating Link Failure and Latency with tc and netem

A lab on a switch fabric or a bunch of VMs on one host has a dirty secret: every link is perfect. Zero latency, zero loss, infinite reliability. So your routing converges instantly, your TCP tuning looks flawless, and your failover is crisp — none of which tells you how any of it behaves across an actual WAN with 80ms of latency, a bit of jitter, and the occasional dropped packet. The gap between “works in the lab” and “works in production” is largely the gap between perfect links and real ones.

Linux closes that gap for free. `tc` (traffic control) with the `netem` (network emulation) queueing discipline lets you impose latency, jitter, loss, reordering, and corruption on any interface, and combine it with rate limiting to model a bandwidth-constrained circuit. If your lab devices are Linux, or hang off Linux interfaces on the host (containers, VMs, network namespaces, CSR/cEOS/FRR nodes), you can make their links lie convincingly.

## The mental model: qdiscs on an egress interface

`tc` attaches a *queueing discipline* (qdisc) to an interface, and netem is a qdisc that delays, drops, or mangles packets as they leave. Two things to keep in front of you.

First, netem applies to **egress** — traffic leaving the interface. To emulate a symmetric link you apply it on both ends, or on both interfaces of a middle “impairment” node that sits between the two devices you care about.

Second, every command below needs root (hence the `sudo`), changes take effect immediately, and they vanish on reboot (or when you delete them), which makes this safe to experiment with. The reset command is the most important one to memorise:

```text
# Remove all emulation from an interface, back to a perfect link
sudo tc qdisc del dev eth0 root
```

If anything gets weird, that line puts you back to normal.

## Adding latency

The simplest useful impairment is fixed delay. Add 80ms of one-way latency to `eth0`:

```text
sudo tc qdisc add dev eth0 root netem delay 80ms
```

Confirm it took:

```text
$ tc qdisc show dev eth0
qdisc netem 8001: root refcnt 2 limit 1000 delay 80.0ms
```

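And feel it. netem only delays egress, so this one command adds 80ms to the round trip on its own. The output below assumes the same delay has also been applied on the far end, so a round trip crosses 80ms twice and shows up as ~160ms RTT: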

```text
$ ping -c 3 10.0.0.2
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=160 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=160 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=160 ms
```
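As a quick sanity check on what ping should report, the expected RTT is roughly the sum of the netem delay applied in each direction plus whatever the unimpaired path already costs. A minimal sketch, ignoring jitter and queueing; the function name `expected_rtt_ms` is made up for illustration:

```shell
# expected_rtt_ms FORWARD_MS REVERSE_MS [BASE_RTT_MS]
# Hypothetical helper: sums the one-way netem delays and an optional base RTT.
expected_rtt_ms() {
  echo $(( $1 + $2 + ${3:-0} ))
}

expected_rtt_ms 80 0    # netem on this end only
expected_rtt_ms 80 80   # same delay applied on both ends
```

The first call prints 80 and the second 160, which is a handy reminder to check both ends when a measured RTT is half of what you expected.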

Real circuits aren’t perfectly steady, so add jitter — a variation around the mean. Here, 80ms ± 10ms:

```text
sudo tc qdisc change dev eth0 root netem delay 80ms 10ms
```

By default that jitter is uniform, which is unrealistic. Real latency clusters around the mean with occasional outliers, so pull it toward a normal distribution and add correlation so consecutive packets resemble each other (as they do on a real link):

```text
sudo tc qdisc change dev eth0 root netem delay 80ms 10ms 25% distribution normal
```

That `25%` is correlation between successive delays — each packet’s delay is partly a function of the last one’s, which is how real queues actually behave.
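To check that jitter is actually being applied, it can help to compare the spread ping reports before and after the change. A small sketch that pulls the average and mean deviation out of the summary line printed by Linux `ping` (iputils); the function name `rtt_summary` and the sample line are invented for illustration and are not measurements:

```shell
# rtt_summary: read ping output on stdin, print the average RTT and mdev.
# Assumes the iputils summary format "rtt min/avg/max/mdev = a/b/c/d ms".
rtt_summary() {
  awk -F' = ' '/^rtt / {
    split($2, v, "/")        # v[1]=min v[2]=avg v[3]=max v[4]="mdev ms"
    sub(/ ms$/, "", v[4])    # strip the trailing unit
    printf "avg=%s mdev=%s\n", v[2], v[4]
  }'
}

# Made-up sample line, only to show the parsing:
echo "rtt min/avg/max/mdev = 71.204/80.512/93.110/5.873 ms" | rtt_summary

# Real use (needs a reachable peer):
#   ping -c 50 -q 10.0.0.2 | rtt_summary
```

With fixed delay the `mdev` figure should sit near zero; once jitter is configured it should grow noticeably.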

## Adding loss, reordering, and corruption

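Packet loss is where lab assumptions go to die. Add 1% random loss (note that `change` replaces the whole netem parameter set rather than merging into it, so repeat the `delay` arguments if you want to keep the latency):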

```text
sudo tc qdisc change dev eth0 root netem loss 1%
```

Uniform random loss is still a simplification — real loss comes in bursts. netem’s Gilbert-Elliott model captures that with good/bad state probabilities, so you can emulate a link that’s mostly clean but occasionally drops several packets in a row:

```text
# loss gemodel p r: p = chance of entering the "bad" (lossy) state, r = chance of leaving it
sudo tc qdisc change dev eth0 root netem loss gemodel 1% 10%
```
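It is worth working out what a pair of transition probabilities implies before trusting it. For a two-state chain, the long-run fraction of time spent in the bad state is p / (p + r), and the mean stay in the bad state is 1 / r packets. The sketch below does that arithmetic under a simplifying assumption that is mine, not netem's documented default: every packet in the bad state is lost and none in the good state. The function name `ge_stats` is hypothetical.

```shell
# ge_stats P_PCT R_PCT
# P_PCT: per-packet chance of moving good -> bad; R_PCT: bad -> good.
# Assumes 100% loss in the bad state and 0% in the good state.
ge_stats() {
  awk -v p="$1" -v r="$2" 'BEGIN {
    printf "avg_loss_pct=%.2f mean_burst_pkts=%.1f\n", 100 * p / (p + r), 100 / r
  }'
}

ge_stats 1 10
```

Under those assumptions, 1% and 10% work out to roughly 9% average loss in bursts averaging ten packets. If you want a link that is clean most of the time, a much smaller `p` is probably closer to the mark.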

You can stack impairments in one qdisc — a realistic mediocre WAN in a single line:

```text
# Use "replace" instead of "add" if a root qdisc is already attached
sudo tc qdisc add dev eth0 root netem \
    delay 80ms 10ms distribution normal \
    loss 0.5% \
    reorder 2% 50% \
    corrupt 0.1%
```

That reads as: 80ms ± 10ms latency, half a percent loss, 2% of packets reordered (with 50% correlation), and one in a thousand corrupted. It’s a nasty little link, and it will surface bugs a perfect link never would.

## Constraining bandwidth

netem does delay and loss; to cap throughput you pair it with a rate-limiting qdisc like `tbf` (token bucket filter). The clean way is to chain them — netem for impairment, tbf for the speed limit — modelling, say, a 10 Mbit/s WAN with 50ms latency:

```text
# Root: token bucket filter caps the rate to 10 Mbit
sudo tc qdisc add dev eth0 root handle 1: tbf \
    rate 10mbit burst 32kbit latency 400ms

# Child: netem adds the latency underneath the rate cap
sudo tc qdisc add dev eth0 parent 1:1 handle 10: netem delay 50ms
```

Now a large transfer across this interface behaves like it’s crossing a modest circuit — throughput is bounded, latency is real, and TCP has to actually work for its window. This is where you discover that your file-sync job or your database replication has assumptions baked in that only hold on a LAN.
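The reason TCP has to work for its window is the bandwidth-delay product: to keep a link full, a sender needs rate × RTT bytes in flight, and a sender stuck with a fixed window tops out at window / RTT. A rough sketch of both calculations in integer shell arithmetic; the function names are invented for illustration, and the 50 ms figure assumes the netem delay is the only meaningful contributor to RTT:

```shell
# bdp_bytes RATE_MBIT RTT_MS: bytes that must be in flight to fill the pipe.
bdp_bytes() {
  echo $(( $1 * 1000000 * $2 / 1000 / 8 ))
}

# window_cap_kbit WINDOW_BYTES RTT_MS: throughput ceiling for a fixed window.
# bits per millisecond is numerically the same as kbit/s.
window_cap_kbit() {
  echo $(( $1 * 8 / $2 ))
}

bdp_bytes 10 50             # 10 Mbit/s circuit, 50 ms RTT
window_cap_kbit 65536 200   # 64 KiB window, 200 ms RTT
```

The first call prints 62500 (bytes), and the second prints 2621 (kbit/s), which is why a transfer limited to a 64 KiB window crawls on a long path no matter how fast the circuit is.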

## Building an impairment node between two devices

The tidiest lab pattern is a dedicated “WAN” node in the middle. Two of your real devices connect to a Linux box (or namespace) that does nothing but impair traffic, so neither device under test needs any special config — the ugliness lives in one place:

```text
[ router-A ] eth1 --- eth1 [ impair ] eth2 --- eth1 [ router-B ]
```

Apply asymmetric conditions to model a real path where the two directions differ:

```text
# On the impair node. A -> B traffic leaves via eth2: satellite-ish, high latency
sudo tc qdisc add dev eth2 root netem delay 300ms 20ms loss 0.5%

# B -> A traffic leaves via eth1: cleaner, lower latency
sudo tc qdisc add dev eth1 root netem delay 40ms 5ms
```

Because impairment is centralised, you can twist the knobs mid-test without logging into the devices — which is exactly what you want when you’re watching how something reacts to a link degrading in real time.

## What this actually lets you test

The point of all this isn’t the commands, it’s the behaviours they expose. A few that reliably differ between a perfect lab and an impaired one.

Routing convergence looks very different when hellos and updates can be delayed and dropped. Crank loss up on a link and watch how long your IGP or BGP takes to notice, and whether your timers are tuned for the real world or for a lab that never drops anything. TCP-dependent applications reveal their true throughput once real latency forces the window to grow — a backup that saturates a LAN can crawl across 200ms of RTT, and better to learn that here. Failover timing becomes honest: bring the impairment node to full loss to simulate a hard cut, or ramp loss gradually to simulate a link that’s dying rather than dead, which is often the messier failure to handle.

```text
# Simulate a hard link failure without touching cabling
sudo tc qdisc change dev eth0 root netem loss 100%

# …observe reconvergence, then restore the link
sudo tc qdisc del dev eth0 root
```

That pair — 100% loss to cut, `del` to restore — is a cleaner, more repeatable failure test than yanking a virtual cable, because you control it precisely and it comes back identically every time.
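The gradual version, a link that is dying rather than dead, can be scripted the same way. The sketch below steps loss upward at a fixed interval. It is one possible shape, not a canonical tool: `ramp_loss`, `run`, and the `DRY_RUN` switch are names invented here, and by default it only prints the commands so you can inspect them before letting it touch an interface. It uses `tc qdisc replace` so it works whether or not a root qdisc is already attached.

```shell
#!/usr/bin/env bash
# Stepped loss ramp. DRY_RUN=1 (the default) only echoes the commands.
IF=${IF:-eth0}
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it as root.
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$*"; else sudo "$@"; fi
}

# ramp_loss STEP_PCT PAUSE_SECONDS: raise loss in STEP_PCT increments up to 100%.
# Pick a step that divides 100 if the ramp should end in a full cut.
ramp_loss() {
  local step=$1 pause=$2 pct
  for pct in $(seq "$step" "$step" 100); do
    run tc qdisc replace dev "$IF" root netem loss "${pct}%"
    [ "$DRY_RUN" = 1 ] || sleep "$pause"
  done
}

ramp_loss 25 10
```

Run as-is, this prints four `tc` commands (25%, 50%, 75%, 100%). Setting `DRY_RUN=0` applies them ten seconds apart, which leaves time to watch how timers and failover react at each stage.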

## Keep it repeatable

The last habit worth forming: put your impairment profiles in scripts, not your shell history. A named function per scenario makes tests reproducible and lets you switch conditions in a keystroke:

```text
#!/usr/bin/env bash
IF=eth0
reset() { sudo tc qdisc del dev "$IF" root 2>/dev/null; }
wan() { reset; sudo tc qdisc add dev "$IF" root netem delay 80ms 10ms loss 0.5%; }
satlink() { reset; sudo tc qdisc add dev "$IF" root netem delay 300ms 30ms loss 1%; }
cut() { reset; sudo tc qdisc add dev "$IF" root netem loss 100%; }

"$@" # call as: ./impair.sh wan | ./impair.sh cut | ./impair.sh reset
```

Now “make this link behave like a satellite hop” is `./impair.sh satlink`, and the whole team runs the same test the same way.

Perfect links are the most misleading thing about a lab. A few lines of `tc` and `netem` turn that lab into something that argues with you the way production will — and it’s far better to have that argument on a Tuesday afternoon than during a real cutover.
