
When Network Latency Becomes a Trading Strategy

Peter Bieda

“The fastest code isn’t always the smartest. Sometimes, anticipating where latency will hit gives you more edge than reducing it.”

In high-frequency trading, we love to obsess over nanoseconds. Engineers burn weeks shaving microseconds off garbage collectors, syscall paths, cache misses, and messaging loops. We argue about kernel bypass, lock-free rings, NUMA pinning, and C++ memory ordering like it’s some kind of religious debate.

But here’s a hard truth that took me years to actually internalize:

Beating latency is good.
Predicting where latency will hit is even better.

There are entire strategies—not just optimizations—that come from understanding how packets move, where queues form, and how the exchange microstructure responds under load. In other words:

Network latency isn’t just a constraint.
It’s an alpha source.

This article is about how I learned that lesson, the experiments I ran, and why most trading shops underestimate the strategic dimension of latency.

1. Latency Isn’t a Constant—It’s a Distribution

When you first profile a system, you look at average latency:

  • “6 microseconds ping time.”
  • “12 microseconds tick-to-trade.”
  • “3 microseconds serialization.”

Those numbers are meaningless by themselves.

What matters is the distribution—especially the tail. Trading systems don’t operate at the mean; they operate at the worst 0.1% moments when:

  • the packet queue backs up,
  • the NIC is under interrupt flood,
  • an exchange gateway spikes load,
  • multiple instruments cross signals simultaneously.
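
To make the point concrete, here's a throwaway sketch. The numbers are invented for intuition, not measurements from any real system: packets that mostly arrive around 6µs, with 0.5% landing in a 40–80µs spike regime.

import numpy as np

# Invented numbers for intuition: ~6us typical latency, with 0.5% of packets
# landing in a 40-80us spike regime.
rng = np.random.default_rng(0)
normal = rng.normal(6.0, 0.5, size=199_000)
spikes = rng.uniform(40.0, 80.0, size=1_000)
lat = np.concatenate([normal, spikes])

print("mean  :", round(lat.mean(), 2), "us")
print("p50   :", round(np.percentile(lat, 50), 2), "us")
print("p99   :", round(np.percentile(lat, 99), 2), "us")
print("p99.9 :", round(np.percentile(lat, 99.9), 2), "us")

The mean barely notices the spikes; the 99.9th percentile sits an order of magnitude above it, and that tail is exactly where the trades you care about happen.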

The mistake most junior engineers make (myself included, years ago) is thinking:

“If I reduce mean latency by 10%, I’ll be faster.”

Not necessarily.
Sometimes you gain more alpha by predicting latency regimes and adjusting behavior accordingly.

For example:

  • If you know the exchange is entering a high-load auction period, you might stop quoting aggressively because your tail risk widens.
  • If you know your own gateway is peaking, you might reduce order cancellation spam to avoid queue buildup.
  • If you know a peer venue is lagging by 400–600µs due to a known crossing-session spike, you might lean your quote in anticipation.

Fast code is one thing.
Latency-aware behavior is another.

2. Example: When I Stopped Chasing “Faster” and Started Chasing “Predictable”

One of my biggest breakthroughs happened while profiling a UDP multicast feed handler. We were already doing the usual:

  • kernel bypass (DPDK)
  • huge pages
  • pinned CPU cores
  • lock-free queues
  • NIC RSS tuning
  • jumbo frames off
  • L1/L2 warm-path specialization

We shaved everything we could. We got from 9µs median processing time to under 5µs.

But the slow path still had occasional 40–50µs spikes.

Those spikes didn’t come from our code—they came from:

  • interrupt coalescing behavior on the NIC,
  • bursts of exchange traffic during opening/closing rotations,
  • a misaligned memory layout in a vendor feed,
  • cache line evictions triggered by a secondary telemetry thread.

Even after fixing what we could, the spikes remained.

Then I realized something obvious:

If the slow paths are unavoidable, build a strategy around them.

So instead of chasing another microsecond, I did something else:

I classified incoming market conditions into latency states.

  1. Normal regime
    • feed latency stable
    • book updates smooth
    • cancellations quick
      → Quote normally.
  2. High-burst regime
    • packet clusters detected
    • NIC queue depth growing
    • exchange sequence gaps widening
      → Reduce quote aggressiveness and widen.
  3. Degraded regime
    • tail latency spikes
    • out-of-order packets
    • CPU telemetry showing L3 thrashing
      → Cancel all microstructure-sensitive orders.
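
In production this classification lived inside the feed handler, but the logic is simple enough to sketch in Python. The telemetry fields and thresholds below are placeholders made up for illustration, not our actual values:

NORMAL, HIGH_BURST, DEGRADED = "normal", "high_burst", "degraded"

def classify_regime(telemetry):
    # telemetry: rolling stats sampled every few milliseconds; the field names
    # are hypothetical, so substitute whatever your feed handler exposes.
    p999_us = telemetry["feed_latency_p999_us"]       # tail latency, microseconds
    nic_depth = telemetry["nic_queue_depth"]          # packets waiting on the NIC
    seq_gaps = telemetry["seq_gaps_per_sec"]          # exchange sequence gaps
    out_of_order = telemetry["out_of_order_per_sec"]  # out-of-order packets

    if p999_us > 40 or out_of_order > 0:
        return DEGRADED      # cancel microstructure-sensitive orders
    if nic_depth > 64 or seq_gaps > 2:
        return HIGH_BURST    # widen quotes, reduce aggressiveness
    return NORMAL            # quote normally

The quoting logic then keys off the returned regime instead of any single latency number.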

This regime classification produced more PnL impact than the micro-optimizations.

Why?
Because when the exchange is under stress, adverse selection risk spikes. Being “fast” doesn’t matter when everyone is slow.

Being right matters.

3. A Simple Simulation to Show Why Latency Regimes Matter

To illustrate this, I wrote a tiny Python simulation (not production, but great for intuition).

The idea:
Two market-makers, both identical except one reacts to latency regimes.

Simplified code example:

import numpy as np

def simulate(latency_mode):
    mid = 100.0
    pnl = 0

    for t in range(20000):
        price_move = np.random.randn() * 0.01
        mid += price_move

        # latency regime simulation
        if latency_mode == "adaptive":
            if abs(price_move) > 0.02:   # bursty regime
                spread = 0.05
            else:
                spread = 0.01
        else:
            spread = 0.01  # naive always-tight quoting

        # fills & slippage simulate adverse selection
        if np.random.rand() < 0.5:
            pnl += spread  # earn spread
        else:
            pnl -= abs(price_move)  # get hit on toxic flow

    return pnl

naive = simulate("naive")
adaptive = simulate("adaptive")

print("Naive:", naive)
print("Adaptive:", adaptive)

Run it a few times: the naive quoter gives back its spread during volatility bursts, while the latency-aware quoter performs significantly better because it widens exactly when the moves are large.

This tiny example mirrors real microstructure:
When markets move fast, tail latency increases, and adverse selection becomes dominant. Anticipation beats speed.

4. Latency Is Geometry, Not Just Speed

One of my favorite realizations is that latency has shape.

In cross-venue arbitrage, your latency triangle might look like this:

You → CME → You
You → NYSE → You
CME → NYSE
NYSE → CME

Each segment behaves differently during:

  • auctions
  • imbalances
  • replay events
  • quote bursts
  • batch processing cycles

If you know:

  • CME → NYSE latency widens during high imbalance,
  • and your quoting logic links both venues,

you can pre-adjust quotes on the faster venue before the slower one catches up.
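
One minimal way to detect that widening is to track a smoothed baseline per directed segment and flag samples that blow past it. A sketch, with illustrative constants rather than tuned ones:

class SegmentLatencyMonitor:
    """Tracks an EWMA latency baseline per directed segment (e.g. "CME->NYSE")
    and flags samples that widen well beyond it. Alpha and the widen ratio
    are illustrative, not tuned values."""

    def __init__(self, alpha=0.05, widen_ratio=1.5):
        self.alpha = alpha
        self.widen_ratio = widen_ratio
        self.baseline_us = {}

    def update(self, segment, sample_us):
        base = self.baseline_us.get(segment, sample_us)
        base = (1 - self.alpha) * base + self.alpha * sample_us
        self.baseline_us[segment] = base
        return sample_us > self.widen_ratio * base   # True -> segment is widening

monitor = SegmentLatencyMonitor()
if monitor.update("CME->NYSE", sample_us=950.0):
    pass  # lean quotes on the faster venue before the slower one catches up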

This is real alpha.
Not from being faster, but from knowing when other participants become slower.

5. When I Used Latency Geometry to Improve a Spread-Cross Strategy

We had a simple cross-venue strategy:

If price_A < price_B - spread → arbitrage

Classic stuff. Already microsecond optimized.

But the strategy had one weakness:
It sometimes fired late when venue B lagged.

Instead of optimizing serialization further, I profiled latency patterns between venues over days.

I found a recurring pattern:

  • Between 9:29:30 and 9:30:00 ET,
    venue B’s outbound feed consistently lagged by 300–500µs
    due to pre-open auction buildup.

So instead of treating both venues equally, I added:

if in_preopen_period:
    lean_price_on_A_by_expected_latency_slippage()
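
The real version lived in the C++ quoting path; here's a hedged Python sketch of the same idea, with lean_price_on_A_by_expected_latency_slippage replaced by an explicit calculation and placeholder numbers:

from datetime import time

# The window matches the observed pattern; the 400us lag and the drift input
# are placeholders, not the calibrated values we actually used.
PREOPEN_START = time(9, 29, 30)
PREOPEN_END = time(9, 30, 0)

def lean_on_A(fair_value, expected_drift_per_us, now, b_lag_us=400):
    """Shift the venue-A reference price by the drift expected to accumulate
    while venue B's feed is stale."""
    if PREOPEN_START <= now.time() <= PREOPEN_END:
        return fair_value + expected_drift_per_us * b_lag_us
    return fair_value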

Result?

  • Fewer false signals
  • Better cross-venue timing
  • Reduced inventory excursions
  • Higher Sharpe

All from simply anticipating latency distortion.

6. When the Smartest Move Is Not to Reduce Latency

There are cases where reducing latency actually makes things worse.

Example: too-fast cancel loops

If your cancellation speed is extremely high while competitors are slower, you may:

  • overcrowd your own queue,
  • trigger exchange throttles,
  • lose queue position,
  • or fall into retry loops.

Sometimes adding a rate limiter creates more stability.
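
A minimal token-bucket limiter on cancels captures the idea. The rate and burst numbers here are arbitrary placeholders; real limits should come from the exchange's throttle rules:

import time

class CancelRateLimiter:
    """Token bucket: allow short cancel bursts but cap the sustained rate."""

    def __init__(self, rate_per_sec=500, burst=50):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow_cancel(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False   # hold the cancel instead of spamming the gateway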

Example: too-fast consumption of incomplete book data

If you process book updates faster than the exchange snapshot recovery rate, you may build:

  • inconsistent books,
  • temporary negative depth,
  • false imbalances.

Sometimes adding microdelays (10–30µs) produces cleaner book state.
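
One way to get that effect is a small coalescing buffer in front of the book builder: hold each update briefly, then release in sequence order. A sketch, using 20µs as an example value from that range:

import heapq
import time

class CoalescingBuffer:
    """Hold incoming book updates for a short window and release them in
    sequence order, so a momentarily out-of-order burst doesn't build an
    inconsistent book. The hold time is an illustrative value."""

    def __init__(self, hold_us=20):
        self.hold_s = hold_us / 1_000_000
        self.heap = []   # (sequence_number, arrival_time, update)

    def push(self, seq, update):
        heapq.heappush(self.heap, (seq, time.monotonic(), update))

    def pop_ready(self):
        # Release updates whose hold window has expired, lowest sequence first.
        ready = []
        now = time.monotonic()
        while self.heap and now - self.heap[0][1] >= self.hold_s:
            ready.append(heapq.heappop(self.heap)[2])
        return ready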

Example: overreacting to microbursts

If you react to every tick instantly, you may be “the fastest loser” during momentum runs.

Sometimes waiting 30–50µs to see if a burst stabilizes is more profitable.
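
The same idea in miniature: sample, wait a beat, and only act if the move has stopped running. read_price below is a stand-in for however your engine reads the top of book, and 40µs is just an example value from that range:

import time

def burst_has_stabilized(read_price, settle_us=40, tol=0.0):
    """Sample twice, ~40us apart, and report whether the move kept running.
    A real engine would spin-wait on a timestamp counter rather than sleep."""
    first = read_price()
    time.sleep(settle_us / 1_000_000)
    second = read_price()
    return abs(second - first) <= tol   # True -> safe to react to the original tick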

Fast ≠ smart.
Fast + selective = alpha.

7. Latency as a First-Class Citizen in Strategy Design

At some point, I stopped thinking of latency as “performance engineering” and started treating it like:

  • a signal,
  • a feature,
  • a risk regime,
  • an adversarial environment,
  • a state variable,
  • a competitive weapon.

This mindset shift helped me design strategies that:

  • reduce exposure during gateway congestion,
  • stop quoting during feed desync,
  • lean quotes based on expected cross-venue propagation,
  • correct for predictable midday network jitter,
  • hold inventory differently during imbalance-driven latency spikes.

Latency is part of the market, not outside it.

Ignoring it leaves alpha on the table.

8. Latency as a Decision Input: One Diagram

Here's how the whole pipeline fits together:

        ┌──────────────────────────────────────┐
        │        Market Data Enters            │
        │        (Variable Latency)            │
        └──────────────────┬───────────────────┘
                           │
                           ▼
        ┌──────────────────────────────────────┐
        │   Latency Classification Layer       │
        │   - Normal                           │
        │   - High Burst                       │
        │   - Degraded                         │
        └──────────────────┬───────────────────┘
                           │
                           ▼
        ┌──────────────────────────────────────┐
        │   Strategy Behavior                  │
        │   - Tighten/Widen Quotes             │
        │   - Cancel Orders                    │
        │   - Lean Cross-Venue                 │
        └──────────────────┬───────────────────┘
                           │
                           ▼
        ┌──────────────────────────────────────┐
        │   Execution Engine (C++/FPGA)        │
        │   Applies Behavior at Wire Speed     │
        └──────────────────────────────────────┘

It shows latency not as an optimization problem, but as a decision input.

9. Conclusion: The New Philosophy of Latency

If you want to compete in modern electronic markets, you need both:

  1. Fast code, yes.
  2. But also: latency-aware strategy.

The shops winning today aren’t merely the ones shaving nanoseconds. They’re the ones:

  • predicting exchange load,
  • modeling gateway congestion,
  • detecting regime changes mid-session,
  • timing order placement based on expected propagation delays,
  • anticipating competitor reaction time under stress,
  • leaning books based on cross-venue microbursts.

Latency isn’t just a technical constraint.

It’s part of microstructure.
It’s part of the edge.
It’s part of the strategy.

And once you start designing with that mindset, your entire architecture changes—from your feed handler, to your quoting model, to your risk engine.

Sometimes the fastest system wins.
But more often:

The system that understands latency wins.