English· Español· Deutsch· Nederlands· Français· 日本語· ქართული· 繁體中文· 简体中文· Português· Русский· العربية· हिन्दी· Italiano· 한국어· Polski· Svenska· Türkçe· Українська· Tiếng Việt· Bahasa Indonesia

un

guest
1 / ?
back to lessons

Each PoP Owns a Cell of the Plane

Voronoi Tessellation

Place N points (PoPs: Points of Presence) on a plane. Draw region boundaries such that every point in the plane is assigned to the nearest PoP. The result is a Voronoi diagram: the plane partitions into N cells, one per PoP, each containing all points closer to its PoP than to any other.

CDN geometry: every user's request routes to the nearest PoP. Each PoP serves a cell of the geographic plane. The cell boundaries are the perpendicular bisectors of the lines between neighboring PoPs.

Geometric reading:

- Adding a PoP shrinks the cells of its neighbors (& creates a new cell)

- Removing a PoP forces its cell to redistribute to neighbors (capacity surge at neighbors)

- A user near a cell boundary may flip between PoPs as load balancing shifts

- A failure of one PoP enlarges every neighbor's effective cell during the outage

Operational consequence: when a PoP fails, its load doesn't disappear; it migrates to neighboring PoPs. If neighbors are sized only for their normal cell, the surge breaks them next (cascading PoP failure). Mature CDN providers size each PoP with neighbor-failure surge in mind.

Voronoi cells around PoPs; one PoP failure enlarges neighbor cells

A CDN runs 4 PoPs serving the continental US: West Coast, Mountain, Midwest, East Coast. Each is sized for ~25% of total US traffic. The Mountain PoP fails. Predict: (1) which neighbor PoPs absorb the redistributed load, & in roughly what proportion, & (2) what happens if those neighbors were sized exactly for 25% each with no headroom.

The Triangle Inequality You Cannot Cheat

Physics Sets the Floor

Light travels at ~300,000 km/s in vacuum. In fiber, about 200,000 km/s due to refractive index. That means:

- ~1,000 km of fiber = ~5 ms one-way = ~10 ms round-trip

- Coast-to-coast US (~5,000 km) = ~50 ms RTT minimum

- US to Europe (~8,000 km) = ~80 ms RTT minimum

- Antipodal (halfway around world) = ~200 ms RTT minimum

This is a floor. Real RTT is always larger (router hops, switching, queueing, congestion). No application can go faster than physics allows.

Triangle Inequality

For three nodes A, B, C, the triangle inequality says d(A,C) <= d(A,B) + d(B,C): a direct path is shorter than (or equal to) any indirect path.

Network reading: if your service routes A -> B -> C instead of A -> C directly, the latency is at least the sum of two leg latencies. Often more due to processing at B.

Architectural reading: every indirection (proxy, load balancer, CDN hop) adds at least one round-trip leg to the user's perceived latency. CDN benefit comes from making the user's leg shorter (PoP closer than origin), even though the total hop count rises.

Multi-region traps: a service that reads from region A but writes to region B incurs A-to-B latency on every write. If A & B are 100 ms apart, every write takes >= 100 ms minimum. Stretched databases pay this floor every time.

Latency triangle: A-B-C floor set by physical distance

Pay the Floor

A service runs in two regions: US-East (us-east-1) & EU-West (eu-west-1). The two regions are roughly 5,500 km apart. The service has a primary database in US-East. EU users have their requests served by EU-West backends, but every write requires a call back to the US-East primary.

Compute the latency floor for an EU user write (round-trip from their browser to EU-West backend to US-East primary & back). Compare to an EU user read served entirely from EU-West cached state. Then propose one architectural change that reduces the write latency floor for EU users.

Geographic Capacity Design

Synthesis

You can now read Voronoi cells as PoP catchments, compute speed-of-light latency floors, & apply the queueing curve at the proxy tier.

Apply all three.

A team plans CDN coverage for a service with users on three continents: North America (60% of users), Europe (30%), Asia (10%). They have budget for 6 PoPs. Each PoP can serve a steady cell at 70% utilization without crossing the queueing curve knee.

Design the PoP placement: (1) how would you distribute 6 PoPs across the three continents, (2) for the smallest user share (Asia at 10%), what is the latency floor for an Asia user being served from a Europe PoP if no Asia PoP exists (assume ~9000 km distance), & (3) what capacity headroom would you require per PoP to survive a single-PoP failure without cascading?

Closing the Companion Course

Closing the Companion Course

You have completed all five geometry-of-* companion lessons:

- Proxies & Origins: directed graphs, hop count, fan-in / fan-out, indirection

- Stateless Horizontal Scaling: Little's Law as area, queueing curve & its knee

- Ingress & Egress Separation: bipartite structure, cut vertex elimination, partition tolerance

- Failure Modes & Blast Radius: betweenness centrality, min-cut, diameter

- Observability & Capacity (this one): Voronoi PoP cells, latency triangle floor, geographic capacity design

The throughline: distributed systems have geometric structure. Every architecture is a graph. Every latency floor is a triangle inequality. Every capacity decision is a curve & a knee. Once you see the geometry, the operational decisions follow from it.

Combined with the five main lessons (cs_distsys_*), you have a working mental model of a web-scale distributed system & the graph-theoretic discipline to reason about it.

Well done.