A Content Delivery Network is the reason a website feels instant whether you load it from Tokyo or Toronto. It's a globally distributed fleet of caching servers that keep copies of your content physically close to users, so requests are served from a nearby edge instead of crossing oceans to your origin. CDNs appear in nearly every system design on this site — Drive, video streaming, Pastebin — as the layer that absorbs read traffic. Here's how they actually work.
- Two wins: latency and offload — content served from a nearby edge is faster (less distance), and most requests never reach your origin (less load).
- Edge PoPs + anycast/GeoDNS route each user to the closest point of presence automatically.
- Cache behavior is controlled by HTTP: Cache-Control/TTL decide what's cacheable and for how long; the cache key decides what counts as "the same" object.
- Hit vs miss: a hit is served at the edge; a miss is pulled from origin, cached, then served.
- Pull (lazy) vs push — pull CDNs fetch on first request; push CDNs pre-load content.
- Invalidation is the hard part — use TTLs, explicit purges, or versioned/fingerprinted URLs (cache busting).
- Bonus: security — CDNs absorb DDoS, terminate TLS, and host a WAF at the edge.
A CDN caches your content on edge servers worldwide and routes each user to the nearest one (anycast/GeoDNS), cutting latency and shielding the origin. HTTP Cache-Control headers and the cache key govern caching; a cache hit is served from the edge, a miss is pulled from origin and cached. The hard problem is invalidation — solved with TTLs, purges, or versioned URLs. Static assets are trivial to cache; dynamic content uses edge compute and route optimization.
Why a CDN: Latency and Offload
Two physics-and-economics facts drive the whole design. First, distance is latency: data travels at a finite speed, so a round trip from Sydney to a server in Virginia is fundamentally slow no matter how fast that server is. Putting a copy 20 km from the user instead of 16,000 km away slashes round-trip time. Second, offload: if the edge serves most requests from cache, your origin sees a fraction of the traffic — a viral page or a popular video is served almost entirely from edges, so the origin doesn't melt. Latency for users, survival for your origin — that's the pitch.
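To put rough numbers on both facts, here is a back-of-the-envelope sketch. It assumes light in optical fiber covers about 200 km per millisecond (an approximation) and counts propagation delay only, ignoring routing detours, queuing, and server time, so real round trips will be slower. The function names and the 95% hit ratio are illustrative choices, not figures from any particular CDN.

```python
# Rough propagation-delay and offload estimates; all constants are approximations.
FIBER_KM_PER_MS = 200.0  # light in optical fiber covers roughly 200 km per millisecond


def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * distance_km / FIBER_KM_PER_MS


def origin_requests(total_requests: int, hit_ratio: float) -> int:
    """Requests that still reach the origin at a given edge hit ratio."""
    return round(total_requests * (1 - hit_ratio))


if __name__ == "__main__":
    print(f"origin 16,000 km away: at least {min_rtt_ms(16_000):.1f} ms per round trip")
    print(f"edge 20 km away:       at least {min_rtt_ms(20):.1f} ms per round trip")
    print(f"1,000,000 requests at a 95% hit ratio: "
          f"{origin_requests(1_000_000, 0.95)} reach the origin")
```

A page load usually needs several round trips (TCP, TLS, then the request itself), so the per-trip difference is multiplied before the first byte arrives.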
How It Works: Edge PoPs and Routing
A CDN operates points of presence (PoPs) — clusters of caching servers — in data centers around the world. The trick is getting each user to the nearest PoP automatically, done two main ways: anycast (many PoPs announce the same IP address, and internet routing naturally delivers the user to the topologically closest one) and GeoDNS (DNS resolves the CDN hostname to a different edge IP based on the resolver's location — see our DNS article). Either way, the user connects to a close edge without knowing it.
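The GeoDNS idea can be illustrated with a small sketch that picks the geographically closest PoP for a resolver location. The PoP list, coordinates, and IP addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical, and great-circle distance is a stand-in: production GeoDNS typically relies on IP geolocation databases and measured latency rather than pure geography. Anycast is not modeled here, since that selection happens in BGP routing rather than in application code.

```python
import math
from typing import Dict, Tuple

# Hypothetical PoPs: name -> (latitude, longitude, edge IP from a documentation range).
POPS: Dict[str, Tuple[float, float, str]] = {
    "tokyo": (35.68, 139.69, "192.0.2.10"),
    "frankfurt": (50.11, 8.68, "192.0.2.20"),
    "virginia": (38.95, -77.45, "192.0.2.30"),
    "sydney": (-33.87, 151.21, "192.0.2.40"),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def resolve_edge(resolver_lat: float, resolver_lon: float) -> Tuple[str, str]:
    """Return (pop_name, edge_ip) for the PoP closest to the resolver's location."""
    name = min(
        POPS,
        key=lambda n: haversine_km(resolver_lat, resolver_lon, POPS[n][0], POPS[n][1]),
    )
    return name, POPS[name][2]


if __name__ == "__main__":
    print(resolve_edge(43.65, -79.38))   # a resolver located in Toronto
    print(resolve_edge(-37.81, 144.96))  # a resolver located in Melbourne
```

Note that the decision is based on the resolver's location, as the text says, which is why a user on a distant public resolver can be steered to a suboptimal edge.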
```
user → nearest edge PoP
  ├─ HIT  → serve from edge cache (fast, origin untouched)
  └─ MISS → fetch from origin → cache it → serve
            (subsequent requests in this region are now HITs)

goal: maximize hit ratio → most traffic served at the edge
```
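The hit/miss flow above can be sketched as a toy pull-through cache for a single PoP. This is a minimal illustration, not how any real CDN is implemented: `EdgeCache`, `fetch_origin`, and the fixed TTL are assumed names and simplifications, and there is no eviction, size limit, or header handling.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy pull-through cache: serve hits locally, pull misses from the origin."""

    def __init__(self, fetch_origin: Callable[[str], bytes], ttl_seconds: float) -> None:
        self._fetch_origin = fetch_origin  # hypothetical callable standing in for the origin
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[bytes, float]] = {}  # key -> (body, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1  # HIT: origin untouched
            return entry[0]
        # MISS (absent or expired): fetch from origin, cache it, then serve.
        self.misses += 1
        body = self._fetch_origin(key)
        self._store[key] = (body, now + self._ttl)
        return body

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    origin_calls = []

    def fake_origin(path: str) -> bytes:
        origin_calls.append(path)
        return f"body of {path}".encode()

    edge = EdgeCache(fake_origin, ttl_seconds=60)
    for _ in range(10):
        edge.get("/logo.png")
    print(f"origin fetches: {len(origin_calls)}, hit ratio: {edge.hit_ratio:.0%}")
```

Ten requests for the same object cost one origin fetch; every request after the first is a hit until the TTL lapses.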
Cache Behavior: Control, TTL, and Keys
What the edge caches, and for how long, is governed by HTTP headers from the origin. Cache-Control: max-age=3600 says "cacheable for an hour"; no-store says "never cache"; private restricts storage to the user's own browser cache, so shared caches such as a CDN edge must not store the response. stale-while-revalidate lets the edge serve a slightly stale copy while it refreshes in the background, so no user waits on the refresh.
```
Cache-Control: public, max-age=31536000, immutable             # fingerprinted asset, cache 1y
Cache-Control: public, max-age=60, stale-while-revalidate=300
Cache-Control: no-store                                        # never cache (user-specific)
```
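To make the directives concrete, the following sketch parses a Cache-Control value and classifies a cached copy the way a shared cache might. It is deliberately simplified and the function names are illustrative: it ignores `no-cache`, `must-revalidate`, `Expires`, and heuristic freshness, and it assumes numeric directive values are well-formed. Pass it the header value only, without the trailing `#` annotations shown above.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value-or-None}, lowercased."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def edge_state(header: str, age_seconds: int) -> str:
    """Classify a response for a shared cache: bypass, fresh, stale-ok, or expired."""
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return "bypass"  # a shared cache must not store this response
    # s-maxage, when present, overrides max-age for shared caches such as CDN edges.
    max_age = int(d.get("s-maxage") or d.get("max-age") or 0)
    swr = int(d.get("stale-while-revalidate") or 0)
    if age_seconds < max_age:
        return "fresh"
    if age_seconds < max_age + swr:
        return "stale-ok"  # serve the stale copy now, refresh in the background
    return "expired"


if __name__ == "__main__":
    header = "public, max-age=60, stale-while-revalidate=300"
    for age in (30, 200, 400):
        print(age, edge_state(header, age))
```

With `max-age=60, stale-while-revalidate=300`, a copy is fresh for the first minute, servable-while-refreshing for the next five, and must be refetched after that.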
Equally important is the cache key — what the CDN uses to decide whether two requests are "the same object." By default it's the URL, but it can include query strings, headers, or cookies. Get this wrong and you either cache too coarsely (serve the wrong variant) or too finely (every request is a unique key → near-zero hit ratio). High cache-key cardinality is a classic CDN footgun.
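One way to keep cardinality down is to normalize the URL before using it as the key. The sketch below is an assumption about what a reasonable normalization could look like, not any CDN's actual algorithm: it lowercases the host, drops a hypothetical list of tracking parameters, sorts the remaining query parameters, and appends only the request headers you explicitly choose to vary on. The scheme is left out of the key for brevity.

```python
from typing import Dict, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list of query parameters that never change the response body.
IGNORED_PARAMS = frozenset({"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"})


def cache_key(url: str, vary: Optional[Dict[str, str]] = None) -> str:
    """Build a normalized cache key from a URL plus selected request headers."""
    parts = urlsplit(url)
    # Drop ignored parameters and sort the rest so ordering does not split the cache.
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in IGNORED_PARAMS
    )
    key = parts.netloc.lower() + (parts.path or "/")
    if params:
        key += "?" + urlencode(params)
    # Only headers passed in explicitly become part of the key.
    for name, value in sorted((n.lower(), v) for n, v in (vary or {}).items()):
        key += f"|{name}={value}"
    return key


if __name__ == "__main__":
    a = cache_key("https://Example.com/img/cat.jpg?w=200&utm_source=mail")
    b = cache_key("https://example.com/img/cat.jpg?utm_campaign=x&w=200")
    print(a, a == b)
```

Both URLs collapse to one key, so the second request is a hit instead of a needless miss. The opposite mistake, leaving out something that does change the response (such as `w` here), would serve the wrong variant.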
Push vs Pull CDNs
| Aspect | Pull CDN | Push CDN |
|---|---|---|
| How content arrives | Lazily, on first request (miss → origin) | You upload/pre-load it to the CDN |
| First request | Slower (cache miss) | Fast (already at edge) |
| Best for | Large, frequently-changing catalogs | Big files, predictable hot content |
| Effort | Low — just point DNS at the CDN | Higher — manage uploads/expiry |
Pull is the common default (set it and forget it); push makes sense for large media you know will be hot and don't want a slow first miss on.
Invalidation: the Hard Part
"There are only two hard things in computer science: cache invalidation and naming things." Once content is cached at hundreds of edges, updating it is genuinely tricky. Three strategies, in increasing precision:
- TTL expiry: just wait for max-age to lapse. Simple, but you tolerate staleness up to the TTL.
- Explicit purge: tell the CDN to evict a URL (or tag/prefix) now. Precise, but propagation across all edges isn't instant.
- Versioned / fingerprinted URLs (cache busting): the best pattern for static assets. Include a content hash in the filename (app.3f9a1c.js) and cache it forever; a change produces a new URL, so there's nothing to invalidate.
For static assets, combine immutable + fingerprinted URLs: cache app.<hash>.js for a year, and reference the new hash from a short-TTL HTML page. You get maximal caching with zero invalidation — deploys just point to new filenames. This is why build tools hash asset names.
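A minimal sketch of what a build step might do for fingerprinting, using only the standard library. The choice of SHA-256, the 8-character digest prefix, and the header policy in `cache_control_for` are assumptions for illustration; real build tools choose their own hash and naming scheme.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Return a name like 'app.<hash>.js', where <hash> is a prefix of the SHA-256 of content."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def cache_control_for(filename: str) -> str:
    """Hypothetical policy: short-lived HTML, long-lived immutable fingerprinted assets."""
    if filename.endswith(".html"):
        return "public, max-age=60"
    return "public, max-age=31536000, immutable"


if __name__ == "__main__":
    old = fingerprinted_name("static/app.js", b"console.log('v1');")
    new = fingerprinted_name("static/app.js", b"console.log('v2');")
    print(old, "->", cache_control_for(old))
    print(new, "->", cache_control_for(new))
    print("index.html ->", cache_control_for("index.html"))
```

Because the name is derived from the bytes, unchanged files keep their URL (and their cached copies) across deploys, while any edit yields a new URL that no edge has seen yet.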
Static vs Dynamic Content
Static content — images, CSS/JS, video segments, downloads — is the CDN's bread and butter: it's identical for everyone and cacheable for a long time. Dynamic content (personalized pages, API responses) is harder because it varies per user, but CDNs still help via dynamic acceleration: keeping warm, optimized connections from edge to origin (TLS already negotiated, better routes over the CDN backbone), and increasingly edge compute — running code at the PoP to personalize, assemble, or cache fragments close to the user. The line between "CDN" and "edge platform" is blurring for exactly this reason.
Security Bonus
Because the CDN sits in front of your origin and receives inbound traffic first, it's a natural security layer: it soaks up DDoS attacks across its enormous capacity before they reach you, terminates TLS at the edge (handshake round trips stay near the user, so they finish sooner), and often hosts a WAF (web application firewall) and bot mitigation. For many sites the DDoS protection alone justifies the CDN.
Use Cases and Pitfalls
- Static asset & media delivery — the classic; pair with object storage as origin.
- Video streaming — cache segments at the edge for scale (see video streaming design).
- Pitfall (cache stampede): when a popular object expires, many edges miss at once and hammer the origin; mitigate with staggered TTLs, stale-while-revalidate, or origin request coalescing.
- Pitfall (caching the wrong thing): caching personalized or auth'd responses leaks data between users; mark them private/no-store and key carefully.
- Pitfall (low hit ratio): high cache-key cardinality from unnecessary query strings/cookies in the key.
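Two of the stampede mitigations mentioned above, staggered TTLs and request coalescing, can be sketched in a few lines. This is an illustrative, thread-based toy under stated assumptions: `jittered_ttl`, `Coalescer`, and the `fetch_origin` callable are hypothetical names, and a real CDN coalesces inside its proxy layer rather than in application code like this.

```python
import random
import threading
from typing import Callable, Dict, Optional


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Randomize a TTL by +/- spread so copies cached together do not expire together."""
    return base_seconds * random.uniform(1 - spread, 1 + spread)


class _Call:
    """State shared between the one leader and any followers waiting on the same key."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: bytes = b""
        self.error: Optional[BaseException] = None


class Coalescer:
    """Collapse concurrent fetches for the same key into a single origin request."""

    def __init__(self, fetch_origin: Callable[[str], bytes]) -> None:
        self._fetch_origin = fetch_origin
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def get(self, key: str) -> bytes:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if call is None:
                call = self._inflight[key] = _Call()
        if leader:
            # Only the first caller goes to the origin; the lock is not held meanwhile.
            try:
                call.result = self._fetch_origin(key)
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()  # followers wait for the leader's result
        if call.error is not None:
            raise call.error
        return call.result
```

If eight requests for the same expired object arrive while the first fetch is still in flight, the origin sees one request and all eight callers receive the same body. Jitter addresses a different angle of the same problem: it spreads expiry times out so that fewer objects, and fewer edges, miss at the same instant.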
A CDN trades a little complexity (cache headers, invalidation) for two big wins: lower latency (content near users) and origin offload (most requests never reach you). Route users to the nearest edge with anycast/GeoDNS, control caching with Cache-Control and good cache keys, prefer fingerprinted immutable URLs to dodge invalidation, and lean on the edge for security too.
What does a CDN buy you? Lower latency (serve from a nearby edge) and origin offload (most reads served from cache, so the origin survives spikes).
How are users routed to the nearest edge? Anycast (many PoPs share one IP; routing picks the closest) or GeoDNS (DNS returns a region-specific edge IP).
Push vs pull CDN? Pull caches lazily on first request (easy, slow first miss); push pre-loads content (fast first hit, more management).
How do you invalidate cached content? TTL expiry, explicit purge, or — best for static — fingerprinted immutable URLs so a change is a new URL with nothing to invalidate.
What's a cache stampede? Many edges miss simultaneously when a hot object expires and overload the origin; fix with stale-while-revalidate, jittered TTLs, or request coalescing.