Q: How do you paginate a feed that keeps changing as new posts arrive?

Use cursor-based pagination keyed on a stable, monotonic value (post ID or a snapshot timestamp), never SQL OFFSET. The cursor encodes the position of the last item served; new posts that arrive above it do not shift the page boundaries, so the user never sees duplicates or skips while scrolling.

Design a News Feed (Twitter) — System Design

Q: Fan-out-on-write or fan-out-on-read — which do you choose?

Neither alone. Fan-out-on-write (push) precomputes every follower's timeline at post time — fast reads, but write amplification explodes for high-follower accounts. Fan-out-on-read (pull) merges followees' posts at read time — cheap writes, but slow reads for users who follow thousands. Production systems use a hybrid: push for normal users, pull for celebrities, merged at read time.

Q: How do you handle the celebrity / hot-key problem?

A post from an account with 100M followers under pure push would trigger 100M timeline writes — a write storm that lags fan-out by minutes. Mark high-follower accounts as celebrities and skip their fan-out entirely. At read time, fetch the user's pushed timeline and merge in recent posts pulled directly from the celebrities they follow. The celebrity's own posts live in a hot cache shared by all readers.

Q: What exactly is hybrid fan-out?

Classify authors by follower count. Below a threshold (say ~10k–100k followers) posts are pushed into each follower's precomputed timeline cache. Above the threshold, posts are not fanned out; they are read-pulled and merged into the feed at request time. This bounds write amplification while keeping the common read path a single cache lookup.

Q: Why store post IDs in the timeline cache instead of full posts?

Full posts are large, mutable (edit, like count, delete), and duplicated across millions of followers' timelines. Storing only post-ID pointers (plus a ranking score) keeps each timeline entry ~20 bytes, caps memory, and lets you hydrate the latest post content and counts at read time from a shared post store — so an edit or delete is reflected without rewriting every timeline.

A news feed — the ranked or reverse-chronological stream of posts from every account a user follows — is the canonical "fan-out" system design problem, and it is deceptively hard. The interface is trivial (scroll a list of posts), but the data shape is brutal: a single post by one author must reach the home timeline of every follower, and a single feed-load must merge fresh posts from hundreds or thousands of followees, ranked, paginated, and rendered in under 200 ms. The hard part is not storing posts; it is the write amplification of fan-out and its pathological worst case — an account with 100 million followers posting once. Get the fan-out strategy wrong and you either melt the write path (push to 100M timelines) or the read path (pull and merge thousands of authors on every refresh). This article works through the canonical design: fan-out-on-write vs. fan-out-on-read, the celebrity hot-key problem, the hybrid model production systems actually run, the Redis timeline cache, feed ranking, and stable pagination.

⚡ Quick Takeaways

Two timelines, not one — the user timeline (one author's own posts) and the home timeline (aggregated from everyone you follow) have completely different fan-out costs. The home timeline is the hard one.
Fan-out-on-write (push) — precompute each follower's timeline at post time. Reads are a single cache read; writes amplify by follower count.
Fan-out-on-read (pull) — assemble the feed at read time by merging followees' recent posts. Writes are O(1); reads are expensive for users who follow thousands.
Celebrity hot-key — pure push on a 100M-follower account is a write storm; pure pull on a user who follows 5,000 accounts is a read storm. Both extremes break.
Hybrid fan-out wins — push for normal authors, pull for celebrities, merge at read time. Industry standard (Twitter, Instagram).
Cache post-ID pointers, not posts — each timeline entry is ~20 bytes (post ID + score); hydrate content and counts at read time so edits/deletes don't rewrite millions of timelines.
Cursor pagination — snapshot/cursor keyed on post ID, never SQL OFFSET, so the feed is stable while new posts stream in above the cursor.
Async fan-out via a queue — the post write returns immediately; Kafka-backed workers do the heavy timeline inserts off the hot path.

tldr

Separate the user timeline from the home timeline. Use hybrid fan-out: push posts from normal authors into each follower's Redis timeline (a capped list of post-ID pointers), but skip fan-out for celebrities and pull their recent posts at read time, merging them into the feed. Rank the merged candidate set, hydrate post content from a shared store, and paginate with a stable cursor. Drive fan-out asynchronously through Kafka so the post write never blocks on follower count.

Step 1 — Clarify Requirements

The news feed prompt is broad. Scope it explicitly before drawing anything — the single most important clarification is which feed you are building, because the user timeline and the home timeline have opposite cost profiles.

Functional requirements

Publish a post (text, optionally media references) to your followers.
Follow / unfollow other accounts (the social graph).
View your home timeline — a feed aggregating recent posts from all accounts you follow.
View a user timeline — the posts authored by one specific account, in reverse-chronological order.
Ranking: support both reverse-chronological and a ranked ("top posts") ordering.
Engagement: like, reply, repost — with approximate counts shown on each post.
Pagination / infinite scroll — load the feed in pages as the user scrolls.

Non-functional requirements

Read-heavy — feed loads vastly outnumber posts; target <200 ms p99 to render a page of the home timeline.
High availability — the feed must always render something; a stale-but-present feed beats an error. Lean AP over CP.
Eventual consistency is acceptable — a new post appearing in followers' feeds a few seconds late is fine; a post must never be lost.
Scalability — hundreds of millions of DAU, a power-law follower graph (most accounts tiny, a few with 100M+ followers).

interview tip

Lead with the distinction between the user timeline and the home timeline. The user timeline is a simple per-author query (shard by author, sort by time). The home timeline is the aggregation problem — fan-out, ranking, hot keys — and is where 90% of the interview lives. Naming this split early signals you understand where the difficulty actually is.

Step 2 — Capacity Estimation

Back-of-the-envelope numbers anchor the fan-out discussion. Assume a large social platform: 200M DAU, average user follows ~200 accounts and has ~200 followers (the average; the distribution is heavily power-law).

Traffic

Posts: assume ~0.2 posts/DAU/day → 200M × 0.2 = 40M posts/day ≈ 460 posts/sec average; peak ~3–5× → ~2,000 posts/sec.
Feed reads: assume each DAU refreshes ~10×/day → 200M × 10 = 2B feed reads/day ≈ 23,000 reads/sec average; peak ~4–5× → ~100,000 reads/sec.
Post-level read:write ratio ≈ 50:1 — squarely read-heavy, which biases us toward precomputing the read (fan-out-on-write).

Fan-out write amplification (the crux)

Under pure fan-out-on-write, each post inserts into every follower's timeline: 40M posts × ~200 avg followers = 8B timeline inserts/day ≈ 93,000 fan-out writes/sec average.
Peaks are far worse than the average implies, because the average hides the tail: a single celebrity post (100M followers) is 100M inserts. At the 93,000 writes/sec average rate, that is roughly 18 minutes of the entire platform's fan-out throughput, consumed by one write.
This is why pure push cannot stand alone: the write amplification is unbounded in the number of followers, and the follower distribution has a very fat tail.

Storage

Post record: ~300 bytes (post_id 8 B, author_id 8 B, text ≤280 chars, timestamps, counts). 40M/day × 300 B ≈ 12 GB/day ≈ 4.4 TB/year of post text — fits a sharded store easily; media lives in object storage behind a CDN.
Timeline cache: store post-ID pointers only, ~20 B/entry (8 B post ID + 8 B score + overhead), capped at ~800 entries/user. 200M users × 800 × 20 B ≈ 3.2 TB across the sharded Redis fleet. Treat this as a lower bound: a real Redis sorted set carries more per-entry overhead than 20 B.
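The estimates above are easy to re-derive, and doing so shows which inputs dominate. A minimal sketch of the arithmetic, using only the figures assumed in this section (the variable names are illustrative):

```python
# Back-of-the-envelope check of the capacity estimates.
SECONDS_PER_DAY = 86_400

dau = 200_000_000
posts_per_day = dau // 5            # 0.2 posts per DAU per day -> 40M
reads_per_day = dau * 10            # 10 feed refreshes per DAU per day -> 2B
avg_followers = 200

fanout_inserts_per_day = posts_per_day * avg_followers   # 8B timeline inserts

posts_per_sec = posts_per_day / SECONDS_PER_DAY           # ~463
reads_per_sec = reads_per_day / SECONDS_PER_DAY           # ~23,148
fanout_per_sec = fanout_inserts_per_day / SECONDS_PER_DAY # ~92,593

# One 100M-follower post, expressed as time at the average fan-out rate.
celebrity_followers = 100_000_000
celebrity_minutes = celebrity_followers / fanout_per_sec / 60   # ~18 minutes

# Storage: 300 B per post record, 20 B per timeline entry, 800 entries per user.
post_storage_gb_per_day = posts_per_day * 300 / 1e9      # 12 GB/day
timeline_cache_tb = dau * 800 * 20 / 1e12                # 3.2 TB

print(f"posts/sec   {posts_per_sec:,.0f}")
print(f"reads/sec   {reads_per_sec:,.0f}")
print(f"fan-out/sec {fanout_per_sec:,.0f}")
print(f"one celebrity post = {celebrity_minutes:.0f} min of average fan-out")
```

Changing `avg_followers` or `celebrity_followers` moves the fan-out numbers far more than any storage figure, which is why the rest of the design focuses on that path.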

note

The post-text storage is small and boring (a few TB/year). The expensive, design-defining number is the 93,000+ fan-out writes/sec and its tail. Every architectural decision below — hybrid fan-out, hot-key handling, capped timelines, async workers — exists to tame that one number.

Step 3 — API Design

A small REST surface. The two endpoints that matter for scale are creating a post and reading the home timeline.

HTTP

# Publish a post
POST /api/posts
     Authorization: Bearer <token>
     { "text": "shipping the feed redesign 🚀",
       "media_ids": ["m_91af"] }      // optional
     → 201 { "post_id": "189f3c2a01", "created_at": "2026-06-28T10:00:00Z" }

# Read the home timeline (the hard one) — cursor paginated
GET /api/feed?limit=20&cursor=189f3c2a01
     Authorization: Bearer <token>
     → { "items": [ {post}, {post}, ... ],
         "next_cursor": "189f2b88f0" }   // null when exhausted

# Read a single user's timeline (per-author, simple)
GET /api/users/{id}/posts?limit=20&cursor=...

# Social graph
POST   /api/follow    { "followee_id": "u_42" }   → 204
DELETE /api/follow    { "followee_id": "u_42" }   → 204

# Engagement (counts updated asynchronously)
POST   /api/posts/{id}/like   → 204

The GET /api/feed response carries an opaque next_cursor rather than a page number — the client passes it back to fetch the next page. The cursor encodes the position of the last item served (typically the lowest post ID on the page), which makes pagination stable even as new posts arrive at the top. We return hydrated post objects (text, author, current like/reply counts) assembled at read time from the shared post store, not whatever was cached when the post was fanned out.
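One way to keep the cursor opaque is to wrap the last-served post ID in a small versioned envelope and base64-encode it, so the server can change the cursor's contents later without breaking clients. The sketch below is an assumption, not the article's wire format (the API example shows a bare ID); `encode_cursor`, `decode_cursor`, and `page` are hypothetical names, and post IDs are treated as integers.

```python
import base64
import json
from typing import List, Optional, Tuple


def encode_cursor(last_post_id: int) -> str:
    """Wrap the last-served post ID in a versioned, URL-safe token."""
    payload = json.dumps({"v": 1, "last_id": last_post_id}).encode()
    return base64.urlsafe_b64encode(payload).decode().rstrip("=")


def decode_cursor(cursor: Optional[str]) -> Optional[int]:
    """Return the last-served post ID, or None for the first page."""
    if not cursor:
        return None
    padded = cursor + "=" * (-len(cursor) % 4)   # restore stripped padding
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return int(payload["last_id"])


def page(post_ids_desc: List[int], cursor: Optional[str],
         limit: int) -> Tuple[List[int], Optional[str]]:
    """Serve items strictly older than the cursor; next_cursor is None when exhausted."""
    last_id = decode_cursor(cursor)
    window = [p for p in post_ids_desc if last_id is None or p < last_id]
    items = window[:limit]
    has_more = len(window) > limit
    next_cursor = encode_cursor(items[-1]) if items and has_more else None
    return items, next_cursor
```

A client simply passes `next_cursor` back unchanged; it never needs to know that the token contains a post ID.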

Step 4 — Data Model

Three core stores: the post store, the social graph, and the per-user timeline cache. They have very different access patterns and are sharded differently.

SQL

-- POSTS: source of truth for content. Sharded by post_id (Snowflake).
CREATE TABLE posts (
  post_id     BIGINT      PRIMARY KEY,   -- Snowflake: time-sortable
  author_id   BIGINT      NOT NULL,
  text        VARCHAR(280),
  media_ids   JSON,
  created_at  TIMESTAMP   NOT NULL,
  like_count  BIGINT      DEFAULT 0,    -- approximate, async
  reply_count BIGINT      DEFAULT 0,
  is_deleted  BOOLEAN     DEFAULT FALSE
);

-- FOLLOWS: the social graph. Sharded by follower_id for "who do I follow",
-- with a secondary index / mirror table on followee_id for "who follows X".
CREATE TABLE follows (
  follower_id BIGINT      NOT NULL,
  followee_id BIGINT      NOT NULL,
  created_at  TIMESTAMP   NOT NULL,
  PRIMARY KEY (follower_id, followee_id)
);
CREATE INDEX idx_followee ON follows(followee_id);  -- fan-out target list

-- AUTHOR STATS: drives the hybrid fan-out decision.
CREATE TABLE author_stats (
  author_id      BIGINT   PRIMARY KEY,
  follower_count BIGINT   DEFAULT 0,
  is_celebrity   BOOLEAN  DEFAULT FALSE   -- follower_count > threshold
);

-- TIMELINE CACHE (Redis, not SQL): per-user capped list of post-ID pointers.
-- Key:  timeline:{user_id}   Value: ZSET { post_id : score }
-- Capped to ~800 newest entries via ZADD + ZREMRANGEBYRANK.

Key decisions: post_id is a Snowflake ID so it is globally unique and time-sortable — sorting a timeline is just sorting integers, and the ID doubles as the pagination cursor. The follows table needs two access paths: "who do I follow" (read path, sharded by follower_id) and "who follows X" (fan-out target list, indexed by followee_id); at scale this becomes two physically separate, independently sharded representations of the same edge. The timeline cache lives in Redis as a sorted set of post-ID pointers — never full post bodies.
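To make "time-sortable" concrete, here is a minimal sketch of a Snowflake-style generator. It assumes the commonly described 41-bit timestamp / 10-bit worker / 12-bit sequence layout and the epoch from Twitter's open-sourced Snowflake; the class name, the injectable `clock`, and the handling of clock regressions and sequence overflow are simplifications for illustration, not a production implementation.

```python
import threading
import time

EPOCH_MS = 1_288_834_974_657      # assumed custom epoch
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                now = self._last_ms            # clock went backwards: stay monotonic
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    now = self._last_ms + 1    # sequence exhausted: borrow the next ms
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self._sequence)


def timestamp_ms(post_id):
    """Recover the creation time (Unix ms) from the high bits of an ID."""
    return (post_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp occupies the high bits, comparing two IDs as integers compares their creation times first, which is what lets the ID double as a sort key and a pagination cursor.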

Step 5 — Fan-out-on-Write vs. Fan-out-on-Read

This is the central decision of the whole design. When user A posts, how do A's followers eventually see it in their home timelines? Two pure strategies sit at opposite ends.

Fan-out-on-write (push model)

At post time, look up all of A's followers and push the new post ID into each follower's precomputed timeline cache. Reading a home timeline is then a single cache read: ZREVRANGE timeline:{me} 0 19. The read is O(1) in the number of accounts you follow — it has already been materialized for you.

Fan-out-on-read (pull model)

At post time, do nothing but store the post in the author's own timeline. When a follower loads their home feed, pull the recent posts of every account they follow and merge-sort them on the fly. The write is O(1); the read is O(number of followees), and gets brutal for users who follow thousands of accounts — each feed load fans out reads across many authors and shards.

Dimension	Fan-out-on-write (push)	Fan-out-on-read (pull)
Read cost	O(1) — single cache read of a ready timeline	O(followees) — merge many authors per load
Write cost	O(followers) — insert into every follower	O(1) — store once
Storage	High — post ID duplicated across all timelines	Low — one copy per post
Worst case	Celebrity post = write storm	User following 5k accounts = read storm
Freshness	Lags by fan-out delay (seconds)	Always current at read time
Best for	Normal authors, read-heavy feeds	Celebrities, inactive users

Because the workload is ~50:1 read-heavy, the default instinct is push — precompute the expensive thing (the read) and pay at write time. That works beautifully until you hit an account with tens of millions of followers, where the write cost becomes catastrophic. Neither pure model survives the real follower distribution; the production answer is to combine them.
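The trade-off is easiest to see in a toy model where both strategies serve the same feed but pay for it at different times. The in-memory `ToyFeed` class below is purely illustrative (no sharding, no cache, integer post IDs assumed to increase over time); it counts fan-out writes so the write amplification is visible.

```python
import heapq
from collections import defaultdict


class ToyFeed:
    def __init__(self, follows):
        # follows: {follower: set of followees}
        self.follows = follows
        self.followers = defaultdict(set)          # reverse edge: who follows X
        for follower, followees in follows.items():
            for followee in followees:
                self.followers[followee].add(follower)
        self.user_timeline = defaultdict(list)     # author -> post IDs, oldest first
        self.home_timeline = defaultdict(list)     # follower -> pushed post IDs
        self.push_writes = 0

    def post(self, author, post_id):
        self.user_timeline[author].append(post_id)     # pull model stops here
        for follower in self.followers[author]:        # push model: O(followers)
            self.home_timeline[follower].append(post_id)
            self.push_writes += 1

    def read_push(self, user, limit):
        # Precomputed: cost does not depend on how many accounts the user follows.
        return sorted(self.home_timeline[user], reverse=True)[:limit]

    def read_pull(self, user, limit):
        # Assembled on demand: one stream per followee, merged newest-first.
        streams = [reversed(self.user_timeline[a]) for a in self.follows.get(user, ())]
        return list(heapq.merge(*streams, reverse=True))[:limit]
```

Both read methods return the same list; the difference is that `push_writes` grows with follower count on every post, while `read_pull` grows with followee count on every read.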

interview tip

Don't pick a side. State the read:write ratio, explain why push is the default for a read-heavy feed, then immediately introduce the celebrity counter-example that breaks pure push and motivates the hybrid. Walking the interviewer from "push" → "but celebrities" → "so, hybrid" is exactly the reasoning arc they're listening for.

Step 6 — The Celebrity / Hot-Key Problem

The follower graph is power-law: the overwhelming majority of accounts have a few hundred followers, but a handful have 50M–500M. Pure fan-out-on-write breaks on these accounts in two distinct ways:

Write storm — one celebrity post triggers tens of millions of timeline inserts. Even fanned out asynchronously across worker pools, this floods the queue, delays fan-out for everyone else behind it, and can take minutes to drain — so the post lands in some followers' feeds long after others.
Hot key on the post — the celebrity's post is read by millions of people within seconds. The single post record (and its like/reply counters) becomes a hot key, hammering one shard of the post store.

There is also a thundering-herd interaction: several celebrities posting in the same window can saturate the entire fan-out tier. The fix is to stop fanning out celebrity posts entirely and handle them on the read side, plus give the hot post record its own caching treatment.

scalability note

The hot-key problem is the same shape that appears in a cache for any viral object: a single key receives a disproportionate share of traffic and overwhelms its shard. The mitigations rhyme — replicate the hot key across multiple cache nodes, add an in-process LRU layer in front of the shared cache (1–5 s TTL) to absorb the spike, and serve approximate counters so the write-back of likes/replies doesn't serialize on one row.
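The in-process layer mentioned above can be very small. The sketch below is one possible shape, assuming a single-threaded caller and a `loader` callback that reads from the shared cache or post store; `HotKeyCache` and its parameters are hypothetical names, and the injectable `clock` exists only to make the TTL testable.

```python
import time
from collections import OrderedDict


class HotKeyCache:
    """Tiny in-process LRU with a short TTL, placed in front of the shared cache."""

    def __init__(self, max_entries=1024, ttl_seconds=2.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()              # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)         # mark as recently used
            return entry[1]
        value = loader(key)                        # miss or expired: go to shared tier
        self._entries[key] = (now + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)      # evict least recently used
        return value
```

With a 1–5 s TTL, each application server sends at most one request per hot post per TTL window to the shared tier, regardless of how many readers hit that server. A multi-threaded server would additionally need a lock around `get_or_load`.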

Step 7 — Hybrid Fan-out

The production answer, publicly described by Twitter and widely attributed to Instagram and other large feeds, is hybrid fan-out: choose push or pull per author based on follower count.

Normal authors (below the threshold) — push. Their posts are fanned out into each follower's timeline cache at write time. Most authors, and therefore most posts, take this path.
Celebrities (above the threshold) — do not fan out. Their posts are stored only in their own user timeline. The few accounts above the threshold contribute zero fan-out writes.
Read-time merge — when a user loads their home feed, the system reads their pushed timeline cache (covering all the normal authors they follow) and pulls the recent posts of the celebrities they follow, then merges and ranks the two sets.

The threshold (often somewhere around 10k–100k followers, tuned empirically) trades write amplification against read-merge cost. The number of celebrities any single user follows is small — you might follow a few hundred normal accounts and a dozen celebrities — so the read-time pull is bounded: a dozen cheap "recent posts by author X" lookups, each itself cache-friendly because that celebrity's recent posts are read by millions and stay hot.

Python

# WRITE PATH: decide push vs. skip per author
# (store_post, publish, is_celebrity, etc. are assumed service helpers)
def on_new_post(post):
    store_post(post)                          # source of truth
    add_to_user_timeline(post.author_id, post.post_id)
    if is_celebrity(post.author_id):          # follower_count > threshold
        return                                # skip fan-out; pulled at read time
    # normal author: enqueue async fan-out to followers
    publish("fanout", {"post_id": post.post_id, "author_id": post.author_id})

# READ PATH: merge pushed timeline + pulled celebrity posts
def home_feed(me, cursor, limit):
    # cursor is the last post_id already served (None on the first page);
    # the "(" prefix makes the upper bound exclusive, as in ZREVRANGEBYSCORE
    max_score = f"({cursor}" if cursor else "+inf"
    pushed = zrevrangebyscore(f"timeline:{me}", max_score, "-inf", 0, limit)
    celebs = get_followed_celebrities(me)                       # small set
    pulled = [recent_posts(c, cursor, limit) for c in celebs]   # cache-hot
    merged = merge_by_score(pushed, flatten(pulled))
    return hydrate(rank(merged)[:limit])                        # fetch content + counts

One subtlety: the threshold is not purely about follower count but about the cost of fan-out. An account that posts rarely is cheap to push even with many followers; an account that posts constantly is expensive even with fewer. Mature systems factor in posting velocity, and some even decide push-vs-pull per active-follower count (only fan out to followers who have been online recently), which dramatically shrinks the fan-out set for accounts with many dormant followers.
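A cost-based classifier along these lines might look like the sketch below. Everything in it is an assumption for illustration: the `AuthorStats` fields, the idea of a per-author daily write budget, and the `MAX_DAILY_FANOUT_WRITES` value are not taken from any real system.

```python
from dataclasses import dataclass

MAX_DAILY_FANOUT_WRITES = 1_000_000   # hypothetical per-author write budget


@dataclass
class AuthorStats:
    follower_count: int
    active_follower_ratio: float      # share of followers seen recently, 0..1
    posts_per_day: float              # recent posting velocity


def daily_fanout_cost(stats: AuthorStats) -> float:
    """Expected timeline inserts per day if this author is pushed."""
    active_followers = stats.follower_count * stats.active_follower_ratio
    return active_followers * stats.posts_per_day


def should_push(stats: AuthorStats) -> bool:
    """Push while the author's expected fan-out stays within budget; otherwise pull."""
    return daily_fanout_cost(stats) <= MAX_DAILY_FANOUT_WRITES


# A large but quiet account stays on the push path...
quiet_giant = AuthorStats(follower_count=2_000_000, active_follower_ratio=0.3, posts_per_day=0.5)
# ...while a smaller account that posts constantly is switched to pull.
chatty_mid = AuthorStats(follower_count=200_000, active_follower_ratio=0.5, posts_per_day=40)
```

Under these made-up numbers the 2M-follower account costs about 300k inserts per day and is pushed, while the 200k-follower account costs about 4M and is pulled, which is the inversion a pure follower-count threshold would miss.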

Step 8 — Timeline Cache (Redis)

The pushed home timeline lives in Redis as a per-user sorted set of post-ID pointers, scored for ordering. This is the structure that makes the read path a single O(log n) range query.

Pointers, not posts — store { post_id : score }, ~20 bytes/entry. Never store full post bodies: they're large, they change (likes, edits, deletes), and they'd be duplicated across millions of timelines. Content is hydrated at read time from the shared post store.
Capped length — keep only the newest ~800 entries per user (ZADD then ZREMRANGEBYRANK timeline:{u} 0 -801). Nobody scrolls back 800 posts; deeper history falls back to a slower fan-out-on-read path. Capping is what bounds the 3.2 TB fleet number from the estimation.
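Score = ranking signal or timestamp — for chronological feeds the score is the Snowflake post ID itself (time-sortable). For ranked feeds the score is a precomputed rank, or the cache stays chronological and ranking happens after the read. One caveat: Redis scores are double-precision floats, exact only for integers up to 2^53, so IDs that use the full 64 bits need a score derived from the ID's timestamp bits instead.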
Cold / inactive users — don't maintain timelines for users who haven't logged in for weeks. Skip fanning out to them and rebuild their timeline lazily on next login (pure pull). This alone removes a large fraction of wasted fan-out writes.

Redis

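# Fan-out worker inserts a post pointer into a follower's timeline.
# Redis scores must be numeric, so the score is the post ID as a number
# (here 0x189f3c2a01 = 105750735361, assuming the API string is the hex form).
ZADD   timeline:u_42  105750735361  189f3c2a01   # score = post_id (time-sortable)
ZREMRANGEBYRANK timeline:u_42  0  -801           # cap to newest 800

# Read path: newest 20 strictly older than the cursor (0x189f2b88f0 = 105749645552)
ZREVRANGEBYSCORE timeline:u_42  (105749645552  -inf  LIMIT 0 20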

Because timelines are sharded across the Redis fleet by user ID, fan-out writes spread evenly across shards. The one thing that does not spread is a hot post record, and that asymmetry is why celebrities are pulled rather than pushed: a pulled celebrity post is a single hot key that can be replicated and cached, whereas pushing it would mean millions of writes, many of them into timelines that are rarely read.
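The semantics of the capped sorted set can be modeled in a few lines without a Redis server. The `CappedTimeline` class below is an illustrative stand-in, not a Redis client: `add` mimics ZADD followed by ZREMRANGEBYRANK, and `page` mimics ZREVRANGEBYSCORE with an exclusive upper bound. It assumes the score equals the integer post ID.

```python
import bisect


class CappedTimeline:
    def __init__(self, cap=800):
        self.cap = cap
        self._ids = []                     # post IDs in ascending order

    def add(self, post_id):
        i = bisect.bisect_left(self._ids, post_id)
        if i < len(self._ids) and self._ids[i] == post_id:
            return                         # already present: idempotent, like ZADD
        self._ids.insert(i, post_id)       # late inserts land in sorted position
        overflow = len(self._ids) - self.cap
        if overflow > 0:
            del self._ids[:overflow]       # drop the oldest entries beyond the cap

    def page(self, before=None, limit=20):
        """Newest-first IDs strictly older than `before` (None means from the top)."""
        hi = len(self._ids) if before is None else bisect.bisect_left(self._ids, before)
        lo = max(0, hi - limit)
        return self._ids[lo:hi][::-1]
```

Two properties fall out of the structure: a retried fan-out write cannot duplicate an entry, and an entry older than everything in a full timeline is dropped immediately instead of displacing newer posts.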

Step 9 — Feed Ranking

A reverse-chronological feed is the simple default, but most large feeds are ranked — ordered by predicted relevance, not just recency. Ranking is layered on top of the fan-out machinery as a two-stage pipeline:

Candidate generation — the merged set from hybrid fan-out (pushed timeline + pulled celebrity posts) is the candidate pool, typically a few hundred recent posts. Fan-out's job is to produce candidates cheaply; it is not the final order.
Scoring / ranking — each candidate is scored by a model combining signals: affinity (how much you interact with the author), recency (time decay), engagement (likes/replies velocity), content type, and predicted P(like), P(reply), P(dwell). The top N by score become the page.

A simple, interview-ready scoring function captures the idea without invoking ML infrastructure:

Python

def score(post, viewer):
    affinity   = edge_weight(viewer, post.author_id)   # past interactions
    recency    = exp(-AGE_DECAY * hours_since(post.created_at))
    engagement = log1p(post.like_count + 2 * post.reply_count)
    return W1 * affinity + W2 * recency + W3 * engagement

Ranking introduces a freshness-vs-relevance tension: a purely ranked feed can bury brand-new posts (low engagement so far) and feel stale, so production systems blend in a recency floor or reserve slots for fresh content. Ranking models are served from a separate online inference tier; precomputed features (affinity edges, author stats) are cached so scoring a few hundred candidates stays within the latency budget.
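One simple way to implement "reserved slots for fresh content" is to interleave two already-ordered lists: the model-ranked candidates and the newest posts. The function below is a sketch under that assumption; `blend_fresh`, `fresh_every`, and the slot policy are hypothetical, and real systems may blend inside the scoring model instead.

```python
def _next_unseen(iterator, seen):
    """Advance the iterator to the next post ID not yet placed on the page."""
    for post_id in iterator:
        if post_id not in seen:
            return post_id
    return None


def blend_fresh(ranked, fresh, page_size=20, fresh_every=5):
    """Fill a page from `ranked`, reserving every `fresh_every`-th slot for `fresh`.

    ranked: post IDs ordered by model score, best first.
    fresh:  post IDs ordered by recency, newest first.
    """
    page, seen = [], set()
    ranked_it, fresh_it = iter(ranked), iter(fresh)
    while len(page) < page_size:
        reserved = (len(page) + 1) % fresh_every == 0
        first, second = (fresh_it, ranked_it) if reserved else (ranked_it, fresh_it)
        pick = _next_unseen(first, seen)
        if pick is None:                   # preferred source is empty: fall back
            pick = _next_unseen(second, seen)
        if pick is None:                   # both sources exhausted
            break
        seen.add(pick)
        page.append(pick)
    return page
```

For example, with `fresh_every=3`, slots 3 and 6 go to the newest posts that have not already earned a place by score, so a brand-new post with no engagement still gets a chance to be seen.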

note

Keep ranking after candidate generation, not inside fan-out. If you bake the final rank into the pushed timeline's score, you can't re-rank when the model or the viewer's context changes, and every like would have to rewrite millions of timeline scores. Fan-out produces candidates; ranking orders them at (or near) read time.

Step 10 — Pagination

Infinite scroll on a feed that is constantly receiving new posts at the top is a classic correctness trap. The naive approach — LIMIT 20 OFFSET 40 — is broken here: if three new posts arrive while the user reads page 1, the offset shifts and page 2 repeats items the user already saw (or skips some).

The fix is cursor-based pagination keyed on a stable, monotonic value — the Snowflake post_id. The cursor is the ID of the last item on the current page; the next page asks for items with an ID strictly less than the cursor:

Stable boundaries — new posts arriving above the cursor don't shift the page below it. The user paginates downward through a consistent, ever-older sequence regardless of what arrives up top.
Cheap on Redis — ZREVRANGEBYSCORE timeline:{u} (cursor -inf LIMIT 0 20 is an O(log n + page) range scan, not an offset walk.
"Show new posts" pill — fresh posts above the user's session anchor are surfaced via a separate "N new posts" control that prepends them on demand, rather than silently reflowing the current scroll position.
Snapshot semantics for ranked feeds — when order isn't a single monotonic field, the cursor encodes a session snapshot (a server-side materialized page sequence or a (score, post_id) tuple) so a re-rank mid-scroll doesn't duplicate or drop items.
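The difference between the two approaches shows up in a few lines of simulation. The sketch below uses plain Python lists as a stand-in for the timeline (newest first, integer post IDs); the function names are illustrative. The tuple variant assumes scores are frozen for the session, as the snapshot bullet describes.

```python
def page_by_offset(feed_desc, offset, limit):
    return feed_desc[offset:offset + limit]


def page_by_cursor(feed_desc, cursor, limit):
    """Items strictly older than the cursor (the last post ID already served)."""
    older = [p for p in feed_desc if cursor is None or p < cursor]
    return older[:limit]


def page_by_rank_cursor(scored_desc, cursor, limit):
    """Ranked variant: items are (score, post_id) tuples, highest first.
    The post_id breaks ties so equal scores still have a total order."""
    older = [item for item in scored_desc if cursor is None or item < cursor]
    return older[:limit]


feed = [105, 104, 103, 102, 101, 100]              # newest first
page1_offset = page_by_offset(feed, 0, 3)          # [105, 104, 103]
page1_cursor = page_by_cursor(feed, None, 3)       # [105, 104, 103]

feed = [107, 106] + feed                           # two posts arrive mid-scroll

page2_offset = page_by_offset(feed, 3, 3)          # [104, 103, 102]: repeats 104, 103
page2_cursor = page_by_cursor(feed, page1_cursor[-1], 3)   # [102, 101, 100]
```

The offset page re-serves two posts the user has already seen, while the cursor page continues exactly where page 1 ended.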

Step 11 — Core Architecture & Read/Write Paths

Tying it together: an async, queue-driven write path and a merge-and-hydrate read path.

flow

-- WRITE PATH (publish a post)
POST /api/posts { text, media_ids? }
  → Post service: write post to post store (sharded by post_id)
  → Append post_id to author's own user timeline
  → Lookup author_stats.is_celebrity
       celebrity  → STOP (no fan-out; pulled at read time)
       normal     → publish { post_id, author_id } to Kafka "fanout" topic
  → Return 201 immediately   (fan-out happens off the hot path)

Fan-out workers (consume Kafka "fanout")
  → Load follower_ids for author  (skip inactive followers)
  → For each follower: ZADD timeline:{follower} score post_id
                       ZREMRANGEBYRANK timeline:{follower} 0 -801

-- READ PATH (load home timeline)
GET /api/feed?cursor&limit
  → Read pushed timeline:  ZREVRANGEBYSCORE timeline:{me} (cursor -inf LIMIT 0 limit
  → Pull celebrity posts:  recent_posts(c) for each followed celebrity
  → Merge + rank candidate set
  → Hydrate: batch-fetch post content + live like/reply counts
             from post store (mget, cache-fronted)
  → Filter: drop deleted / blocked / muted
  → Return { items, next_cursor }

The post write returns in milliseconds because it never waits on fan-out — fan-out is decoupled through Kafka, which buffers bursts (a celebrity-free spike of normal posts) and lets the worker tier scale independently. This is the same async-pipeline reasoning used to keep any hot write path fast: do the minimum synchronously, enqueue the amplification, and let consumers absorb it with their own back-pressure. Hydration at read time means a like that landed one second ago is reflected in the count, even though the timeline pointer was written hours earlier.
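The read path's merge, hydrate, and filter steps can be sketched with in-memory stand-ins. In the example below, `post_store` is a plain dict playing the role of the cache-fronted post store, `hidden_authors` stands for the viewer's blocked and muted accounts, and all post IDs are integers; these names and shapes are assumptions. A real service would batch the hydration into one multi-get rather than looking posts up one at a time.

```python
import heapq


def home_feed(pushed_ids, celebrity_streams, post_store, hidden_authors, limit=20):
    """Merge newest-first ID lists, hydrate, filter, and return (items, next_cursor)."""
    merged = heapq.merge(pushed_ids, *celebrity_streams, reverse=True)
    items, last_seen = [], None
    for post_id in merged:
        if post_id == last_seen:
            continue                               # same post from two sources
        last_seen = post_id
        post = post_store.get(post_id)             # hydrate: current content + counts
        if post is None or post["is_deleted"]:
            continue                               # tombstone filter
        if post["author_id"] in hidden_authors:
            continue                               # block / mute filter
        items.append(post)
        if len(items) == limit:
            break
    next_cursor = items[-1]["post_id"] if len(items) == limit else None
    return items, next_cursor
```

Filtering happens after the merge and before the page is cut, so a deleted or muted post never consumes one of the `limit` slots.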

Step 12 — Scaling & Sharding

Each tier scales independently because the stores are sharded by different keys.

Sharding the stores

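Post store — shard by post_id (Snowflake). Reads are point lookups by ID; hydration batches IDs and scatter-gathers across shards. Because the shard key is the post ID, one author's posts are spread across shards, so the per-author user timeline is served from its own author-keyed list of post IDs (the one appended to on the write path) rather than by scanning the post store. Time-sortable IDs still let each shard keep its posts in creation order.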
Timeline cache — shard Redis by user_id. Fan-out writes and feed reads both hash on the user, so a single user's timeline is one shard, one round trip.
Social graph — shard follows by follower_id for the read-path "who do I follow", and maintain a separate followee-indexed representation for the fan-out "who follows X" lookup. The two queries have opposite access keys, so they get opposite shard keys.

Scaling fan-out

Fan-out workers are stateless Kafka consumers — add partitions and consumers to scale throughput. Partition the fan-out topic by author_id so a single hot author's work stays ordered, and rate-limit / shard the work for large (but sub-celebrity) authors so one mid-tier influencer doesn't head-of-line-block the partition. The celebrity cutoff is the pressure-release valve that keeps the very fat tail off this tier entirely.
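One way to "shard the work" for a large author is to split a single fan-out into fixed-size chunks and let the scheduler interleave chunks from different authors. The sketch below shows the idea with plain generators; the job shape, `chunk_size`, and the round-robin policy are assumptions, and in a real deployment each chunk would be its own queue message.

```python
from itertools import zip_longest


def chunked_fanout_jobs(post_id, author_id, follower_ids, chunk_size=1000):
    """Split one post's fan-out into independent jobs of at most chunk_size followers."""
    for start in range(0, len(follower_ids), chunk_size):
        yield {
            "post_id": post_id,
            "author_id": author_id,
            "chunk": start // chunk_size,
            "followers": follower_ids[start:start + chunk_size],
        }


def interleave(*job_streams):
    """Round-robin across authors so one large fan-out cannot starve the others."""
    sentinel = object()
    for group in zip_longest(*job_streams, fillvalue=sentinel):
        for job in group:
            if job is not sentinel:
                yield job


big = chunked_fanout_jobs(1, "big_author", list(range(2500)), chunk_size=1000)
small = chunked_fanout_jobs(2, "small_author", [7, 8], chunk_size=1000)
schedule = [(job["author_id"], len(job["followers"])) for job in interleave(big, small)]
```

Here the small author's single chunk is processed after the first 1,000 of the big author's followers instead of after all 2,500. Chunking also makes retries cheaper, since a failed chunk is re-run on its own.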

Read scaling

The feed service is stateless behind a load balancer; scale horizontally. The timeline-cache read absorbs the vast majority of read load. Hot post records (celebrity posts, viral posts) get an extra in-process LRU layer plus cross-shard replication so a single viral post doesn't saturate one Redis node.

Step 13 — Fault Tolerance & Edge Cases

The feed is allowed to be eventually consistent, which gives a lot of slack — but the edge cases are where correctness bugs hide.

Deleted posts — don't chase down millions of timeline entries to remove a deleted post. Leave the pointer; at read time, hydration skips posts flagged is_deleted (tombstone filter). The stale pointer ages out of the capped timeline naturally.
Unfollow — same idea: don't scrub the unfollowed author's posts from the timeline cache. Filter them out at read time against the current follow set, or simply let them age out. The next rebuild is correct by construction, and no expensive immediate cleanup is needed.
Blocks / mutes — applied as a read-time filter on the candidate set, never baked into fan-out (the block can be added after the post was already fanned out).
New-user cold start — a brand-new account follows nobody, so its pushed timeline is empty. Backfill by pulling recent posts from initial follows, and blend in a "recommended"/discovery feed until the personalized feed has signal.
Out-of-order fan-out — async workers may insert a post late (for example, one from a large sub-celebrity author whose fan-out was queued behind a backlog); because timelines are scored by time-sortable post ID, a late insert still lands in the correct chronological position rather than at the top.
Redis failure — a lost timeline shard is recoverable: timelines are a cache, not the source of truth. Rebuild a user's timeline on demand by pulling from the post store and social graph (the fan-out-on-read fallback). Run Redis with replicas to shrink the blast radius.
Fan-out lag — during a fan-out backlog, a normal author's post may reach followers seconds-to-minutes late. Acceptable by the eventual-consistency requirement; monitor queue depth and autoscale workers before lag becomes visible.

consistency note

The unifying principle across these edge cases: the timeline cache is disposable and approximate; the post store and social graph are the source of truth. Anything that must be correct (deletes, blocks, unfollows, live counts) is enforced at read-time hydration/filtering, not by mutating millions of cached pointers. This keeps writes cheap and makes the cache safe to rebuild at any time.
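Because the cache is disposable, a miss can be served by rebuilding from the sources of truth, which is just the fan-out-on-read path with the result written back. A minimal sketch, with dicts standing in for the social graph (`follows`), the per-author user timelines (`user_timelines`, newest first), and the Redis timeline cache (`cache`); all names are illustrative.

```python
import heapq
from itertools import islice


def rebuild_timeline(user_id, follows, user_timelines, cap=800):
    """Merge the newest-first user timelines of everyone user_id currently follows."""
    streams = [user_timelines.get(author, []) for author in follows.get(user_id, ())]
    return list(islice(heapq.merge(*streams, reverse=True), cap))


def get_timeline(user_id, cache, follows, user_timelines, cap=800):
    timeline = cache.get(user_id)
    if timeline is None:                   # lost shard, evicted entry, or cold user
        timeline = rebuild_timeline(user_id, follows, user_timelines, cap)
        cache[user_id] = timeline          # re-warm so the next read is a cache hit
    return timeline
```

Since the rebuild reads the current follow set, it naturally excludes authors the user has unfollowed; deleted posts are still handled by the read-time tombstone filter during hydration.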

Step 14 — Key Tradeoffs

Decision	Choice	Trade-off accepted
Fan-out strategy	Hybrid (push normal, pull celebrity)	Two code paths + a read-time merge, in exchange for bounded write amplification
Timeline storage	Post-ID pointers in Redis, capped ~800	Deep history needs a slower pull fallback; content hydrated separately
Fan-out execution	Async via Kafka workers	Posts appear in feeds with seconds of lag (eventually consistent)
Ordering	Ranked (model) over pure chronological	More infra + a freshness floor to avoid burying new posts
Pagination	Cursor on Snowflake post_id	No random page access; designed for forward infinite scroll
Deletes / unfollows	Read-time filter (tombstones)	Stale pointers linger in cache until they age out
Counts	Approximate, async	Like/reply counts can lag by seconds; avoids hot-row contention

takeaway

News feed design is the study of taming write amplification. Push precomputes the read and wins for the read-heavy common case; pull saves the write path for the fat tail of celebrities; hybrid fan-out is simply the recognition that the follower distribution forces you to do both. Everything else — pointer-only capped timelines, async Kafka fan-out, read-time hydration and filtering, cursor pagination, ranking after candidate generation — exists to keep one number (fan-out writes/sec and its tail) under control while the feed still renders fresh, correct, and fast.

🎯 interview hot-takes

Fan-out-on-write or fan-out-on-read — which do you choose? Neither alone. Push precomputes each follower's timeline for fast reads but amplifies writes by follower count; pull is cheap to write but slow to read for users who follow thousands. Production uses a hybrid: push for normal users, pull for celebrities, merged at read time.
How do you handle the celebrity / hot-key problem? A 100M-follower post under pure push is a write storm that lags fan-out for minutes. Mark high-follower accounts as celebrities, skip their fan-out, and pull their recent posts at read time. The celebrity's own post lives in a hot cache shared by all readers.
What exactly is hybrid fan-out? Classify authors by follower count: below a threshold, push into each follower's timeline cache; above it, don't fan out — read-pull and merge at request time. This bounds write amplification while keeping the common read path a single cache lookup.
Why store post IDs in the timeline cache instead of full posts? Posts are large, mutable, and duplicated across millions of timelines. Storing ~20-byte post-ID pointers caps memory and lets you hydrate current content/counts at read time, so an edit or delete needs no rewrite of every timeline.
How do you paginate a feed that keeps changing? Cursor-based pagination keyed on a monotonic post ID, never SQL OFFSET. The cursor marks the last item served, so posts arriving above it don't shift page boundaries — no duplicates or skips while scrolling.