What is the difference between fan-out on write and fan-out on read?

Fan-out on write (push) inserts a new post into every follower's precomputed feed at posting time, making feed reads trivially fast but making each post expensive in proportion to the follower count. Fan-out on read (pull) stores the post once and assembles the feed by querying all followed accounts when the user opens the app, making writes cheap but reads slow. Because feed reads vastly outnumber posts, push is the better default - except for accounts with enormous follower counts.

How do you handle celebrity accounts in a news feed?

A celebrity with tens of millions of followers would trigger tens of millions of feed writes from a single post under pure fan-out on write. The fix is a hybrid model: use fan-out on write for normal accounts, but skip it for accounts above a follower threshold. When a user opens their feed, the system reads their precomputed feed and additionally pulls recent posts from the handful of celebrities they follow, then merges the two.

How is a news feed ranked?

A ranked feed scores each candidate post on signals such as recency, the viewer's affinity with the author, and predicted engagement, then orders by score rather than purely by time. Ranking is applied at read time over a bounded candidate set - the precomputed feed entries plus any celebrity pulls - so the score can reflect fresh engagement data. A chronological feed skips this and simply sorts by timestamp.

Why use cursor-based pagination for a feed instead of offset?

A feed's head constantly shifts as new posts arrive, so offset-based pagination (skip N, take M) returns duplicates and gaps as items move between pages. Cursor-based pagination encodes the position of the last seen item - its score and ID - and asks for items after that point, which stays correct regardless of new insertions at the head.

Is the precomputed feed the source of truth?

No. The precomputed feed is derived data - it can always be rebuilt by pulling recent posts from the accounts a user follows. The sources of truth are the posts store and the social graph. This means a lost or corrupted feed store degrades the system to pull-based assembly rather than losing data, which is why the feed store can be treated as a rebuildable cache.

How fast must a new post appear in followers' feeds?

A news feed is eventually consistent, so a propagation delay of a few seconds is acceptable - fan-out workers drain a queue asynchronously. The one exception is the author's own view: they should see their post immediately, which is handled by injecting the just-created post directly into their feed response rather than waiting for fan-out.

Design a News Feed: System Design Interview 2026

A news feed is two systems wearing one name. There is the moment a user posts, which must be cheap and instant for them - and there is the moment a follower opens the app, which must assemble a relevant, ranked feed in under a couple of hundred milliseconds. Connecting the two is a fan-out problem, and it is dominated by one brutal asymmetry: most accounts have a few hundred followers, and a handful have a hundred million. Any design that ignores that asymmetry falls apart on the first celebrity post.

This walkthrough assumes the 6-step system design framework and applies it at senior depth. It is Part 5 of a system design series.

The Problem
Step 1 - Clarify Requirements
Step 2 - Estimate Scale
Step 3 - API and Data Model
Step 4 - High-Level Design
Step 5 - Deep Dive: Fan-Out on Write, on Read, and the Hybrid
Step 6 - Bottlenecks and Trade-offs
Reference Architecture
Common Mistakes in the Interview
Quick Reference
Related Articles

The Problem

We are designing the home feed of a social platform: when a user opens the app, they see recent posts from the accounts they follow, in a useful order. The canonical examples are the Twitter/X home timeline and the Instagram feed.

The senior framing is that this is a read-heavy aggregation over a producer set with an extreme skew. Feed reads dominate posts, so the read path must be fast - which argues for precomputing each feed. But precomputing a feed means doing work proportional to a poster's follower count, and that count ranges over six orders of magnitude. The entire design is the search for a strategy that keeps both the write cost and the read cost bounded.

Step 1 - Clarify Requirements

Functional requirements:

A user can publish a post.
A user can view their home feed: recent posts from accounts they follow.
The feed is paginated for infinite scroll.

Out of scope (name, then defer): the follow/social-graph service itself (we assume getFollowers(userId) and getFollowees(userId) exist), the media storage for post content, and the machine-learning ranking model, which we treat at the system level only.

Non-functional requirements:

Read-heavy. Feed opens vastly outnumber posts - assume a ratio of 50:1 or more.
Low read latency. A feed open should complete in roughly 200 ms.
Eventual consistency is fine. A post taking a few seconds to reach followers' feeds is acceptable.
Extreme fan-out skew. Most accounts have hundreds of followers; a few have tens of millions. This is the defining constraint.

Two questions to settle: chronological or ranked? Modern feeds are ranked, so we design for a ranked feed and note that chronological is the simpler subset. And followed accounts only, or recommendations too? We design the classic followed-accounts feed.

Step 2 - Estimate Scale

The arithmetic here is what exposes the celebrity problem.

Reads. Assume 500 million daily active users opening the feed ~10 times/day: 5 billion feed reads/day ≈ ~58,000 reads/sec average, perhaps ~250,000/sec at peak.

Posts. At ~0.2 posts/user/day, that is 100 million posts/day ≈ ~1,200 posts/sec average.

Fan-out amplification. With an average of ~200 followers per account, fan-out on write turns 100M posts into 100M x 200 = 20 billion feed insertions/day ≈ ~230,000 inserts/sec. That is the routine cost of pushing.

The celebrity number. An account with 100 million followers, posting once, generates 100 million feed insertions from that single post. No amount of averaging hides this - at ~230,000 inserts/sec, one celebrity post is roughly seven minutes of the entire platform's average fan-out volume (100M / ~230,000 per second ≈ 430 seconds), arriving as a single burst. This single figure is why a pure push design fails.

Storage. Feeds store post IDs, not post bodies: ~800 entries x ~16 bytes ≈ ~13 KB per user, x 500M users ≈ ~6 TB of precomputed feed data, which lives in a fast store.
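The arithmetic in this step can be reproduced with a few lines of Python. This is a minimal sketch using only the assumptions stated above; the function name `estimate_scale` and the dictionary keys are illustrative, not part of any real API.

```python
SECONDS_PER_DAY = 86_400


def estimate_scale(
    dau=500_000_000,                  # daily active users (assumed above)
    feed_opens_per_user=10,           # feed opens per user per day
    posts_per_user=0.2,               # posts per user per day
    avg_followers=200,                # average followers per account
    celebrity_followers=100_000_000,  # followers of the largest account
    feed_entries=800,                 # cap on entries per precomputed feed
    bytes_per_entry=16,               # bytes per stored feed entry
):
    """Return the back-of-envelope numbers for the feed workload."""
    reads_per_day = dau * feed_opens_per_user
    posts_per_day = dau * posts_per_user
    inserts_per_day = posts_per_day * avg_followers
    inserts_per_sec = inserts_per_day / SECONDS_PER_DAY
    return {
        "reads_per_sec": reads_per_day / SECONDS_PER_DAY,
        "posts_per_sec": posts_per_day / SECONDS_PER_DAY,
        "fanout_inserts_per_sec": inserts_per_sec,
        "read_to_post_ratio": reads_per_day / posts_per_day,
        # How many seconds of average platform fan-out one celebrity post equals.
        "celebrity_post_as_seconds_of_fanout": celebrity_followers / inserts_per_sec,
        "feed_storage_tb": dau * feed_entries * bytes_per_entry / 1e12,
    }


if __name__ == "__main__":
    for name, value in estimate_scale().items():
        print(f"{name}: {value:,.1f}")
```

Changing a single input, such as the average follower count, shows how quickly the routine push cost scales while the read rate stays fixed.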

Step 3 - API and Data Model

```
POST /api/posts
  body: { "authorId": "...", "content": "..." }
  202 Accepted   { "postId": "..." }

GET /api/feed?cursor=<opaque>
  200 OK   { "items": [ ... ], "nextCursor": "<opaque>" }
```

The core entities:

| Entity | Key fields |
| --- | --- |
| Post | `postId`, `authorId`, `content`, `createdAt` - the source of truth |
| Social graph | follower-followee edges; accessed via the follow service |
| Feed | `userId` -> a bounded, ordered list of `(postId, score)` - derived, rebuildable |

The feed is best held as a per-user sorted set (post ID keyed by score), capped at ~800 entries so memory stays bounded and old items fall off.
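To make the capped sorted set concrete, here is a small in-memory sketch. The class name `CappedFeed` and its methods are hypothetical; a real deployment would presumably use a store with native sorted-set operations rather than a Python list, and this version does not handle re-scoring a post that is already in the feed.

```python
import bisect


class CappedFeed:
    """In-memory stand-in for one user's feed: a sorted set with a size cap."""

    def __init__(self, cap=800):
        self.cap = cap
        self._entries = []  # ascending list of (score, post_id)

    def __len__(self):
        return len(self._entries)

    def add(self, post_id, score):
        """Insert a post; when over the cap, drop the lowest-scored entry."""
        entry = (score, post_id)
        i = bisect.bisect_left(self._entries, entry)
        if i < len(self._entries) and self._entries[i] == entry:
            return  # same post and score already present: insert is idempotent
        self._entries.insert(i, entry)
        if len(self._entries) > self.cap:
            del self._entries[0]

    def top(self, n):
        """Return up to n (post_id, score) pairs, highest score first."""
        if n <= 0:
            return []
        return [(pid, s) for s, pid in reversed(self._entries[-n:])]
```

With a cap of 3, adding posts scored 1 through 5 leaves only the three highest, which is the "old items fall off" behaviour described above.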

Pagination is cursor-based, never offset-based. An offset (skip N) is meaningless on a feed whose head shifts every second - new posts push items down, so consecutive pages overlap and skip. A cursor encodes the (score, postId) of the last item seen and asks for items strictly after it, which stays correct under head insertions.
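The difference is easiest to see in code. The sketch below is one possible encoding of the opaque cursor (base64 over JSON is an assumption, as are the function names), and it assumes scores do not change between page requests.

```python
import base64
import json


def encode_cursor(score, post_id):
    """Pack the last-seen (score, postId) into an opaque, URL-safe token."""
    raw = json.dumps([score, post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    """Inverse of encode_cursor; returns a (score, post_id) tuple."""
    score, post_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return (score, post_id)


def page(feed, limit, cursor=None):
    """Return (items, next_cursor) for a feed of (score, post_id) tuples.

    Items are ordered highest score first, with post_id breaking ties, and
    only entries strictly after the cursor position are returned.
    """
    ordered = sorted(feed, reverse=True)
    if cursor is not None:
        last_seen = decode_cursor(cursor)
        ordered = [entry for entry in ordered if entry < last_seen]
    items = ordered[:limit]
    next_cursor = encode_cursor(*items[-1]) if len(ordered) > limit else None
    return items, next_cursor


if __name__ == "__main__":
    feed = [(50, "e"), (40, "d"), (30, "c"), (20, "b"), (10, "a")]
    first, cursor = page(feed, 2)
    feed.append((60, "f"))  # a new post lands at the head between requests
    second, cursor = page(feed, 2, cursor)
    print(first, second)
    print(sorted(feed, reverse=True)[2:4])  # offset paging repeats (40, "d")
```

After the head insertion, the cursor request continues from where the reader stopped, while the offset slice returns an item the reader has already seen.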

Step 4 - High-Level Design

The write path and the read path are separated by a fan-out stage and a queue.

```mermaid
flowchart TD
    Client([Client]) -->|POST post| PS[Post Service]
    PS --> PStore[(Posts Store)]
    PS -->|new-post event| Q[Fan-Out Queue]
    Q --> FW[Fan-Out Workers]
    FW -->|getFollowers| Graph[(Social Graph)]
    FW -->|insert postId| Feeds[(Feed Store<br/>per-user sorted sets)]
    Client -->|GET feed| FS[Feed Service]
    FS -->|read precomputed| Feeds
    FS -->|pull celebrity posts| PStore
    FS -->|hydrate IDs| PCache[(Post Cache)]
    FS -->|rank + merge| Client
```

Figure 1. The architecture separates the write path (post -> store -> fan-out queue -> fan-out workers -> per-user feeds) from the read path (read precomputed feed -> pull celebrity posts -> hydrate -> rank -> return). Posting returns 202 immediately; all of the expensive fan-out work runs asynchronously through the queue, the same backpressure-friendly pattern as Part 3.

Posting is cheap: store the post, emit an event, return 202. Fan-out workers consume the event asynchronously - the same durable-queue pattern from Part 3. The feed service reads a precomputed feed, pulls a little extra, ranks, hydrates post IDs into bodies, and returns.

Step 5 - Deep Dive: Fan-Out on Write, on Read, and the Hybrid

This is the core. The question is when the feed is assembled, and the answer is "it depends on the poster" - which is the hybrid model.

Fan-out on write (push)

When a user posts, immediately insert that post's ID into the precomputed feed of every follower.

The feed read becomes trivial - the feed is already assembled, so reading it is a single sorted-set lookup. The cost moves to the write: a post with F followers is F insertions, done asynchronously by fan-out workers off a queue. For the overwhelming majority of accounts - hundreds of followers - this is exactly the right trade, because it spends cheap, deferrable write work to make the hot, latency-critical read path nearly free.

It breaks on two cases. Celebrities: 100M followers means 100M insertions per post - enormous write amplification and minutes of propagation lag. And inactive followers: pushing into the feeds of users who will not open the app for weeks is pure wasted work.
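A fan-out worker for the push path might look roughly like the following. This is a sketch under stated assumptions: the queue is an in-memory `deque`, feeds are plain lists of `(createdAt, postId)`, and the `is_active` filter is a hypothetical hook for skipping inactive followers, not something the design above specifies.

```python
from collections import defaultdict, deque

FEED_CAP = 800  # entries kept per feed, matching the cap used above


def fan_out(event, get_followers, feeds, is_active=lambda user_id: True):
    """Insert one post into each active follower's feed; return the insert count."""
    entry = (event["createdAt"], event["postId"])
    inserted = 0
    for follower in get_followers(event["authorId"]):
        if not is_active(follower):
            continue  # skip feeds nobody is likely to read
        feed = feeds[follower]
        feed.append(entry)
        feed.sort(reverse=True)  # newest first; fine for a sketch
        del feed[FEED_CAP:]      # enforce the cap
        inserted += 1
    return inserted


def drain(queue, get_followers, feeds, is_active=lambda user_id: True):
    """Process queued new-post events until the queue is empty."""
    total = 0
    while queue:
        total += fan_out(queue.popleft(), get_followers, feeds, is_active)
    return total


if __name__ == "__main__":
    followers = {"alice": ["bob", "carol", "dave"]}
    feeds = defaultdict(list)
    queue = deque([{"postId": "p1", "authorId": "alice", "createdAt": 100}])
    print(drain(queue, lambda a: followers.get(a, []), feeds))  # F = 3 inserts
```

The return value makes the cost model visible: one event costs as many inserts as the author has (active) followers.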

Fan-out on read (pull)

When a user opens their feed, query the recent posts of every account they follow, then merge and rank.

Now the write is trivial - just store the post - and celebrities cost nothing extra to post. But the read explodes: a user following 1,000 accounts triggers a 1,000-way scatter-gather on the hot path, every single feed open. Since reads dominate by 50:1 or more, paying the cost there is backwards as a default.
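The scatter-gather on the read path can be sketched with a k-way merge. The helper names `get_followees` and `recent_posts` are assumptions standing in for the follow service and the posts store, and each timeline is assumed to arrive newest first.

```python
import heapq
from itertools import islice


def pull_feed(user_id, get_followees, recent_posts, limit=20):
    """Assemble a chronological feed at read time.

    Returns (entries, query_count), where entries are (created_at, post_id)
    tuples, newest first, and query_count is the number of per-author reads.
    """
    # One posts-store query per followed account: the N-way scatter-gather.
    timelines = [recent_posts(author) for author in get_followees(user_id)]
    merged = heapq.merge(*timelines, key=lambda post: post[0], reverse=True)
    return list(islice(merged, limit)), len(timelines)


if __name__ == "__main__":
    posts = {"a": [(30, "a2"), (10, "a1")], "b": [(20, "b1")]}
    followees = {"u1": ["a", "b"]}
    print(pull_feed("u1", lambda u: followees.get(u, []), lambda a: posts.get(a, [])))
```

The query count equals the number of followed accounts, so a user following 1,000 accounts pays 1,000 reads on every feed open.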

```mermaid
flowchart LR
    subgraph Push["Fan-out on write (push)"]
        P1[User posts] -->|F insertions now| P2[Every follower feed]
        P3[Reader opens feed] -->|1 read| P4[Done - precomputed]
    end
    subgraph Pull["Fan-out on read (pull)"]
        L1[User posts] -->|1 write| L2[Posts store]
        L3[Reader opens feed] -->|N queries| L4[Merge + rank now]
    end
```

Figure 2. The two pure strategies side by side and where each one breaks. Push pays the cost at write time and is fast on read; pull pays it on read and is slow there. Since reads outnumber writes by 50:1 or more, pull is wrong as a default - and push is wrong for a celebrity. Neither alone works, which is what motivates the hybrid model.

The hybrid model

Neither pure strategy works; the senior answer combines them by poster type:

Normal accounts (below a follower threshold) use fan-out on write. Their posts push into followers' precomputed feeds.
Celebrity accounts (above the threshold) are skipped by fan-out. Their posts are not pushed anywhere.
At feed-read time, the feed service reads the user's precomputed feed and pulls recent posts from the handful of celebrities that user follows, then merges and ranks the two sets.

This bounds both costs. Writes never explode, because the accounts with explosive follower counts are exactly the ones excluded from push. Reads never explode, because a user follows only a few celebrities - a pull of ~5 celebrity timelines plus one precomputed-feed read, not a 1,000-way scatter-gather.

```mermaid
sequenceDiagram
    participant U as User
    participant FS as Feed Service
    participant F as Feed Store
    participant P as Posts Store

    U->>FS: GET /feed
    FS->>F: read precomputed feed (push portion)
    F-->>FS: post IDs from normal accounts
    FS->>P: pull recent posts from followed celebrities
    P-->>FS: celebrity post IDs
    Note over FS: merge + rank both sets
    FS-->>U: ranked, hydrated feed page
```

Figure 3. The hybrid feed read in action. The reader's precomputed feed (filled by normal-account fan-out) is combined with on-demand pulls from the handful of celebrities they follow, then ranked and returned. A user with five celebrity follows pays five extra reads, not a thousand-way scatter-gather - this bounded cost is what makes the hybrid work.

The threshold is the tuning knob. Lower it and fewer giant fan-outs occur, but more accounts are pulled at read time; raise it and the reverse. It is set from the platform's actual follower distribution, and a senior answer says so rather than quoting a magic number.
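Putting the write-side routing and the read-side merge together, a minimal in-memory sketch of the hybrid could look like this. All names are illustrative, the threshold is passed in rather than fixed, and `follower_count` is assumed to be a cheap lookup on the social graph.

```python
from collections import defaultdict


def publish(post, posts_by_author, feeds, get_followers, follower_count, threshold):
    """Store the post, then push it only if the author is below the threshold.

    Returns the number of feed insertions performed.
    """
    entry = (post["createdAt"], post["postId"])
    posts_by_author[post["authorId"]].append(entry)  # source of truth, always written
    if follower_count(post["authorId"]) >= threshold:
        return 0  # celebrity: skipped by fan-out, pulled at read time instead
    pushed = 0
    for follower in get_followers(post["authorId"]):
        feeds[follower].append(entry)
        pushed += 1
    return pushed


def read_feed(user_id, posts_by_author, feeds, get_followees, follower_count,
              threshold, limit=20):
    """Merge the precomputed feed with pulls from followed celebrities.

    Returns (entries, celebrity_pulls); entries are newest first.
    """
    candidates = set(feeds.get(user_id, []))
    celebrities = [a for a in get_followees(user_id) if follower_count(a) >= threshold]
    for author in celebrities:
        candidates.update(posts_by_author.get(author, []))
    return sorted(candidates, reverse=True)[:limit], len(celebrities)
```

Using a set for the candidates means a post that was pushed before its author crossed the threshold, and is later also pulled, appears only once. Ranking is left out here; this version merges chronologically.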

Ranking and feed assembly

A chronological feed just sorts by createdAt. A ranked feed scores each candidate - recency, the viewer's affinity with the author, predicted engagement - and orders by score. Ranking runs at read time over a bounded candidate set (the precomputed entries plus celebrity pulls), so the score can use fresh engagement data; the precomputation's job is only to keep that candidate set small, not to finalise the order. The author's own just-posted item is injected directly into their feed response so they get read-your-writes consistency without waiting for fan-out.
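As an illustration of read-time ranking, the sketch below combines the three signals named above into a single score. The weights, the six-hour half-life, and the field name `predictedEngagement` are placeholders chosen for the example, not values from any real ranking model.

```python
def score(post, viewer_affinity, now, half_life_s=6 * 3600,
          w_recency=1.0, w_affinity=1.0, w_engagement=1.0):
    """Weighted sum of recency decay, author affinity, and predicted engagement."""
    age = max(0.0, now - post["createdAt"])
    recency = 0.5 ** (age / half_life_s)  # halves every half_life_s seconds
    affinity = viewer_affinity.get(post["authorId"], 0.0)
    engagement = post.get("predictedEngagement", 0.0)
    return w_recency * recency + w_affinity * affinity + w_engagement * engagement


def rank(candidates, viewer_affinity, now, just_posted=None, limit=20):
    """Order a bounded candidate set by score, highest first.

    If the viewer has just published a post, it is placed at the top so the
    author sees it without waiting for fan-out.
    """
    ranked = sorted(candidates, key=lambda p: score(p, viewer_affinity, now), reverse=True)
    if just_posted is not None:
        ranked = [just_posted] + [p for p in ranked if p["postId"] != just_posted["postId"]]
    return ranked[:limit]
```

With these placeholder weights, a six-hour-old post from a high-affinity author can outrank a brand-new post from an account the viewer rarely interacts with, which is exactly where a ranked feed departs from a chronological one.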

Consistency model

The feed is eventually consistent: fan-out workers drain their queue over seconds, so a post reaches followers shortly after publication, not instantly - and for a feed that is fine. The deliberate exception is the author's own view, handled by direct injection as above. The feed store itself is derived data: it can always be rebuilt by pulling from the posts store and social graph, so it is a rebuildable cache, not a source of truth.
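Treating the feed store as a rebuildable cache could be sketched as follows. The dictionary standing in for the feed store and the helper names are assumptions; the point is only that a missing feed is recomputed from the two sources of truth and written back.

```python
def rebuild_feed(user_id, get_followees, recent_posts, cap=800):
    """Recompute a feed from the social graph and posts store (pull-based)."""
    entries = {post for author in get_followees(user_id) for post in recent_posts(author)}
    return sorted(entries, reverse=True)[:cap]


def load_feed(user_id, feed_store, get_followees, recent_posts, cap=800):
    """Return (feed, source): the precomputed feed if present, else a rebuild."""
    feed = feed_store.get(user_id)
    if feed is not None:
        return feed, "precomputed"
    feed = rebuild_feed(user_id, get_followees, recent_posts, cap)
    feed_store[user_id] = feed  # repopulate the cache for the next read
    return feed, "rebuilt"
```

Wiping the store in this sketch changes only the `source` label and the cost of the next read, never the resulting feed.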

Failure modes

Fan-out backlog. A surge of posting grows the fan-out queue and propagation lag rises. The hybrid model already removes the worst offenders (celebrities) from the queue; beyond that, the queue absorbs the spike and workers autoscale - the Part 3 backpressure pattern.
Feed store node loss. Because the feed is derived, a lost shard degrades affected users to pull-based assembly while their feeds are rebuilt - degradation, not data loss.
Celebrity post thundering herd. Millions opening their feeds right after a celebrity posts all pull and hydrate the same post. That is a hot key on the post object, solved by the post cache - the caching and hot-key techniques from Part 4.
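For the hot-key case, hydration through a post cache might be sketched like this. The cache is a plain dictionary and `load_posts` is a hypothetical batched lookup against the posts store; the sketch does not cover concurrent misses for the same key, eviction, or expiry.

```python
def hydrate(post_ids, cache, load_posts):
    """Turn post IDs into post bodies, reading the posts store only on cache misses.

    load_posts takes a list of IDs and returns {post_id: post}; IDs that no
    longer exist are simply absent from the result and dropped from the page.
    """
    missing = [pid for pid in dict.fromkeys(post_ids) if pid not in cache]
    if missing:
        cache.update(load_posts(missing))  # one batched store read for all misses
    return [cache[pid] for pid in post_ids if pid in cache]
```

Once a popular post is in the cache, subsequent feed opens that include it are served without touching the posts store.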

Multi-region

Feeds are regional - each region holds its users' feed stores, with region affinity by userId. The posts store and social graph are globally replicated. Fan-out workers in each region consume a global new-post event stream, so a post by an author in one region reaches followers in every region. Celebrity pulls hit the local posts replica.

Evolution path

| Stage | Approach |
| --- | --- |
| Launch | Pure fan-out on read - simplest, fine when follower counts are small |
| Growth | Add fan-out on write so the hot read path is precomputed |
| Scale | Hybrid by poster type, ranked feed, multi-region feed stores |

Adopt cursor-based pagination from day one - offset pagination is a trap that is painful to undo - and keep posts and feeds as separate stores. Defer the celebrity hybrid, the ranking model, and multi-region until follower skew and traffic force them.

Observability

Track feed-load p99 (the headline), fan-out queue depth and lag (this is the propagation-delay metric), fan-out write rate, feed-store hit ratio, ranking latency, and the post count per celebrity tier. Reasonable SLOs: 99% of feed loads under 200 ms, and 99% of posts visible to followers within 30 seconds.
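The two SLOs can be checked against raw samples with a nearest-rank percentile. This is a sketch only: the function names are illustrative, and a production system would more likely compute percentiles from histograms in its metrics pipeline than from raw lists.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not values:
        raise ValueError("percentile of an empty sample is undefined")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_slos(feed_load_ms, propagation_s, load_budget_ms=200, propagation_budget_s=30):
    """Compare p99 feed-load latency and p99 propagation delay with their budgets."""
    load_p99 = percentile(feed_load_ms, 99)
    propagation_p99 = percentile(propagation_s, 99)
    return {
        "feed_load_p99_ms": load_p99,
        "feed_load_ok": load_p99 < load_budget_ms,
        "propagation_p99_s": propagation_p99,
        "propagation_ok": propagation_p99 <= propagation_budget_s,
    }
```

The propagation samples would come from the fan-out lag metric: the time between a post being accepted and its insertion into a follower's feed.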

Step 6 - Bottlenecks and Trade-offs

Celebrity fan-out is the defining bottleneck - unbounded write amplification, resolved only by excluding celebrities from push.
Feed read latency is kept low by precomputing the common case, which is the entire reason push exists.
Feed store memory is bounded by capping each feed at ~800 entries.
Ranking cost is contained by ranking only a bounded candidate set, never the whole follow graph.
Pagination stability under a shifting feed head requires cursors, and a ranked feed may additionally snapshot the candidate set per session.

Reference Architecture

The pattern this problem teaches, reusable well beyond feeds:

Precompute the expensive read for the common case (fan-out on write), compute on read for the skewed tail (fan-out on read), and merge the two - a hybrid that keeps both the write cost and the read cost bounded.

```mermaid
flowchart LR
    subgraph Common["Common case - precomputed"]
        C1[Post] -->|push, async| C2[(Per-user feeds)]
    end
    subgraph Tail["Skewed tail - computed on read"]
        T1[Celebrity post] --> T2[(Posts store)]
    end
    Read[Feed read] --> C2
    Read --> T2
    Read --> Merge[Merge + rank]
```

Figure 4. The reference architecture as two flows ending at a merge. The common case is precomputed for cheap reads; the skewed tail is computed on demand; the merge brings them together at read time. Whenever a read-heavy aggregation has a bimodal producer distribution - a few prolific producers and many small ones - this shape applies.

The same shape recurs wherever a read-heavy aggregation faces a skewed producer distribution: a notification inbox, an activity stream, a "latest from people you follow" panel. Precompute for the many, compute on demand for the few, and merge - rather than forcing one strategy onto a workload that has two populations.

Common Mistakes in the Interview

Choosing pure push or pure pull and never confronting the celebrity problem.
Offset-based pagination on a feed whose head shifts continuously, producing duplicates and gaps.
Pushing into inactive users' feeds, spending write work on feeds no one will read.
Fanning out synchronously in the post request path instead of off a queue.
Treating the feed store as the source of truth rather than rebuildable derived data.
Forgetting read-your-writes for the author's own post.
Storing full post bodies in every feed instead of IDs hydrated at read time.

Quick Reference

| Topic | Key Point |
| --- | --- |
| Core pattern | Hybrid fan-out: push for normal accounts, pull for celebrities, merge at read |
| Fan-out on write | Fast reads, write cost proportional to follower count - the common-case default |
| Fan-out on read | Cheap writes, expensive scatter-gather reads - right only for the skewed tail |
| Celebrity problem | One post = millions of writes; fix by excluding celebrities from push |
| Threshold | Follower count that splits push from pull; tuned from the real distribution |
| Ranking | At read time, over a bounded candidate set so scores stay fresh |
| Pagination | Cursor-based on `(score, postId)`; offset breaks on a shifting head |
| Feed store | Per-user capped sorted set; derived, rebuildable - not the source of truth |
| Consistency | Eventually consistent; inject the author's own post for read-your-writes |
| Multi-region | Regional feed stores; global post-event stream feeds fan-out everywhere |

Related Articles

System Design Interview Problems: A Senior's Roadmap - the full series index and pattern library.
System Design Interview Guide: The 6-Step Framework - the method this walkthrough applies.
Design a Notification Service - Part 3; the durable-queue fan-out pattern reused here.
Design a Distributed Cache - Part 4; post hydration and hot-key handling for viral posts.
Design a Chat System - Part 6; the large-group fan-out reappears as broadcast channels.

This is Part 5 of a 12-part system design series where each post solves one problem around one core pattern. Next: Design a Chat System.
