1. Requirements & Scope (5 min)
Functional Requirements
- Users see a personalized feed of posts from friends, pages, and groups they follow
- Feed is ranked by relevance (not chronological) — a scoring model determines post order
- Support multiple content types: text, images, videos, links, live streams, stories
- New posts appear in followers’ feeds within seconds (near real-time)
- Infinite scroll with cursor-based pagination — load more posts as user scrolls
Non-Functional Requirements
- Availability: 99.99% — the news feed IS the product. Any downtime is a headline.
- Latency: < 200ms to render the first page of feed. Subsequent pages < 100ms.
- Consistency: Eventual consistency is fine. A post appearing 5 seconds late in some feeds is acceptable. But a post should never permanently disappear from a feed it belongs in.
- Scale: 2 billion DAU, 500M posts/day, each user has ~500 friends on average. Feed generation for 2B users at peak load.
- Freshness: New posts from close friends should appear within 5-10 seconds. Posts from pages/groups can tolerate 30-60 seconds.
2. Estimation (3 min)
Traffic
- 2 billion DAU, each opens feed ~10 times/day
- Feed requests: 2B x 10 = 20 billion feed loads/day = 230K QPS average, 500K QPS peak
- New posts: 500M/day = 5,800 posts/sec
- Each post fans out to ~500 followers (average friends list)
- Fan-out writes: 5,800 x 500 = 2.9 million feed inserts/sec (if fan-out on write)
Storage
- Feed cache per user: store 500 most recent post IDs
- 500 post IDs x 8 bytes = 4KB per user
- 2B users x 4KB = 8 TB for feed cache — fits in a Redis cluster
- Post content storage: 500M posts/day x 5KB avg = 2.5 TB/day → standard DB/object store
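The arithmetic above can be sanity-checked in a few lines; all inputs are the stated assumptions (2B DAU, 10 feed loads/day, 500M posts/day, ~500 followers):

```python
# Rough back-of-envelope check of the estimates above.
DAU = 2_000_000_000
FEED_LOADS_PER_USER = 10
SECONDS_PER_DAY = 86_400

feed_loads_per_day = DAU * FEED_LOADS_PER_USER            # 20 billion/day
avg_qps = feed_loads_per_day / SECONDS_PER_DAY            # ~231K average QPS

posts_per_sec = 500_000_000 / SECONDS_PER_DAY             # ~5,800 posts/sec
fanout_writes_per_sec = posts_per_sec * 500               # ~2.9M inserts/sec

feed_cache_bytes = DAU * 500 * 8                          # 500 post IDs x 8 bytes each
print(f"{avg_qps:,.0f} avg QPS, {fanout_writes_per_sec:,.0f} fan-out writes/sec, "
      f"{feed_cache_bytes / 1e12:.0f} TB feed cache")
```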
Key Insight
The core trade-off is fan-out on write (pre-compute feeds, 2.9M writes/sec) vs fan-out on read (compute feed at request time, 500K QPS each querying 500 friends). Neither extreme works alone — the answer is a hybrid approach based on follower count.
3. API Design (3 min)
// Get personalized news feed
GET /feed?cursor={cursor}&limit=20
Headers: { "Authorization": "Bearer {token}" }
Response 200: {
"posts": [
{
"post_id": "abc123",
"author_id": "user_456",
"author_name": "Alice",
"content_type": "photo",
"text": "Beautiful sunset!",
"media_url": "https://cdn.fb.com/photos/abc.jpg",
"created_at": 1708632060,
"reactions": {"like": 142, "love": 38, "haha": 5},
"comment_count": 23,
"share_count": 7,
"ranking_score": 0.94 // internal debug field; stripped before responses reach clients
},
...
],
"next_cursor": "eyJ0...", // opaque cursor for pagination
"has_more": true
}
// Create a new post
POST /posts
Body: {
"text": "Beautiful sunset!",
"media_ids": ["media_789"],
"privacy": "friends", // public, friends, only_me, custom
"tagged_users": ["user_111"]
}
Response 201: { "post_id": "abc123" }
// React to a post (live-updates feeds of friends)
POST /posts/{post_id}/reactions
Body: { "type": "like" }
Key Decisions
- Cursor-based pagination (not offset) — stable under concurrent inserts, no “shifting” problem
- Cursor encodes the ranking score + timestamp of the last seen post (not a simple offset)
- Feed content is denormalized in the response — no N+1 queries on the client
- Privacy filtering happens server-side during feed generation, not client-side
4. Data Model (3 min)
Posts Table (MySQL/PostgreSQL, sharded by author_id)
Table: posts
post_id (PK) | bigint (Snowflake ID)
author_id | bigint
content_type | enum(text, photo, video, link, live)
text | text
media_urls | json
privacy | enum(public, friends, only_me, custom)
created_at | timestamp
updated_at | timestamp
is_deleted | boolean
Social Graph (TAO-like graph store)
Edges:
(user_A, FRIENDS_WITH, user_B) -- bidirectional
(user_A, FOLLOWS, page_C) -- unidirectional
(user_A, MEMBER_OF, group_D) -- unidirectional
Stored as adjacency lists in a distributed graph DB
Key: user_A:friends → [user_B, user_C, user_D, ...]
Cached in Memcache/TAO cache (hit rate > 99%)
Feed Cache (Redis)
Key: feed:{user_id}
Type: Sorted Set
Member: post_id
Score: ranking_score (float, higher = more relevant)
Size: top 500 posts per user
TTL: 24 hours (refreshed on access)
Post Engagement (Redis counters + async DB sync)
Key: engagement:{post_id}
Type: Hash
Fields: {likes: 142, loves: 38, comments: 23, shares: 7, views: 5420}
Why These Choices
- Sharded MySQL for posts — simple, well-understood, excellent tooling. Shard by author_id for write locality.
- Graph store for social connections — adjacency list access pattern (get all friends of X) is the primary query. TAO-like system (Facebook’s actual choice) with aggressive caching.
- Redis sorted set for feed cache — O(log N) insert, O(K) range query for top-K, natural scoring/ranking support.
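The feed-cache behavior (scored insert, trim to top 500, top-K read) can be illustrated with a toy in-memory stand-in for one user's sorted set; real deployments use Redis ZADD, ZREMRANGEBYRANK, and ZREVRANGE, this sketch only mimics the semantics:

```python
import bisect

class FeedCache:
    """Toy stand-in for one user's feed:{user_id} sorted set.
    Keeps (score, post_id) pairs sorted ascending, trimmed to max_size."""

    def __init__(self, max_size=500):
        self.max_size = max_size
        self.entries = []  # sorted ascending by (score, post_id)

    def add(self, post_id, score):
        bisect.insort(self.entries, (score, post_id))  # mirrors ZADD
        if len(self.entries) > self.max_size:
            self.entries.pop(0)                        # evict lowest-ranked (ZREMRANGEBYRANK)

    def top(self, k=20):
        # Highest score first, like ZREVRANGE feed:{user_id} 0 k-1
        return [post_id for _, post_id in reversed(self.entries[-k:])]

cache = FeedCache(max_size=3)
cache.add("p1", 0.40)
cache.add("p2", 0.94)
cache.add("p3", 0.61)
cache.add("p4", 0.75)   # cache is full, so the lowest-scored post ("p1") is evicted
```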
5. High-Level Design (12 min)
Architecture
User creates post
│
▼
┌───────────────┐
│ Post Service │ → Write to Posts DB (sharded MySQL)
│ │ → Upload media to CDN
└───────┬───────┘
│
▼
┌───────────────────────────────────────────────────────┐
│ Fan-out Service │
│ │
│ Decision: fan-out on write vs fan-out on read? │
│ │
│ If author has < 10,000 followers (99.9% of users): │
│ → FAN-OUT ON WRITE │
│ → Get author's follower list │
│ → For each follower: compute ranking score │
│ → Insert (post_id, score) into follower's feed cache │
│ │
│ If author has ≥ 10,000 followers (celebrities/pages): │
│ → FAN-OUT ON READ │
│ → Store post in author's "outbox" only │
│ → When follower loads feed, pull from celebrity │
│ outboxes and merge with pre-computed feed │
└───────────────────────────────────────────────────────┘
│ │
▼ ▼
┌───────────────┐ ┌───────────────┐
│ Feed Cache │ │ Celebrity │
│ (Redis) │ │ Outbox Cache │
│ Per-user │ │ (Redis) │
│ sorted set │ │ Per-celebrity │
└───────┬───────┘ │ recent posts │
│ └───────┬───────┘
│ │
└──────────┬─────────────────┘
│
▼
┌───────────────────────────────────────────┐
│ Feed Assembly Service │
│ │
│ On feed request: │
│ 1. Read pre-computed feed from cache │
│ 2. Fetch celebrity outboxes (followed by │
│ this user) │
│ 3. Merge and re-rank │
│ 4. Apply privacy filters │
│ 5. Hydrate posts (add content, author │
│ info, engagement counts) │
│ 6. Return paginated result │
└───────────────────────────────────────────┘
│
▼
Client (infinite scroll, cursor-based pagination)
Ranking Pipeline
For each candidate post in user's feed:
score = w1 * affinity(user, author) // how close are they?
+ w2 * post_quality(post) // engagement rate, content type
+ w3 * timeliness(post.created_at) // time decay function
+ w4 * content_type_preference(user) // does user engage with photos? videos?
+ w5 * diversity_bonus(post) // avoid 5 posts from same person
Where:
affinity = f(messages_exchanged, profile_views, likes_given, comment_threads)
post_quality = f(like_rate, comment_rate, share_rate, time_spent_viewing)
timeliness = 1 / (1 + (now - created_at) / half_life) // reciprocal time decay; score halves at half_life = 6 hours
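The weighted sum above can be sketched directly. The weights and feature values here are made-up placeholders; in production a learned model supplies both:

```python
# Hypothetical weights -- the real ranking model learns these.
WEIGHTS = {"affinity": 0.35, "quality": 0.25, "timeliness": 0.20,
           "type_pref": 0.10, "diversity": 0.10}
HALF_LIFE_SECONDS = 6 * 3600

def timeliness(created_at, now):
    # Reciprocal time decay: score halves once the post is half_life old.
    return 1 / (1 + (now - created_at) / HALF_LIFE_SECONDS)

def rank_score(features, created_at, now):
    """Weighted sum of the feature terms from the ranking pipeline above."""
    return (WEIGHTS["affinity"] * features["affinity"]
            + WEIGHTS["quality"] * features["quality"]
            + WEIGHTS["timeliness"] * timeliness(created_at, now)
            + WEIGHTS["type_pref"] * features["type_pref"]
            + WEIGHTS["diversity"] * features["diversity"])

now = 1_708_632_060
score = rank_score({"affinity": 0.8, "quality": 0.6, "type_pref": 0.7, "diversity": 1.0},
                   created_at=now - 6 * 3600, now=now)  # a 6-hour-old post
```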
Components
- Post Service: Handles CRUD for posts. Writes to sharded MySQL. Triggers fan-out asynchronously via message queue.
- Fan-out Service: Reads social graph, determines fan-out strategy (write vs read based on follower count threshold), pushes to feed caches. Consumes from Kafka topic.
- Feed Cache (Redis cluster): Stores pre-computed feed per user as sorted sets. 8TB total across the cluster.
- Celebrity Outbox Cache: Stores recent posts from high-follower accounts. Small (100K accounts x 100 posts x 8 bytes = 80MB).
- Feed Assembly Service: Merges pre-computed feed + celebrity outboxes, re-ranks, filters, hydrates with content. Stateless, horizontally scaled.
- Ranking Service: ML model that scores each post for a given user. Called during fan-out (lightweight scoring) and at assembly time (re-ranking).
- Social Graph Service (TAO): Provides follower lists, friend lists, mutual friends. Heavily cached.
- CDN: Serves all media (images, videos, thumbnails). Feed API returns CDN URLs.
6. Deep Dives (15 min)
Deep Dive 1: Fan-Out On Write vs Fan-Out On Read — The Hybrid Approach
Fan-out on write (push model):
When user A creates a post:
1. Get A's followers: [B, C, D, ..., Z] (500 followers)
2. For each follower: insert post_id into their feed cache
3. Total writes: 500
Pros:
- Feed reads are pre-computed → O(1) to fetch feed
- 200ms feed latency easily achievable
- New posts appear in feeds within seconds
Cons:
- Celebrity problem: Kim Kardashian posts → 300M fan-out writes
300M writes / 100K writes/sec = 50 minutes to propagate one post
- Wasted writes: many followers are inactive, will never see the post
- Feed cache invalidation on unfriend/unfollow is complex
Fan-out on read (pull model):
When user B requests their feed:
1. Get B's friends list: [A, C, D, ..., Z] (500 friends)
2. For each friend: fetch their recent posts
3. Merge, rank, return top 20
Pros:
- No write amplification for celebrities
- Always fresh — computes on demand
- No wasted work for inactive users
Cons:
- Slow: 500 friends x 10 posts = 5000 post lookups + ranking per request
- At 500K QPS: 2.5 billion lookups/sec — expensive
- Latency: 500ms+ for feed generation (too slow)
Hybrid approach (Facebook’s actual design):
Threshold: 10,000 followers
For 99.9% of users (< 10K followers):
→ Fan-out on write
→ Fast, predictable, reasonable write amplification
For 0.1% of users (celebrities, pages, ≥ 10K followers):
→ Fan-out on read
→ Celebrity's posts stored in their outbox only
→ When a user loads their feed:
a. Read pre-computed feed cache (from normal friends) — fast
b. Read outboxes of followed celebrities (typically 20-50 accounts) — fast
c. Merge and re-rank — lightweight
Cost analysis:
- Pre-computed fan-out: 5,800 posts/sec x 500 avg followers = 2.9M writes/sec ✓
- Celebrity posts excluded: the top 0.1% of accounts generate roughly half of all fan-out writes, so routing them through outboxes halves the load to ~1.45M writes/sec
- Celebrity outbox reads: 500K QPS x 30 celebrity outboxes = 15M reads/sec from outbox cache
- Total cost: significantly cheaper than either pure approach
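The hybrid decision reduces to a single branch on follower count. A toy sketch, with plain dicts and lists standing in for the graph service, feed caches, and outbox store:

```python
CELEBRITY_THRESHOLD = 10_000  # follower-count threshold from the design above

def fan_out(post_id, author_id, followers, feed_caches, outboxes, score_fn):
    """Hybrid fan-out: push into follower feed caches for normal authors,
    write once to the author's outbox for celebrity accounts."""
    if len(followers) >= CELEBRITY_THRESHOLD:
        outboxes.setdefault(author_id, []).append(post_id)   # fan-out on read
        return 1                                             # one write, any audience size
    for follower_id in followers:                            # fan-out on write
        feed_caches.setdefault(follower_id, []).append(
            (score_fn(follower_id, post_id), post_id))
    return len(followers)

feed_caches, outboxes = {}, {}
writes_normal = fan_out("p1", "alice", ["bob", "carol"],
                        feed_caches, outboxes, lambda u, p: 0.5)
writes_celeb = fan_out("p2", "celeb", [f"u{i}" for i in range(10_000)],
                       feed_caches, outboxes, lambda u, p: 0.5)
```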
Deep Dive 2: Feed Ranking and the Cold Start Problem
Two-stage ranking pipeline:
Stage 1: Candidate Generation (lightweight, broad)
Input: ~2000 candidate posts (from feed cache + celebrity outboxes + ads)
Model: Simple heuristic or lightweight ML model
Output: Top 500 candidates with preliminary scores
Latency budget: 20ms
Stage 2: Ranking (heavy, precise)
Input: 500 candidates
Model: Deep neural network (multi-tower architecture)
Features per (user, post) pair:
- User features: age, location, interests, past engagement patterns
- Post features: content type, author, engagement velocity, freshness
- Contextual: time of day, device type, user's recent activity
- Social: mutual friends who engaged, affinity score
Output: Top 20-50 posts, fully ranked
Latency budget: 50ms
Post-ranking adjustments:
- Diversity: no more than 2 posts from same author in top 10
- Content type mixing: ensure variety (not all photos, not all links)
- Demotions: clickbait detection, misinformation flags, over-shared content
- Ads insertion: blend ads at positions 3, 7, 12, ... (ad auction determines which ads)
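The diversity rule ("no more than 2 posts from the same author in the top 10") is a simple greedy pass over the ranked list. A sketch with hypothetical (post_id, author_id) tuples:

```python
from collections import Counter

def apply_diversity(ranked_posts, max_per_author=2, window=10):
    """Re-order so no author exceeds max_per_author in the top `window` slots.
    ranked_posts: list of (post_id, author_id), best first. Posts that would
    violate the cap are demoted below the window rather than dropped."""
    head, overflow, counts = [], [], Counter()
    for post_id, author_id in ranked_posts:
        if len(head) < window and counts[author_id] < max_per_author:
            head.append((post_id, author_id))
            counts[author_id] += 1
        else:
            overflow.append((post_id, author_id))
    return head + overflow

ranked = [("a1", "A"), ("a2", "A"), ("a3", "A"), ("b1", "B")]
diversified = apply_diversity(ranked, max_per_author=2, window=3)  # "a3" is demoted
```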
Cold start problem:
New user with no friends → empty feed:
1. Onboarding: suggest friends, pages to follow (based on contacts, interests)
2. Bootstrap feed: show popular content from followed pages and groups
3. Explore feed: algorithmically curated content from public posts (Discovery tab)
4. After 10+ friends: transition to normal feed pipeline
New post with zero engagement:
1. Show to a small sample of followers (exploration)
2. Measure early engagement (first 100 impressions)
3. If engagement rate is high → boost distribution (exploit)
4. If engagement rate is low → reduce distribution
5. This is a multi-armed bandit problem (Thompson Sampling or epsilon-greedy)
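The explore/exploit loop above can be sketched with epsilon-greedy (Thompson Sampling would replace the argmax with posterior sampling; the optimistic prior for zero-impression posts is an assumption to guarantee new posts get sampled):

```python
import random

def choose_distribution(posts, epsilon=0.1, rng=random):
    """Epsilon-greedy pick of which post to show next.
    posts: dict of post_id -> (impressions, engagements).
    With probability epsilon, explore uniformly at random; otherwise
    exploit the post with the best observed engagement rate."""
    def rate(stats):
        impressions, engagements = stats
        return engagements / impressions if impressions else 1.0  # optimistic prior
    if rng.random() < epsilon:
        return rng.choice(list(posts))
    return max(posts, key=lambda p: rate(posts[p]))

# With epsilon=0 (pure exploit), the unexplored post wins on its optimistic prior.
posts = {"new_post": (0, 0), "seasoned_post": (100, 5)}
pick = choose_distribution(posts, epsilon=0.0)
```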
Deep Dive 3: Pagination — Why Cursor-Based, Not Offset-Based
Offset-based pagination problems:
GET /feed?offset=20&limit=20
Problem 1: Shifting window
- User loads page 1 (posts 1-20) at t=0
- 5 new posts are created at t=1
- User loads page 2 (offset=20): gets posts 16-35 (old posts 16-20 appear again)
- User sees duplicates
Problem 2: Performance
- offset=10000 → database must skip 10,000 rows
- Gets slower as user scrolls deeper into feed
Problem 3: Cache invalidation
- New posts change all offsets
- Every cached page becomes stale
Cursor-based pagination:
GET /feed?cursor=eyJzY29yZSI6MC44NSwidCI6MTcwODYzMjA2MH0=&limit=20
Cursor encodes: { "score": 0.85, "timestamp": 1708632060, "post_id": "last_seen_id" }
How it works:
1. First page: fetch top 20 from feed cache (highest scores)
2. Response includes next_cursor = encoded(last_post.score, last_post.timestamp, last_post.id)
3. Next page: fetch 20 posts with score < cursor.score
(if score ties, use timestamp; if both tie, use post_id as tiebreaker)
Benefits:
- Stable pagination: new posts don't shift existing pages
- O(log N) seek: Redis ZRANGEBYSCORE with score < cursor
- Cacheable: cursor points to a stable position in the sorted set
- Works with real-time insertions (new posts get high scores, appear at top, don't affect deeper pages)
Implementation (the "(" makes the upper bound exclusive, so the last-seen post is not returned again):
ZREVRANGEBYSCORE feed:{user_id} ({cursor_score} -inf LIMIT 0 20
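The cursor round trip can be sketched in pure Python, with a sorted list standing in for the Redis sorted set (the field names inside the cursor are illustrative):

```python
import base64
import json

def encode_cursor(score, timestamp, post_id):
    """Opaque cursor: base64-encoded JSON of the last-seen post's sort keys."""
    payload = {"score": score, "t": timestamp, "id": post_id}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def fetch_page(feed, cursor=None, limit=20):
    """feed: list of (score, timestamp, post_id) tuples, sorted best-first
    (score desc, then timestamp desc, then post_id as tiebreakers).
    Stands in for ZREVRANGEBYSCORE with an exclusive upper bound."""
    if cursor:
        c = decode_cursor(cursor)
        key = (c["score"], c["t"], c["id"])
        feed = [p for p in feed if p < key]   # strictly past the cursor position
    page = feed[:limit]
    next_cursor = encode_cursor(*page[-1]) if page else None
    return page, next_cursor

feed = [(0.9, 100, "a"), (0.9, 90, "b"), (0.7, 120, "c"), (0.5, 80, "d")]
page1, cur = fetch_page(feed, limit=2)         # "a", "b"
page2, _ = fetch_page(feed, cursor=cur, limit=2)  # "c", "d" -- no duplicates
```

Note the score tie between "a" and "b": the tuple comparison falls through to the timestamp, so pagination stays stable even with equal scores.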
7. Extensions (2 min)
- Real-time feed updates (live feed): Use WebSocket or Server-Sent Events to push new posts to the client without page refresh. When a post is fanned out to a user’s feed cache, also publish to their WebSocket channel. Client-side merges new posts at the top with a “New posts available” banner.
- Content moderation in the feed pipeline: Before fan-out, run posts through an automated moderation pipeline (hate speech detection, nudity detection, misinformation classifiers). Flag or suppress posts that violate policies. This adds 1-2 seconds to fan-out latency but prevents harmful content from ever reaching feeds.
- Multi-feed support: Users expect different feeds: main feed, stories feed, Reels/short video feed, Groups feed, Marketplace feed. Each has its own ranking model and candidate pool, but shares the underlying fan-out and caching infrastructure.
- Feed quality metrics and experimentation: A/B test ranking model changes on 1% of users. Measure metrics: time spent, meaningful social interactions (comments, reactions on friends’ posts vs passive scrolling), user retention. Feed ranking changes are the highest-impact experiments at any social company.
- Cross-posting and aggregation: When a user shares an article that 15 of their friends also shared, aggregate into a single feed item (“Alice, Bob, and 13 others shared this article”) instead of showing 15 identical posts.
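The aggregation rule in the last bullet reduces to grouping shares by URL. A toy sketch; a real system would aggregate on canonicalized URLs and rank the merged story:

```python
from collections import defaultdict

def aggregate_shares(feed_items):
    """Collapse friends sharing the same link into one feed story.
    feed_items: list of (sharer_name, url); first-share order is preserved."""
    by_url, order = defaultdict(list), []
    for name, url in feed_items:
        if url not in by_url:
            order.append(url)
        by_url[url].append(name)
    stories = []
    for url in order:
        names = by_url[url]
        if len(names) == 1:
            stories.append(f"{names[0]} shared {url}")
        elif len(names) == 2:
            stories.append(f"{names[0]} and {names[1]} shared {url}")
        else:
            stories.append(f"{names[0]}, {names[1]}, and {len(names) - 2} others shared {url}")
    return stories

stories = aggregate_shares([("Alice", "url1"), ("Bob", "url1"),
                            ("Carol", "url1"), ("Dave", "url1"),
                            ("Erin", "url2")])
```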