1. Requirements & Scope (5 min)
Functional Requirements
- Create polls — Creators/advertisers can define survey questions (multiple choice, single select, rating scale) and attach them to specific timestamps in a video
- Render polls mid-roll — Display a non-intrusive overlay poll at the configured timestamp during video playback, without pausing or blocking the video
- Collect votes — Record user responses in real time, enforce one vote per user per poll, and allow changing a vote before the poll closes
- Real-time results — Show live vote counts / percentages to the user after they vote (instant feedback)
- Analytics dashboard — Provide creators with detailed poll analytics: response rate, demographic breakdown, completion funnel, and A/B test results for different poll placements
Non-Functional Requirements
- Availability: 99.95% — a poll failing to render is a missed data collection opportunity, but not as critical as video playback itself
- Latency: Poll UI must render within 200ms of the trigger timestamp. Vote submission must ACK within 100ms (perceived instant).
- Consistency: Votes must be counted exactly once. Read-after-write consistency for a user seeing their own vote. Aggregate counts can be eventually consistent (1-2 second delay is fine).
- Scale: YouTube has 800M daily active viewers, 500M hours of video watched/day. If 5% of videos carry polls, 60% of those viewers see the poll, and 30% of impressions convert to votes → ~192M poll impressions/day → ~6,700 poll renders/sec and ~2,000 votes/sec at peak (detailed in Estimation below).
- Durability: Every vote must be durably stored. Zero data loss on votes.
2. Estimation (3 min)
Traffic
- Daily active viewers: 800M
- Videos watched per viewer: ~8/day → 6.4B video views/day
- Videos with polls: 5% → 320M poll-eligible views/day
- Poll impression rate (viewer sees the poll): 60% → 192M poll impressions/day
- Vote rate (viewer actually votes): 30% of impressions → 57.6M votes/day
- Vote QPS: 57.6M / 86,400 ≈ 700 votes/sec (average); × 3 (peak multiplier) ≈ 2,000 votes/sec (peak)
- Peak poll render QPS: 192M / 86400 × 3 ≈ 6,700 renders/sec (peak)
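The traffic arithmetic above can be sanity-checked in a few throwaway lines (the 3× peak multiplier is the assumption stated in the bullets):

```python
DAU = 800_000_000                    # daily active viewers
video_views = DAU * 8                # ~8 videos per viewer per day
poll_eligible = video_views * 0.05   # 5% of videos carry a poll
impressions = poll_eligible * 0.60   # 60% of eligible views render the poll
votes = impressions * 0.30           # 30% of impressions convert to a vote

avg_votes_per_sec = votes / 86_400
peak_votes_per_sec = avg_votes_per_sec * 3       # assumed 3x peak multiplier
avg_renders_per_sec = impressions / 86_400
peak_renders_per_sec = avg_renders_per_sec * 3

print(f"impressions/day:  {impressions:,.0f}")        # 192,000,000
print(f"votes/day:        {votes:,.0f}")              # 57,600,000
print(f"peak votes/sec:   {peak_votes_per_sec:,.0f}")    # 2,000
print(f"peak renders/sec: {peak_renders_per_sec:,.0f}")  # 6,667
```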
Storage
- Poll definitions: 10M active polls × 2 KB (question, options, targeting rules, schedule) = 20 GB — easily fits in a relational DB
- Votes: 57.6M votes/day × 365 days × 3 years retention = 63B votes
- Each vote: poll_id (8B) + user_id (8B) + option_id (4B) + timestamp (8B) + metadata (32B) ≈ 60 bytes
- Total: 63B × 60B = 3.78 TB — manageable with partitioned storage
- Aggregated counts: Per-poll, per-option counters. 10M polls × 5 options × 16B = 800 MB — trivially small, lives in Redis
Bandwidth
- Poll render payload: ~5 KB (question text, options, styling, targeting metadata)
- 6,700 renders/sec × 5 KB = 33.5 MB/s — negligible compared to video streaming bandwidth
3. API Design (3 min)
Creator-Facing APIs
// Create a poll attached to a video
POST /api/v1/videos/{video_id}/polls
Body: {
"question": "What feature should we build next?",
"type": "single_select", // single_select | multi_select | rating
"options": ["Dark mode", "Offline support", "AI search", "Better perf"],
"trigger_time_sec": 145, // show at 2:25 in the video
"display_duration_sec": 15, // auto-dismiss after 15s
"targeting": {
"geo": ["US", "CA", "GB"],
"demographics": { "age_min": 18, "age_max": 45 },
"sample_pct": 10 // only show to 10% of viewers (A/B test)
},
"close_after_hours": 168 // stop accepting votes after 7 days
}
→ 201 { "poll_id": "p_abc123", ... }
// Get poll analytics
GET /api/v1/polls/{poll_id}/analytics
→ 200 {
"impressions": 145230,
"votes": 43120,
"response_rate": 0.297,
"results": [
{ "option": "Dark mode", "votes": 18200, "pct": 42.2 },
{ "option": "AI search", "votes": 12500, "pct": 29.0 },
...
],
"demographics": { ... },
"ab_test": { "variant_a_response_rate": 0.31, "variant_b_response_rate": 0.26 }
}
Viewer-Facing APIs
// Fetch polls for a video (called when video starts playing)
GET /api/v1/videos/{video_id}/polls?viewer_id={uid}
→ 200 {
"polls": [
{
"poll_id": "p_abc123",
"trigger_time_sec": 145,
"question": "What feature should we build next?",
"options": [...],
"user_vote": null // or option_id if already voted
}
]
}
// Submit a vote
POST /api/v1/polls/{poll_id}/vote
Body: { "option_id": "opt_2", "viewer_id": "u_xyz" }
→ 200 { "results": { "opt_1": 42.2, "opt_2": 29.0, ... }, "total_votes": 43121 }
// Change vote (idempotent PUT)
PUT /api/v1/polls/{poll_id}/vote
Body: { "option_id": "opt_3", "viewer_id": "u_xyz" }
→ 200 { "results": { ... } }
Key Decisions
- Pre-fetch polls at video start: The player fetches all polls for the video when playback begins. This avoids a network request at the exact trigger timestamp (which could cause a visible delay).
- Vote response includes live results: After voting, the user immediately sees percentages. This is the “social proof” hook that drives engagement.
- Targeting evaluated client-side: The server sends all eligible polls plus targeting rules. The client-side SDK evaluates targeting (geo, demographics, A/B bucket) to avoid an extra server round-trip at trigger time.
4. Data Model (3 min)
Polls Table (PostgreSQL — strong consistency for definitions)
| Column | Type | Notes |
|---|---|---|
| poll_id | UUID (PK) | Globally unique |
| video_id | VARCHAR(11) | YouTube video ID, indexed |
| creator_id | BIGINT | FK to creator accounts |
| question | TEXT | Poll question text |
| type | ENUM | single_select, multi_select, rating |
| options | JSONB | Array of {id, text, display_order} |
| trigger_time_sec | INT | Seconds into the video |
| display_duration_sec | INT | How long to show the overlay |
| targeting | JSONB | Geo, demographics, sample_pct, etc. |
| status | ENUM | draft, active, paused, closed |
| close_at | TIMESTAMP | When to stop accepting votes |
| created_at | TIMESTAMP | |
Votes Table (Cassandra — high write throughput, partitioned by poll_id)
| Column | Type | Notes |
|---|---|---|
| poll_id | UUID (partition key) | |
| viewer_id | BIGINT (clustering key) | Ensures uniqueness: one vote per user per poll |
| option_id | UUID | Which option they chose |
| voted_at | TIMESTAMP | |
| metadata | MAP<TEXT,TEXT> | Device, geo, referrer, A/B variant |
Why Cassandra for votes?
- Writes dominate reads (every vote is a write; reads are aggregated separately)
- Partition by poll_id: all votes for a poll are co-located → efficient aggregation
- Built-in upsert semantics (INSERT with same PK = update) → natural dedup for “change vote”
- Scales horizontally to handle the ~2K peak votes/sec with ample headroom
Vote Aggregates (Redis — real-time counters)
| Key | Type | Example |
|---|---|---|
| poll:{poll_id}:counts | Hash | { "opt_1": 18200, "opt_2": 12500, "opt_3": 8400, "opt_4": 4020 } |
| poll:{poll_id}:total | Integer | 43120 |
| poll:{poll_id}:voted:{viewer_id} | String | "opt_2" (used for dedup check, TTL = poll close time) |
Analytics Store (ClickHouse — OLAP for dashboard queries)
- Materialized from the Kafka vote stream
- Columns: poll_id, video_id, creator_id, option_id, viewer_id, voted_at, geo, age_bucket, device, ab_variant
- Queries: response rate by demographic, time-series of votes, A/B test significance
5. High-Level Design (12 min)
Architecture Overview
┌──────────────────────────────────────────────────────────────────┐
│ YouTube Video Player │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Video Stream │ │ Poll SDK │ │ Poll Overlay UI │ │
│ │ (HLS/DASH) │ │ (pre-fetched │ │ (renders at trigger │ │
│ │ │ │ poll data) │ │ timestamp, shows │ │
│ │ │ │ │ │ results after vote) │ │
│ └──────────────┘ └──────┬───────┘ └────────────────────────┘ │
│ │ │
└───────────────────────────┼───────────────────────────────────────┘
│ HTTPS
▼
┌─────────────┐
│ API GW / │
│ LB (L7) │
└──────┬──────┘
│
┌─────────────┼──────────────┐
▼ ▼ ▼
┌────────────┐ ┌──────────┐ ┌────────────┐
│ Poll Read │ │ Vote │ │ Creator │
│ Service │ │ Service │ │ Analytics │
│ │ │ │ │ Service │
└─────┬──────┘ └────┬─────┘ └─────┬──────┘
│ │ │
┌────┘ ┌────┼────┐ │
▼ ▼ ▼ ▼ ▼
┌─────────┐ ┌───────┐ │ ┌──────┐ ┌──────────┐
│PostgreSQL│ │ Redis │ │ │Kafka │ │ClickHouse│
│ (polls) │ │(counts│ │ │(vote │ │(analytics│
│ │ │+dedup)│ │ │stream│ │ OLAP) │
└─────────┘ └───────┘ │ └──┬───┘ └──────────┘
▼ │
┌──────────┤
│Cassandra ││
│ (votes) ││
└──────────┘│
▲ │
│ ▼
┌──────────────┐
│ Vote Consumer│
│ (aggregation │
│ + analytics │
│ pipeline) │
└──────────────┘
Component Breakdown
1. Poll SDK (Client-Side)
- Embedded in the YouTube player (web, iOS, Android)
- On video load: fetches all polls for the video via Poll Read Service
- Evaluates targeting rules locally (geo from IP, demographics from user profile, A/B bucket from hash(user_id + poll_id))
- At trigger timestamp: renders non-intrusive overlay (bottom-third of screen, semi-transparent)
- Handles vote submission, optimistic UI update (show results immediately), and retry on failure
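The deterministic A/B bucketing the SDK relies on can be sketched as follows (a minimal illustration; the actual hash function is not specified in this design, so SHA-256 is an assumption):

```python
import hashlib

def ab_bucket(user_id: str, poll_id: str) -> int:
    """Deterministic bucket in [0, 100): the same user + poll pair
    always lands in the same bucket, on any device."""
    digest = hashlib.sha256(f"{user_id}:{poll_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100

def should_show(user_id: str, poll_id: str, sample_pct: int) -> bool:
    """Apply the poll's sample_pct targeting rule client-side."""
    return ab_bucket(user_id, poll_id) < sample_pct
```

Because the bucket is derived from stable identifiers rather than random state, a viewer who qualifies for a sampled poll on their phone also qualifies on their laptop.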
2. Poll Read Service
- Serves GET /videos/{video_id}/polls — fetches poll definitions from PostgreSQL (cached in Redis/CDN with a 60s TTL)
- Checks if the viewer has already voted (Redis lookup) and includes their previous vote in the response
- Stateless, horizontally scalable
3. Vote Service
- Handles vote submission: dedup check in Redis, write to Kafka, update Redis counters atomically
- Flow: check poll:{poll_id}:voted:{viewer_id} → if it exists, this is a vote change (decrement the old option, increment the new one) → HINCRBY on the counts hash → publish to Kafka → ACK to client
- Returns updated aggregate counts in the response
4. Kafka Vote Stream
- Durable log of all vote events
- Consumed by: (a) Cassandra writer for persistent vote storage, (b) ClickHouse sink for analytics, (c) real-time aggregation for any downstream systems
- Partitioned by poll_id for ordering guarantees within a poll
5. Creator Analytics Service
- Serves the creator dashboard with poll performance data
- Queries ClickHouse for complex analytics (response rate by geo, demographic breakdown, A/B test results)
- Pre-computes hourly/daily rollups for fast dashboard loads
Request Flow: Viewer Votes on a Poll
Player SDK → API Gateway → Vote Service
Vote Service:
1. Validate: poll exists, not closed, option_id valid
2. Redis: GET poll:{p_abc}:voted:{u_xyz}
→ null (first vote) or "opt_1" (changing vote)
3. Redis Pipeline (atomic):
- SET poll:{p_abc}:voted:{u_xyz} = "opt_2" EX {ttl}
- HINCRBY poll:{p_abc}:counts opt_2 1
- (if changing) HINCRBY poll:{p_abc}:counts opt_1 -1
4. Kafka: produce VoteEvent{poll_id, viewer_id, option_id, timestamp, metadata}
5. Redis: HGETALL poll:{p_abc}:counts → {opt_1: 18200, opt_2: 12501, ...}
6. Return 200 with live results to client
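The flow above can be sketched with an in-memory dict standing in for Redis, purely for illustration (the real service would run the check-and-set steps as a Lua script or MULTI block so they stay atomic under concurrency):

```python
from collections import defaultdict

# In-memory stand-ins for the Redis structures described above.
voted = {}                                       # (poll_id, viewer_id) -> option_id
counts = defaultdict(lambda: defaultdict(int))   # poll_id -> {option_id: votes}

def submit_vote(poll_id: str, viewer_id: str, option_id: str) -> dict:
    """Steps 2-3 and 5 of the request flow: dedup check, counter
    update (including vote changes), and live results."""
    previous = voted.get((poll_id, viewer_id))   # step 2: dedup lookup
    if previous == option_id:
        return dict(counts[poll_id])             # idempotent retry: no-op
    if previous is not None:
        counts[poll_id][previous] -= 1           # vote change: decrement old option
    voted[(poll_id, viewer_id)] = option_id      # step 3: record the vote
    counts[poll_id][option_id] += 1              # step 3: increment new option
    # step 4 (omitted here): produce VoteEvent to Kafka for durable storage
    return dict(counts[poll_id])                 # step 5: live results

submit_vote("p_abc", "u_xyz", "opt_2")           # first vote
submit_vote("p_abc", "u_xyz", "opt_2")           # retry: counts unchanged
results = submit_vote("p_abc", "u_xyz", "opt_1") # vote change: opt_2 -1, opt_1 +1
```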
6. Deep Dives (15 min)
Deep Dive 1: Preventing Duplicate Votes & Vote Integrity
The Problem: A user could submit the same vote multiple times due to network retries, client bugs, or intentional abuse. We must ensure exactly-once semantics for votes.
Layer 1: Client-Side Dedup
- After submitting a vote, the SDK stores poll_id → option_id in local storage
- On subsequent page loads or video replays, the SDK checks local storage before rendering the poll
- If already voted, it shows results instead of the voting UI
- This prevents accidental double-votes but is easily bypassed (clear storage, different device)
Layer 2: Redis Dedup (Real-Time)
- Key: poll:{poll_id}:voted:{viewer_id} with a TTL matching the poll close time
- Before processing a vote, Vote Service checks this key
- If the key exists with the same option → idempotent, return current results
- If the key exists with a different option → treat as a vote change (atomic swap)
- If the key doesn’t exist → new vote, proceed
Layer 3: Cassandra Upsert (Durable)
- Cassandra partition key: (poll_id), clustering key: (viewer_id)
- INSERT/UPDATE with same (poll_id, viewer_id) is an upsert — no duplicates at the storage level
- This is the source of truth for vote integrity
Handling Redis Failure:
- If Redis is down, fall back to Cassandra for dedup (slower but correct)
- Read from Cassandra: SELECT option_id FROM votes WHERE poll_id = ? AND viewer_id = ?
- This adds ~5ms of latency but maintains correctness
Preventing Bot/Abuse Votes:
- Require authenticated users only (no anonymous voting)
- Rate limit: max 10 vote submissions per user per minute across all polls
- Behavioral signals: if a user votes on 100 polls in 1 minute, flag as bot
- For high-stakes polls (advertiser surveys), require CAPTCHA verification on suspicious accounts
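The per-user rate limit ("max 10 vote submissions per minute") can be enforced with a sliding-window check. A sketch with an in-memory store (production would use a Redis sorted set or token bucket per user):

```python
from collections import defaultdict, deque

WINDOW_SEC = 60
MAX_VOTES = 10

_recent = defaultdict(deque)   # viewer_id -> timestamps of recent submissions

def allow_vote(viewer_id: str, now: float) -> bool:
    """Sliding-window limiter: at most MAX_VOTES submissions per
    viewer within any WINDOW_SEC interval. Call with now=time.time()."""
    window = _recent[viewer_id]
    while window and now - window[0] >= WINDOW_SEC:
        window.popleft()               # evict timestamps outside the window
    if len(window) >= MAX_VOTES:
        return False                   # over the limit: reject this submission
    window.append(now)
    return True
```

Votes 1-10 within a minute pass; the 11th is rejected until the oldest timestamp ages out of the window.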
Deep Dive 2: Real-Time Vote Aggregation & Response Rate Optimization
Real-Time Aggregation Architecture:
The challenge is providing live vote counts to millions of concurrent viewers while maintaining accuracy.
Vote arrives → Redis HINCRBY (atomic counter increment)
↓
Redis Hash: poll:{poll_id}:counts
{ "opt_1": 18200, "opt_2": 12501, "opt_3": 8400, "opt_4": 4020 }
↓
On each vote response, return HGETALL → client shows live %
Why Redis counters work at this scale:
- HINCRBY is O(1), atomic, and ~0.1ms on a single Redis instance
- A single Redis shard handles 100K+ HINCRBY/sec easily
- Even the most viral poll won’t exceed 10K votes/sec (human click speed is the bottleneck)
- Hot poll sharding: if a single poll exceeds Redis throughput, shard counters across N Redis instances and sum on read
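Hot-poll counter sharding, sketched in miniature with dicts standing in for the Redis instances (the shard chosen per write can be arbitrary, since increments commute and the read sums them):

```python
import random
from collections import defaultdict

NUM_SHARDS = 4   # assumed number of Redis instances holding sub-counters

# shard index -> {(poll_id, option_id): count}; each dict stands in
# for the counts hash on one Redis instance
shards = [defaultdict(int) for _ in range(NUM_SHARDS)]

def incr(poll_id: str, option_id: str) -> None:
    """Spread writes across shards so no single instance goes hot."""
    shard = random.randrange(NUM_SHARDS)
    shards[shard][(poll_id, option_id)] += 1

def read_count(poll_id: str, option_id: str) -> int:
    """Sum the sub-counters on read; the total is exact once all
    writes have landed, because increments commute."""
    return sum(s[(poll_id, option_id)] for s in shards)

for _ in range(1000):
    incr("p_hot", "opt_1")
print(read_count("p_hot", "opt_1"))   # 1000
```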
Response Rate Optimization:
Response rate is the key metric. Higher response rate = more valuable data. Techniques:
- Timing optimization: Don’t show the poll in the first 10 seconds (viewer hasn’t engaged yet) or last 10 seconds (about to leave). Sweet spot: 30-60% through the video.
- Visual treatment:
  - Semi-transparent overlay on the bottom third — doesn’t block content
  - Subtle entrance animation (slide up) to catch attention without being jarring
  - Auto-dismiss after 15 seconds if no interaction (don’t annoy the viewer)
  - Show a small “1 question” teaser 3 seconds before the full poll appears
- Social proof: Show “X people have voted” before the user votes. This creates a bandwagon effect.
- Post-vote reward: After voting, show the results with a satisfying animation. This trains users that voting has an immediate payoff.
- A/B testing framework:
  - Each poll can define sample_pct and targeting variants
  - Hash(user_id + poll_id) % 100 determines the A/B bucket (deterministic, consistent across devices)
  - Test variables: trigger time, display duration, visual treatment, question wording
  - Track response rate per variant, compute statistical significance (chi-squared test), auto-promote the winning variant
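The chi-squared significance check for two variants fits in a few lines. A sketch using a 2x2 contingency table (voted / didn't vote per variant) and the 3.841 critical value for p < 0.05 at one degree of freedom; the 2,000-impression sample size is an assumption for illustration:

```python
def chi_squared_2x2(votes_a: int, impressions_a: int,
                    votes_b: int, impressions_b: int) -> float:
    """Pearson chi-squared statistic for the 2x2 table
    (variant A, variant B) x (voted, did not vote)."""
    table = [
        [votes_a, impressions_a - votes_a],
        [votes_b, impressions_b - votes_b],
    ]
    total = impressions_a + impressions_b
    chi2 = 0.0
    for row in table:
        row_sum = sum(row)
        for j in range(2):
            col_sum = table[0][j] + table[1][j]
            expected = row_sum * col_sum / total
            chi2 += (row[j] - expected) ** 2 / expected
    return chi2

# Response rates from the analytics example: A = 31%, B = 26%,
# with an assumed 2,000 impressions per variant
stat = chi_squared_2x2(620, 2000, 520, 2000)
significant = stat > 3.841   # chi-squared critical value, p < 0.05, df = 1
```

With these numbers the statistic is well above the critical value, so the 31% vs 26% gap would count as significant and variant A could be auto-promoted.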
Engagement Metrics Pipeline:
Poll rendered (impression event) → Kafka
User interacts (vote event) → Kafka
User dismisses (dismiss event) → Kafka
Video continues past poll → Kafka
All events → ClickHouse → Materialized views:
- response_rate = votes / impressions
- dismiss_rate = dismissals / impressions
- completion_impact = avg(watch_time_with_poll) vs avg(watch_time_without_poll)
Deep Dive 3: Integration with Ads System & Advertiser Surveys
The Problem: YouTube’s ad system is a multi-billion dollar revenue engine. Advertiser surveys (Brand Lift studies) must integrate seamlessly without disrupting ad delivery, frequency capping, or revenue optimization.
Advertiser Survey Flow:
- Advertiser creates a Brand Lift campaign: “Did you see an ad for Product X in the last 7 days?”
- System identifies two groups: exposed (saw the ad) and control (didn’t see the ad)
- Both groups see the same survey → difference in responses = brand lift
Integration Architecture:
Ad Server → "User U saw Ad A at time T"
↓
Exposure Log (BigQuery)
↓
Survey Targeting Service:
- For Brand Lift: select exposed + control users
- For creator polls: use creator-defined targeting
- For YouTube research: random sampling
↓
Poll Read Service → includes survey in video polls
Key Integration Constraints:
- Frequency capping: A user should see at most one survey per session, and no more than one every 7 days. This is enforced by the Poll Read Service checking last_survey_shown:{viewer_id} in Redis.
- Ad pod conflict: Don’t show a survey during an ad break. The player SDK coordinates with the ad SDK to avoid overlapping UI elements.
- Revenue priority: If showing a survey would displace a paid ad, the ad wins. Surveys are lower priority in the ad auction.
- Control group integrity: Control group users must never see the ad. The ad server and survey system share an exclusion list.
- Statistical rigor: Brand Lift surveys require minimum sample sizes (typically 2,000 exposed + 2,000 control) for statistical significance. The system tracks sample sizes and stops collecting once significance is achieved (sequential testing).
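The lift computation itself is a difference of response rates between the exposed and control groups. A minimal sketch (the respondent counts are made up for illustration; only the 2,000-per-group minimum comes from the text above):

```python
def brand_lift(exposed_yes: int, exposed_n: int,
               control_yes: int, control_n: int) -> float:
    """Absolute brand lift: exposed 'yes' rate minus control 'yes' rate."""
    return exposed_yes / exposed_n - control_yes / control_n

# e.g. 2,000 respondents per group (the minimum sample size above)
lift = brand_lift(640, 2000, 500, 2000)
print(f"{lift:.1%}")   # 7.0% absolute lift in ad recall
```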
7. Extensions (2 min)
- Multi-language support: Auto-translate poll questions based on viewer locale using a translation service, with creator approval for machine translations before they go live. Store translations as a JSONB map in the polls table.
- Polls in live streams: For live/premiere content, enable real-time polls that creators can trigger from their dashboard. Uses WebSocket push instead of pre-fetch. Results update in real time for all viewers simultaneously (a shared social experience).
- Gamification & rewards: Award viewers points/badges for participating in polls. Track streaks (voted in 5 polls this week). Drives habitual engagement with surveys and increases long-term response rates.
- Content-aware poll suggestions: Use ML to analyze the video content (transcript, visual segments) and suggest relevant poll questions to the creator. “Your video mentions 3 products — want to ask viewers which they prefer?”
- Cross-video poll campaigns: Allow creators to run a poll campaign across multiple videos (same question, aggregated results). Useful for ongoing audience feedback like “What series should I start next?” with responses collected over a month of uploads.