“How many requests per second should we design for?”
This question shows up in every system design round. And most candidates handle it in one of two ways:
- Skip it entirely. “Let’s just say it’s a lot.” Then they draw boxes and hope for the best.
- Over-engineer it. They pull out exact division, multiply everything precisely, and burn five minutes on arithmetic that adds zero architectural signal.
Both are wrong. Back-of-envelope estimation isn’t about getting the right number. It’s about getting the right order of magnitude so you can make informed architectural decisions.
The difference between 100 QPS and 100,000 QPS isn’t just a bigger number. It’s a fundamentally different system. One fits on a single server. The other needs distributed caching, load balancing, and database sharding. The estimation tells you which world you’re in.
The Reference Numbers
Before doing any calculation, you need a mental toolkit of reference numbers. You don’t need to memorize these exactly. Round numbers are fine. The point is having anchors.
Time
| Unit | Value |
|---|---|
| 1 day | ~100,000 seconds (86,400, round to 100K) |
| 1 month | ~2.5 million seconds |
| 1 year | ~30 million seconds |
Use 100K seconds/day for everything. It makes division trivial.
Data Size
| Unit | Value |
|---|---|
| 1 character (ASCII) | 1 byte |
| 1 character (UTF-8, avg) | 2-3 bytes |
| A tweet (280 chars) | ~0.5 KB with metadata |
| A typical JSON API response | 1-10 KB |
| A photo (compressed) | 200 KB - 1 MB |
| A short video (1 min, compressed) | 5-10 MB |
| A high-res image | 2-5 MB |
Scale Prefixes
| Prefix | Value | Shorthand |
|---|---|---|
| Kilo | 10^3 | Thousand |
| Mega | 10^6 | Million |
| Giga | 10^9 | Billion |
| Tera | 10^12 | Trillion |
| Peta | 10^15 | Quadrillion |
Quick conversions:
- 1 million seconds = ~12 days
- 1 billion seconds = ~32 years
- 1 TB = 1,000 GB = 1,000,000 MB
Latency (Order of Magnitude)
| Operation | Latency |
|---|---|
| L1 cache reference | ~1 ns |
| L2 cache reference | ~4 ns |
| Main memory reference | ~100 ns |
| SSD random read | ~100 µs |
| Round trip within same datacenter | ~0.5 ms |
| HDD seek | ~10 ms |
| Round trip cross-continent | ~100-150 ms |
| Packet round trip CA to Netherlands | ~150 ms |
Real-World Scale References
| Service | Approximate Scale |
|---|---|
| Google Search | ~100K QPS |
| Twitter | ~500M tweets/day |
| WhatsApp | ~100B messages/day |
| YouTube | ~500 hours of video uploaded/minute |
| Instagram | ~100M photos uploaded/day |
| Netflix | ~250M subscribers, ~1B hours streamed/week |
These aren’t exact. They’re directional. When someone says “design a system like Twitter,” you now know the ballpark.
The Estimation Framework
Every estimation follows the same four steps:
Step 1: Anchor on Users
Start with DAU (Daily Active Users). If the interviewer doesn’t give a number, ask. If they say “assume reasonable scale,” pick something concrete:
- Small/startup: 1M DAU
- Medium: 10M-50M DAU
- Large (Twitter/Instagram scale): 100M-500M DAU
Step 2: Estimate Actions Per User
How many times does an average user perform the core action per day?
- Social media post: 0.1-1 per day (most users lurk, few post)
- Messages sent: 10-50 per day
- Searches: 5-10 per day
- Feed refreshes: 10-20 per day
- URL shortener: 0.1 per day (most users click, not create)
Step 3: Calculate QPS
Total daily actions = DAU x actions per user
QPS = Total daily actions / 100,000 (seconds in a day)
Peak QPS = QPS x 2-3 (for traffic spikes)
Step 4: Estimate Storage
Storage per day = Total daily actions x size per action
Storage per year = Storage per day x 365
Total storage = Storage per year x retention period
That’s it. Four steps. Should take 60-90 seconds.
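The four steps above can be sketched as a short function. This is a minimal illustration, not a real capacity planner; the inputs in the example call (10M DAU, 5 actions/user, 2 KB/action) are made-up numbers:

```python
SECONDS_PER_DAY = 100_000  # 86,400 rounded up so division stays trivial

def estimate(dau, actions_per_user, bytes_per_action,
             peak_factor=3, retention_years=5):
    """Back-of-envelope estimate: QPS, peak QPS, and storage."""
    daily_actions = dau * actions_per_user           # Step 2
    qps = daily_actions / SECONDS_PER_DAY            # Step 3
    peak_qps = qps * peak_factor                     # Step 3 (spikes)
    storage_per_day = daily_actions * bytes_per_action  # Step 4
    total_storage = storage_per_day * 365 * retention_years
    return {
        "qps": qps,
        "peak_qps": peak_qps,
        "storage_per_day_gb": storage_per_day / 1e9,
        "total_storage_tb": total_storage / 1e12,
    }

# 10M DAU, 5 actions/user/day, 2 KB per action
print(estimate(dau=10_000_000, actions_per_user=5, bytes_per_action=2_000))
```

In an interview you do this in your head, of course; the function just makes the shape of the calculation explicit.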
Worked Examples
Example 1: URL Shortener
Given: 100M DAU
Reads vs. Writes:
- Write: each user creates ~0.1 short URLs/day = 10M writes/day
- Read: each short URL gets clicked ~10x = 100M reads/day
QPS:
- Write QPS: 10M / 100K = 100 writes/sec
- Read QPS: 100M / 100K = 1,000 reads/sec
- Peak read QPS: ~3,000 reads/sec
Storage:
- Each record: short URL (7 chars) + long URL (~200 chars) + metadata = ~500 bytes
- Daily: 10M x 500 bytes = 5 GB/day
- Yearly: 5 GB x 365 = ~1.8 TB/year
- 5-year retention: ~9 TB total
What this tells you:
- Read-heavy (10:1 ratio) -> caching is essential
- 3K peak QPS -> single database can handle this with read replicas
- 9 TB -> fits in a single well-provisioned database, but consider partitioning for growth
- This is not a massive-scale problem. No need for complex distributed architecture.
Example 2: Chat System (WhatsApp-scale)
Given: 500M DAU
Messages:
- Average user sends 40 messages/day
- Total: 500M x 40 = 20B messages/day
QPS:
- Message QPS: 20B / 100K = 200,000 writes/sec
- Peak: ~500,000 writes/sec
Storage:
- Average message: 100 bytes (text) + 200 bytes (metadata) = ~300 bytes
- Daily: 20B x 300 bytes = 6 TB/day
- Yearly: ~2 PB/year
Bandwidth:
- Incoming: 200K messages/sec x 300 bytes = 60 MB/sec
- With media (10% of messages have a 200KB image): 20K x 200KB = 4 GB/sec
What this tells you:
- 500K peak writes/sec -> single database won’t work. Need horizontal sharding.
- 2 PB/year -> need a distributed storage system (not a single RDBMS)
- 4 GB/sec bandwidth for media -> need CDN, object storage (S3-style)
- This is a massive-scale problem. Every component needs horizontal scaling.
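The chat-system arithmetic above, including the media bandwidth, can be checked in a few lines. The 10%-of-messages-carry-a-200KB-image mix is the same assumption used in the text:

```python
SECONDS_PER_DAY = 100_000

messages_per_day = 500_000_000 * 40             # 500M DAU x 40 msgs = 20B/day
write_qps = messages_per_day / SECONDS_PER_DAY  # 200K writes/sec

text_bandwidth = write_qps * 300        # ~300 bytes of text + metadata per message
media_qps = write_qps * 0.10            # 10% of messages carry an image
media_bandwidth = media_qps * 200_000   # ~200 KB per image

print(f"write QPS:       {write_qps:,.0f}")
print(f"text bandwidth:  {text_bandwidth / 1e6:.0f} MB/sec")
print(f"media bandwidth: {media_bandwidth / 1e9:.0f} GB/sec")
```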
Example 3: Twitter-like Feed
Given: 200M DAU
Write path (posting):
- 1% of users post per day = 2M posts/day
- Post QPS: 2M / 100K = 20 writes/sec (surprisingly low!)
Read path (feed):
- Average user refreshes feed 10x/day = 2B feed requests/day
- Feed QPS: 2B / 100K = 20,000 reads/sec
- Peak: ~50,000 reads/sec
Fan-out:
- Average user has 200 followers
- Each post fans out to 200 timelines
- Fan-out operations/sec: 20 x 200 = 4,000 timeline writes/sec
- Celebrity with 10M followers: single post = 10M timeline writes. This is the fan-out problem.
Storage:
- Each post: ~1 KB (text + metadata)
- Daily: 2M x 1KB = 2 GB/day (posts are small!)
- Timeline cache per user: last 200 posts x 1KB = 200 KB
- Total timeline cache: 200M x 200KB = 40 TB
What this tells you:
- Write QPS is tiny (20/sec). The challenge isn’t writing posts.
- Read QPS is massive (50K/sec). Caching is critical.
- Fan-out is the real problem. A celebrity post triggers millions of writes.
- Need hybrid approach: fan-out-on-write for normal users, fan-out-on-read for celebrities.
- 40 TB timeline cache -> Redis cluster with sharding.
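The hybrid fan-out decision reduces to a threshold check. Here is a sketch; the 100K-follower cutoff is a made-up tuning knob (real systems pick it empirically), not a number from the text:

```python
CELEBRITY_THRESHOLD = 100_000  # hypothetical cutoff for "celebrity"

def fanout_strategy(follower_count):
    """Fan out on write for normal users; defer to read time for celebrities."""
    if follower_count >= CELEBRITY_THRESHOLD:
        # Merge celebrity posts into feeds at read time instead of
        # writing to millions of timelines per post.
        return "fan-out-on-read"
    # Push the post into each follower's cached timeline at write time.
    return "fan-out-on-write"

print(fanout_strategy(200))          # typical user with ~200 followers
print(fanout_strategy(10_000_000))   # celebrity with 10M followers
```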
The Rounding Rules
Speed matters more than precision. Here are the shortcuts:
Round everything to the nearest power of 10.
- 86,400 seconds in a day? Use 100,000.
- 365 days in a year? Use 400 for easy math (or 12 months x 30 days = 360).
- 1,048,576 bytes in a MB? Use 1,000,000.
Use 2x-3x for peak traffic. Most systems see 2-3x average traffic during peaks. For spiky systems (e-commerce during sales, sports during live events), use 5-10x.
Round storage up, not down. It’s better to over-provision storage than under-provision. Storage is cheap. Running out of storage at 3 AM is not.
State the ratio, not just the number. “1,000 reads/sec and 100 writes/sec” is more useful than “1,100 total QPS.” The 10:1 ratio tells you to optimize for reads.
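The power-of-10 rule is mechanical enough to write down as a throwaway helper, if only to see what it does to the usual constants:

```python
import math

def round_to_power_of_10(x):
    """Snap a positive value to the nearest power of 10."""
    return 10 ** round(math.log10(x))

print(round_to_power_of_10(86_400))     # seconds in a day -> 100000
print(round_to_power_of_10(1_048_576))  # bytes in a MiB   -> 1000000
```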
Common Mistakes
1. Spending Too Long
The estimation should take 60-90 seconds. If you’re doing long division on the whiteboard, you’ve lost the plot. Round aggressively and move on.
2. Not Separating Reads and Writes
“Total QPS is 10,000.” That’s incomplete. A system with 9,000 reads and 1,000 writes is architected completely differently from one with 5,000 reads and 5,000 writes. Always split them.
3. Forgetting Peak Traffic
Average QPS is meaningless for capacity planning. Systems don’t fail at average load. They fail at peak. Always multiply by 2-3x (or more for spiky workloads).
4. Ignoring the Fan-Out Effect
A social media post doesn’t create one write. It creates N writes, where N is the number of followers. A user with 1M followers creates 1M fan-out writes from a single post. This is often the bottleneck, not the ingestion rate.
5. Getting Lost in the Math
The interviewer doesn’t care if the answer is 1,847 QPS or 2,000 QPS. Both lead to the same architecture. What matters is: “This is in the low thousands, so a single server with caching can handle it.” That’s the insight. The number is just a vehicle.
6. Not Connecting Estimation to Architecture
The worst thing you can do is calculate numbers and then ignore them. Every number should lead to a decision:
| Estimation | Architectural Signal |
|---|---|
| < 1K QPS | Single server, maybe with read replica |
| 1K-10K QPS | Load balancer + multiple app servers + read replicas |
| 10K-100K QPS | Horizontal scaling, caching layer (Redis/Memcached), possibly sharding |
| 100K+ QPS | Distributed system, CDN, database sharding, message queues |
| < 1 TB storage | Single database instance |
| 1-10 TB | Consider partitioning, compression |
| 10-100 TB | Sharded database or distributed storage |
| 100 TB+ | Distributed file system (HDFS, S3), data lake architecture |
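The QPS half of that table is just a ladder of thresholds. A sketch, using the same rough, directional cutoffs (the strings are shorthand, not prescriptions):

```python
def architecture_signal(peak_qps):
    """Map peak QPS to the rough architecture tier from the table above."""
    if peak_qps < 1_000:
        return "single server, maybe a read replica"
    if peak_qps < 10_000:
        return "load balancer + multiple app servers + read replicas"
    if peak_qps < 100_000:
        return "horizontal scaling + caching layer, possibly sharding"
    return "distributed system: CDN, sharding, message queues"

print(architecture_signal(3_000))     # URL shortener peak from Example 1
print(architecture_signal(500_000))   # chat-system peak from Example 2
```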
The Quick-Reference Cheat Sheet
When you’re in the interview and need to move fast:
Users to QPS:
QPS = (DAU x actions_per_user) / 100,000
Storage per year:
Storage = DAU x actions_per_user x bytes_per_action x 365
Bandwidth:
Bandwidth = QPS x bytes_per_request
Machines needed (rough):
A single modern server handles ~10K-50K simple requests/sec
Machines = Peak QPS / 10,000 (conservative)
Cache size:
Follow the 80/20 rule: 20% of data serves 80% of reads
Cache = 0.2 x daily_read_data
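The machine-count and cache-size rules translate directly to code. The 10K requests/sec-per-server figure is the conservative end of the cheat sheet's range; real per-server capacity varies wildly with the workload:

```python
import math

def machines_needed(peak_qps, per_server_qps=10_000):
    """Conservative server count: divide peak QPS by per-server capacity."""
    return math.ceil(peak_qps / per_server_qps)

def cache_size_bytes(daily_read_data_bytes, hot_fraction=0.2):
    """80/20 rule of thumb: ~20% of the data serves ~80% of reads."""
    return daily_read_data_bytes * hot_fraction

print(machines_needed(500_000))                 # chat-system peak from Example 2
print(cache_size_bytes(1e12) / 1e9, "GB")       # 1 TB of daily reads -> 200 GB cache
```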
Closing Thought
Back-of-envelope estimation is not a math exercise. It’s a calibration tool. The goal is to spend 60 seconds understanding the scale of the problem so that every architectural decision that follows is grounded in reality.
A system designed for 100 QPS looks nothing like a system designed for 100,000 QPS. The estimation is what tells you which one to build. Get the order of magnitude right, connect it to architecture, and move on. That’s all there is to it.