You’re in a system design interview. You say “we’ll use PostgreSQL” and immediately follow it with “and we’ll shard it across 16 nodes.” The interviewer asks: “How much traffic are you expecting?” You don’t have a number. You just sharded because it sounded like the senior thing to do.

Here’s the thing - most teams shard too early. A single PostgreSQL node can handle far more than people think. The decision to shard should come from actual numbers, not vibes. This post gives you those numbers.

The Numbers You Need

Before diving into each database, here’s the reference table you’ll keep coming back to. These are realistic single-node numbers on reasonable hardware (8-16 cores, 64GB RAM, NVMe SSD) - not marketing benchmarks on 128-core machines.

| Database | Reads/sec | Writes/sec | Practical Data Limit | Latency (P50) |
|---|---|---|---|---|
| PostgreSQL | 100K-200K | 20K-70K | 500GB-4TB | 1-5ms |
| Redis | 100K-500K | 100K-400K | 25-50GB | <1ms |
| MongoDB | 20K-67K | 10K-30K | 256GB-1TB | 1-5ms |
| DynamoDB | 3K/partition | 1K/partition | Unlimited (managed) | 5-10ms |
| Cassandra | 15K-25K/node | 30K/node | 500GB-1TB/node | 1-5ms |

These numbers assume indexed lookups or simple key-value access - not full table scans or complex joins. Your actual numbers depend on query complexity, data model, and indexing.
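The table can double as a back-of-envelope check. A minimal sketch, using illustrative mid-range thresholds derived from the table above (these are rules of thumb, not hard limits):

```python
# Rough single-node capacity check. Thresholds are illustrative
# mid-range figures from the reference table, not hard limits.
LIMITS = {
    # (reads/sec, writes/sec, practical data limit in GB)
    "postgresql": (150_000, 45_000, 1_000),
    "redis":      (300_000, 250_000, 50),
    "mongodb":    (45_000, 20_000, 500),
}

def fits_single_node(db, reads_per_sec, writes_per_sec, data_gb):
    """Return True if the workload plausibly fits one node of `db`."""
    max_reads, max_writes, max_gb = LIMITS[db]
    return (reads_per_sec <= max_reads
            and writes_per_sec <= max_writes
            and data_gb <= max_gb)

# 5K reads/sec, 1K writes/sec, 80GB: comfortably single-node Postgres.
print(fits_single_node("postgresql", 5_000, 1_000, 80))   # True
# 60K sustained writes/sec: past the comfortable single-node range.
print(fits_single_node("postgresql", 5_000, 60_000, 80))  # False
```

Run your own numbers through something like this before any sharding conversation.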

PostgreSQL

The database that keeps surprising people with how far it scales on a single node. The PostgreSQL documentation covers tuning parameters in detail.

Throughput:

  • Simple SELECT by primary key: 100,000-200,000 ops/sec
  • Read-only pgbench on tuned hardware: 1,000,000+ TPS (synthetic, but shows the ceiling)
  • Mixed read-write (TPC-B style): 20,000-70,000 TPS
  • Single-row INSERTs: 10,000-25,000/sec (network-bound). Batched inserts: 100,000+/sec
  • Analytical scans: 5-10 million rows/sec per core
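The gap between single-row and batched inserts is mostly per-statement round trips. A minimal illustration of the batching idea, building one multi-row INSERT instead of N single-row statements (string construction is for illustration only; real code should use the driver's parameterized batch API, such as psycopg2's `execute_values`):

```python
# Build one multi-row INSERT instead of N single-row statements.
# Illustration only: production code should use the driver's
# parameterized batch API, never string formatting.
def batched_insert_sql(table, columns, rows):
    cols = ", ".join(columns)
    values = ", ".join(
        "(" + ", ".join(str(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({cols}) VALUES {values};"

sql = batched_insert_sql("events", ["user_id", "score"],
                         [(1, 10), (2, 20), (3, 30)])
print(sql)
# INSERT INTO events (user_id, score) VALUES (1, 10), (2, 20), (3, 30);
```

One statement, one round trip, three rows: that is where the 10x difference between 10K-25K/sec and 100K+/sec comes from.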

Memory and storage:

  • Best performance when working set (active indexes + hot data) fits in shared_buffers + OS page cache
  • With 64GB RAM, you can comfortably serve 50-200GB of hot data
  • Maximum table size: 32TB. Maximum database size: no hard limit
  • Performance degrades when queries regularly hit disk instead of cache

When to shard:

  • Most workloads run fine on a single node up to 500GB-1TB
  • Vertical scaling + read replicas + connection pooling (PgBouncer) + table partitioning can push this further
  • Notion ran on a single PostgreSQL instance serving 30 million users before they needed to shard
  • Consider sharding at 1TB+ of data with heavy write loads, or when you sustain more than ~50,000 write TPS

Before you shard, try these first:

  1. Connection pooling (PgBouncer) - most Postgres bottlenecks are connection limits, not CPU
  2. Read replicas for read-heavy workloads
  3. Table partitioning for large tables (time-series data, logs)
  4. Better indexing - a missing index is cheaper to add than a shard is to manage
  5. VACUUM tuning - bloated tables kill performance silently

Redis

The fastest option on this list - because everything lives in memory. The Redis documentation covers benchmarking and configuration.

Throughput:

  • GET/SET without pipelining: 70,000-100,000 ops/sec
  • GET/SET with pipelining (16 commands): 400,000-500,000 ops/sec
  • Redis 8.x with IO threads (16 cores): up to 3.5 million ops/sec with pipelining
  • Sub-millisecond latency for virtually all operations
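The pipelining numbers above are mostly a round-trip story: without pipelining, each command on a connection pays one network round trip. A back-of-envelope model (the 0.1ms RTT is an assumed same-datacenter figure, not a measurement):

```python
# Unpipelined Redis throughput on one connection is bounded by network
# round trips: one command per RTT. Pipelining sends a batch of
# commands per round trip, multiplying throughput by the batch size.
def max_ops_per_sec(rtt_ms, pipeline_depth=1):
    round_trips_per_sec = 1000 / rtt_ms
    return round_trips_per_sec * pipeline_depth

rtt = 0.1  # assumed same-datacenter round trip, in milliseconds
print(max_ops_per_sec(rtt))      # ~10K ops/sec on a single connection
print(max_ops_per_sec(rtt, 16))  # ~160K ops/sec with 16-deep pipelining
```

This is why the benchmark numbers jump 4-5x with pipelining: the server was never the bottleneck, the network was.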

Memory and storage:

  • Everything lives in RAM, so your maximum dataset size is bounded by available memory
  • With 64GB RAM, expect 30-50GB of usable storage (after accounting for fragmentation, metadata overhead, and persistence buffers)
  • Each key has ~50-70 bytes of metadata overhead
  • Persistence options: RDB snapshots (periodic) or AOF (append-only file, near-real-time)
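The per-key overhead matters when you store millions of small values. A rough capacity estimate (the 64-byte overhead is taken from the 50-70 byte range above; 48GB usable is an assumption for a 64GB machine):

```python
# How many keys fit in a given amount of RAM, given per-key metadata
# overhead (~50-70 bytes, per the list above) plus the value itself.
def keys_that_fit(ram_bytes, value_bytes, overhead_bytes=64):
    return ram_bytes // (value_bytes + overhead_bytes)

ram = 48 * 1024**3              # ~48GB usable on a 64GB machine
print(keys_that_fit(ram, 256))  # 256-byte values: ~161 million keys
```

Note that for tiny values the metadata dominates: at 16-byte values, a fifth of your RAM is spent on overhead alone.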

When to shard (Redis Cluster):

  • A single instance gets tight above ~25GB of data or ~100K unpipelined ops/sec
  • Redis Cluster splits the keyspace across nodes when a single instance's memory or throughput is exhausted

Common mistake: Using Redis as a primary database instead of a cache. If your Redis data exceeds 50GB, ask yourself whether this data should be in a disk-based database with Redis as a caching layer in front.

MongoDB

Flexible schema, decent throughput, but the sharding decision is more nuanced than people think. The MongoDB sharding documentation is worth reading before making decisions.

Throughput:

  • Document lookups by indexed field: 20,000-50,000 ops/sec
  • When working set fits in WiredTiger cache: up to 67,000 ops/sec
  • Single-document inserts: 10,000-30,000 ops/sec
  • MongoDB 8.0 improved query performance by ~45% over 7.0 on large datasets

Memory and storage:

  • WiredTiger cache: 50% of RAM minus 1GB (default). With 64GB RAM, that’s ~31GB cache
  • Performance is best when working set fits in WiredTiger cache
  • Snappy compression (default) gives 2-4x compression ratio
  • Max document size: 16MB
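The default cache-size rule above (50% of RAM minus 1GB, with a 256MB floor) is easy to compute. A sketch:

```python
# Default WiredTiger cache size: the larger of 50% of (RAM - 1GB)
# or 256MB, per the rule quoted above.
def wiredtiger_cache_gb(ram_gb):
    return max(0.5 * (ram_gb - 1), 0.256)

print(wiredtiger_cache_gb(64))  # 31.5 -> matches the ~31GB figure above
print(wiredtiger_cache_gb(1))   # 0.256 -> the 256MB floor
```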

When to shard:

  • Single replica set handles 256-512GB comfortably
  • The commonly cited threshold: shard when data exceeds 256GB or write throughput saturates a single primary
  • MongoDB Atlas supports up to 4TB per shard, but 1TB per shard is the recommended operational limit

Before you shard, try these first:

  1. Better indexes - MongoDB’s explain() is your best friend
  2. Read preference: route reads to secondaries for read-heavy workloads
  3. Schema design - embedding vs referencing makes a huge performance difference
  4. WiredTiger cache tuning

DynamoDB

Different category entirely - it’s a managed service that handles sharding for you. The DynamoDB developer guide explains capacity modes in depth. But you still need to understand the numbers.

Throughput:

  • Per partition: 3,000 read capacity units (RCU) and 1,000 write capacity units (WCU)
  • Eventually consistent reads double read capacity (6,000/partition)
  • Default per-table limit: 40,000 RCU and 40,000 WCU (can be increased)
  • On-demand mode starts at 4,000 WCU / 12,000 RCU and auto-scales to 2x previous peak
  • The service handles 89 million requests/sec globally across all customers

Memory and storage:

  • Each partition holds up to 10GB
  • Max item size: 400KB (much smaller than MongoDB’s 16MB)
  • Table size: unlimited - partitions are added automatically
  • Latency: single-digit milliseconds. With DAX (in-memory cache): microsecond reads
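The per-partition limits combine into a rough partition-count estimate: DynamoDB needs enough partitions to cover your RCU, WCU, and storage simultaneously, whichever is largest. A sketch of that arithmetic (illustrative only; real partition management is internal to the service):

```python
import math

# Rough partition count: enough partitions to satisfy throughput
# (3,000 RCU / 1,000 WCU each) and storage (10GB each) at once.
# Illustrative only; actual partition management is internal to DynamoDB.
def estimated_partitions(rcu, wcu, size_gb):
    return math.ceil(max(rcu / 3000, wcu / 1000, size_gb / 10))

print(estimated_partitions(6000, 500, 15))  # 2 -> driven by reads
print(estimated_partitions(1000, 4000, 5))  # 4 -> driven by writes
```

Notice the asymmetry: a write-heavy table needs three times as many partitions as an equally read-heavy one.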

The catch - hot partitions:

DynamoDB auto-shards, but all traffic for a partition key hits one partition. If your partition key is date and everyone queries today’s data, you’ll hit the 3,000 RCU / 1,000 WCU limit on that single partition while other partitions sit idle.

Bad:  partition_key = "2026-03-14"  (all traffic hits one partition)
Good: partition_key = "user_12345"  (traffic spreads across partitions)
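You can see the effect with a quick simulation: hash each key to one of N partitions and count where the traffic lands (the keys and the MD5-based hash are hypothetical; DynamoDB's internal hashing differs, but the distribution argument is the same):

```python
import hashlib
from collections import Counter

# Map a partition key to one of `n` partitions via a stable hash.
# Hypothetical scheme for illustration; DynamoDB's internal hash differs.
def partition_for(key, n=8):
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

# 10,000 requests for today's date: every one lands on one partition.
hot = Counter(partition_for("2026-03-14") for _ in range(10_000))
print(len(hot))  # 1 -> a single partition takes all the traffic

# 10,000 requests keyed by user id: traffic spreads across partitions.
spread = Counter(partition_for(f"user_{i}") for i in range(10_000))
print(len(spread))  # 8 -> all partitions share the load
```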

When DynamoDB makes sense:

  • You need zero operational overhead for scaling
  • Access patterns are simple (key-value or key-range lookups)
  • You can model your data around partition keys effectively
  • You’re already in the AWS ecosystem

When it doesn’t: Complex queries, joins, aggregations, or ad-hoc analytics. You’ll pay a fortune in RCUs scanning tables.

Cassandra

Built for writes. Built for distribution. Not built for single-node deployments. The Cassandra documentation covers its architecture and tuning.

Throughput (per node):

  • Writes: ~30,000 ops/sec (mean latency 1.7ms, P99 5.5ms)
  • Reads: 15,000-25,000 ops/sec
  • Cassandra 4.0 improved throughput by 25-33% over 3.11
  • Write-optimized by design: append-only log-structured merge tree means writes are always fast

Memory and storage:

  • Modest RAM needs: 16-64GB recommended per node
  • Heap: 8-16GB. Remaining RAM used for OS page cache
  • Recommended data per node: 500GB-1TB
  • Maximum: 3-5TB per node (but bootstrap and repair times become painful beyond 1TB)
  • Leave 25-50% free disk space for compaction overhead
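The per-node limits above translate into cluster sizing once you factor in replication: usable capacity is raw capacity divided by the replication factor. A sketch (RF=3 and 1TB/node are the assumptions, matching the recommendations above):

```python
import math

# Nodes needed for a target dataset, given per-node data limits and
# replication factor. RF=3 is the common production default.
def nodes_needed(data_tb, per_node_tb=1.0, replication_factor=3):
    raw_tb = data_tb * replication_factor   # every row is stored RF times
    return max(3, math.ceil(raw_tb / per_node_tb))  # 3-node minimum

print(nodes_needed(10))   # 30 -> 10TB at RF=3, 1TB per node
print(nodes_needed(0.5))  # 3  -> the minimum recommended cluster size
```

This is why Cassandra only pays off at real scale: 10TB of logical data means 30 nodes to operate, not 10.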

Cassandra is always distributed:

Unlike the other databases on this list, Cassandra is designed to run as a cluster from day one. A single-node Cassandra deployment is not a recommended production configuration. Minimum recommended: 3 nodes.

When Cassandra makes sense:

  • Write-heavy workloads (logging, time-series, IoT sensor data)
  • Multi-region deployments with tunable consistency
  • Very large datasets (10TB+) that need to scale linearly
  • You’re okay with limited query flexibility (no joins, limited aggregations)

When it doesn’t: Read-heavy workloads with complex queries. Cassandra’s read path (checking multiple SSTables, bloom filters, merge) is fundamentally slower than its write path.

The Decision Framework

Before reaching for sharding, walk through this checklist:

Step 1: Know your actual numbers

| Question | How to find out |
|---|---|
| Current ops/sec? | Application metrics, database slow query log |
| Data size? | pg_database_size(), db.stats(), CloudWatch |
| Growth rate? | Monthly data size trend over the last 6 months |
| Read vs write ratio? | Application metrics or database audit logs |
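Once you have those four numbers, the most useful derived figure is runway: how long until you cross a scaling threshold at your current growth rate. A sketch:

```python
# Months of runway before hitting a data-size threshold, given current
# size and a monthly growth figure (from the 6-month trend above).
def runway_months(current_gb, limit_gb, growth_gb_per_month):
    if growth_gb_per_month <= 0:
        return float("inf")  # not growing: no deadline
    return (limit_gb - current_gb) / growth_gb_per_month

# 200GB today, growing 25GB/month, planning against a 1TB threshold:
print(runway_months(200, 1000, 25))  # 32.0 months -> no rush to shard
```

Nearly three years of runway is a very different conversation than "we need to shard this quarter."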

Step 2: Are you at the limit?

| Database | You’re fine | Getting tight | Time to scale |
|---|---|---|---|
| PostgreSQL | <10K mixed TPS, <200GB | 10K-50K TPS, 200GB-1TB | >50K write TPS, >1TB |
| Redis | <50K ops/sec, <10GB | 50K-100K ops/sec, 10-25GB | >100K ops/sec, >25GB |
| MongoDB | <10K ops/sec, <100GB | 10K-30K ops/sec, 100-256GB | >30K ops/sec, >256GB |
| DynamoDB | <1K WCU/partition | Approaching 1K WCU/partition | Hot partition throttling |
| Cassandra | <10K writes/node | 10K-25K writes/node | >25K writes/node, >1TB/node |

Step 3: Try cheaper alternatives first

Most “we need to shard” conversations should actually be:

  • “We need better indexes”
  • “We need a read replica”
  • “We need connection pooling”
  • “We need to cache hot data in Redis”
  • “We need to archive cold data”

Sharding adds permanent operational complexity. Every query needs routing logic. Cross-shard joins don’t exist. Schema migrations happen N times instead of once. Rebalancing shards is a multi-day operation.

Real-World Scale References

To calibrate your intuition, here’s what well-known companies run on what:

| Company | Database | Scale | Notes |
|---|---|---|---|
| Notion | PostgreSQL | 30M users, single node (until 2021) | Sharded only when tables hit multiple TB |
| Instagram | PostgreSQL | Billions of rows | Sharded, but started single-node |
| Discord | Cassandra (migrated to ScyllaDB) | Trillions of messages | Write-heavy, needed Cassandra’s write throughput |
| Uber | Cassandra + Redis | Millions of trips/day | Cassandra for durable writes, Redis for real-time |
| Lyft | DynamoDB | Millions of rides | Chose DynamoDB for zero-ops scaling |
| Coinbase | MongoDB (migrated to PostgreSQL) | Millions of transactions | Hit MongoDB limits, moved to Postgres |

Why Not Just Use Postgres For Everything?

If Postgres handles 100K+ reads and 20K-70K writes per second on a single node, supports JSON, full-text search, and even time-series with extensions - why bother with anything else?

For most teams, you shouldn’t. Postgres covers 90% of use cases. But here’s the honest 10% where it doesn’t:

| Problem | Why Postgres Struggles | What to Use Instead |
|---|---|---|
| Sub-millisecond cache reads | Disk-based, can’t guarantee <1ms | Redis |
| 100K+ writes/sec sustained | Single-node write scaling hits a wall, sharding is painful | Cassandra, DynamoDB |
| Schema changing every sprint | ALTER TABLE on a 500GB table can lock for minutes | MongoDB |
| Multi-region active-active writes | Postgres multi-master is still immature | Cassandra, DynamoDB, CockroachDB |
| Zero-ops at any scale | Postgres needs vacuuming, connection pooling, replica management, index tuning | DynamoDB |
| 10TB+ with linear scaling | Postgres at 10TB is a full-time DBA job | Cassandra, DynamoDB |
| Session/rate-limit storage | Overkill, and you need TTL expiry built-in | Redis |

The pragmatic approach: Start with Postgres. Add Redis as a cache layer when you need sub-millisecond reads or session storage. That combination handles more scale than 95% of companies will ever need. Only reach for Cassandra, DynamoDB, or MongoDB when you hit a specific wall that Postgres genuinely can’t solve - not because a blog post told you “NoSQL scales better.”

Instagram, Notion, and OpenAI all run on Postgres. If it’s good enough for them at their scale, it’s probably good enough for your startup.

The One Mistake Everyone Makes

Sharding on day one because the system design textbook said so. In reality:

  • A well-tuned PostgreSQL node handles more traffic than 95% of startups will ever see
  • Adding a read replica doubles your read capacity with near-zero complexity
  • Redis in front of your database eliminates most read scaling problems
  • Vertical scaling (bigger machine) is underrated - doubling your cores is cheaper than managing a distributed database

The right time to shard is when you’ve exhausted vertical scaling, read replicas, caching, and query optimization - and you still can’t keep up. For most teams, that day never comes.

Bottom Line

Know the numbers before you architect. A single PostgreSQL node does 100K+ reads/sec. Redis does 500K+ with pipelining. MongoDB handles 256GB before blinking. These are big numbers - bigger than most applications will ever need. Start simple, measure everything, and scale only when the metrics tell you to - not when your gut does.