1. Requirements & Scope (5 min)
Functional Requirements
- Users can upload, download, and delete files from any device
- Files automatically sync across all connected devices
- Users can share files/folders with other users (view/edit permissions)
- File versioning — users can view and restore previous versions
- Offline support — changes made offline sync when connectivity resumes
Non-Functional Requirements
- Availability: 99.99% — users depend on this for critical documents
- Durability: 99.999999999% (11 nines) — losing a user’s file is unacceptable
- Latency: Small file sync < 5 seconds end-to-end between devices. Large file upload should show progress and be resumable.
- Consistency: Strong consistency for file metadata (rename, move, delete must be immediately reflected). Eventual consistency acceptable for sync propagation to other devices (within seconds).
- Scale: 500M users, 100M DAU, average user stores 10GB, peak uploads 10M files/hour
- Bandwidth efficiency: Only transfer changed parts of files (delta sync)
2. Estimation (3 min)
Storage
- 500M users × 10GB avg = 5 exabytes (5,000PB) total storage
- This is the defining constraint — everything revolves around efficient storage
Traffic
- 10M file uploads/hour ÷ 3600 = ~2,800 uploads/sec
- Average file: 500KB → 1.4GB/sec upload bandwidth
- Sync events (metadata changes): 10x file uploads = 28,000 events/sec
Metadata
- Average user: 2,000 files → 500M × 2,000 = 1 trillion file metadata records
- Each record: ~500 bytes → 500TB metadata
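The estimates above are plain arithmetic and can be sanity-checked in a few lines:

```python
# Back-of-envelope check of the numbers above.
USERS = 500_000_000
AVG_STORAGE_GB = 10
UPLOADS_PER_HOUR = 10_000_000
AVG_FILE_KB = 500
FILES_PER_USER = 2_000
METADATA_BYTES = 500

total_storage_pb = USERS * AVG_STORAGE_GB / 1_000_000          # GB -> PB
uploads_per_sec = UPLOADS_PER_HOUR / 3600
upload_gb_per_sec = uploads_per_sec * AVG_FILE_KB / 1_000_000  # KB -> GB
metadata_records = USERS * FILES_PER_USER
metadata_tb = metadata_records * METADATA_BYTES / 1e12

print(total_storage_pb)             # 5000.0 PB = 5 EB
print(round(uploads_per_sec))       # ~2,778 uploads/sec
print(round(upload_gb_per_sec, 2))  # ~1.39 GB/sec
print(metadata_tb)                  # 500.0 TB
```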
3. API Design (3 min)
// File operations
POST /api/v1/files/upload
Headers: Content-Range: bytes 0-1048575/5242880 // chunked upload
Body: <binary chunk>
Response 200: { "upload_id": "up_123", "next_offset": 1048576 }
POST /api/v1/files/upload/complete
Body: { "upload_id": "up_123", "filename": "doc.pdf", "parent_id": "folder_abc" }
Response 201: { "file_id": "f_xyz", "version": 1 }
GET /api/v1/files/{file_id}/download
Response 302: redirect (Location header) to pre-signed S3 URL
GET /api/v1/files/{file_id}/versions
Response 200: [{ "version": 3, "size": 524288, "modified_at": "...", "modified_by": "..." }]
POST /api/v1/files/{file_id}/restore?version=2
// Sync
GET /api/v1/sync/changes?cursor={cursor}
Response 200: {
"changes": [
{ "type": "create", "file_id": "f_xyz", "path": "/docs/notes.md", ... },
{ "type": "modify", "file_id": "f_abc", "version": 3, ... },
{ "type": "delete", "file_id": "f_def", ... }
],
"cursor": "c_next_123",
"has_more": false
}
// Sharing
POST /api/v1/files/{file_id}/share
Body: { "user_email": "[email protected]", "permission": "edit" }
Key decisions:
- Chunked uploads: enables resumable uploads, handles network interruptions, allows parallel chunk uploads (the API example uses 1MB transfer ranges; the storage blocks themselves are 4MB — the two sizes need not match)
- Cursor-based sync: client maintains a cursor, fetches incremental changes since last sync
- Pre-signed download URLs: client downloads directly from S3/CDN, not through our servers
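The cursor-based sync decision can be sketched as a client loop; `fetch_changes` and `apply_change` are hypothetical stand-ins for the `GET /sync/changes` call and the local filesystem update:

```python
def sync(fetch_changes, apply_change, cursor):
    """Pull incremental changes page by page until the server
    reports has_more == False; return the new cursor to persist."""
    while True:
        resp = fetch_changes(cursor)   # {"changes": [...], "cursor": ..., "has_more": ...}
        for change in resp["changes"]:
            apply_change(change)       # create/modify/delete locally
        cursor = resp["cursor"]        # a real client persists this before the next poll
        if not resp["has_more"]:
            return cursor
```

Persisting the cursor only after applying a page means a crash mid-sync re-fetches that page — changes must therefore be idempotent to apply.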
4. Data Model (3 min)
Metadata Store (PostgreSQL, sharded by user_id)
Table: files
file_id (PK) | uuid
user_id | bigint (shard key)
parent_folder_id | uuid (FK → files, nullable for root)
name | varchar(255)
is_folder | boolean
current_version | int
size | bigint
checksum | char(64) -- SHA-256 of content
created_at | timestamptz
modified_at | timestamptz
is_deleted | boolean (soft delete)
Table: file_versions
file_id | uuid (FK)
version | int
size | bigint
checksum | char(64)
s3_key | varchar(200)
chunks | jsonb -- [{ chunk_hash, s3_key, offset, size }]
created_at | timestamptz
PK: (file_id, version)
Table: sharing
file_id | uuid
shared_with_user_id | bigint
permission | enum('view', 'edit')
PK: (file_id, shared_with_user_id)
Block/Chunk Store (S3)
- Files are split into chunks (4MB each)
- Each chunk stored by its content hash: chunks/{sha256_hash}
- Content-addressable storage enables deduplication
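The content-addressed key derivation is trivial but is what makes deduplication work — identical bytes always map to the same key (a minimal sketch, not a real client):

```python
import hashlib

def chunk_key(chunk: bytes) -> str:
    """Derive the content-addressed storage key for a chunk.
    Identical chunk bytes -> identical key -> stored once."""
    return "chunks/" + hashlib.sha256(chunk).hexdigest()
```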
5. High-Level Design (12 min)
Upload Path
Desktop Client detects file change
→ Split file into 4MB chunks
→ Compute SHA-256 hash per chunk
→ Send chunk hashes to Sync Service: "which chunks do you already have?"
→ Sync Service checks Block Metadata DB → returns list of missing chunks
→ Client uploads only missing chunks to Block Storage Service
→ Block Storage Service writes to S3
→ Client sends "upload complete" with chunk list to Sync Service
→ Sync Service:
→ Creates new file_version in PostgreSQL
→ Updates current_version
→ Publishes change event to Message Queue
→ Notification Service pushes to other connected devices
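The client side of the upload path can be sketched as follows; `have_chunks`, `put_chunk`, and `commit` are hypothetical stand-ins for the Sync Service and Block Storage Service calls:

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB storage blocks

def upload_file(data: bytes, have_chunks, put_chunk, commit):
    """Hash every chunk, ask the server which hashes it is missing,
    upload only those, then commit the full chunk manifest so the
    server can create the new file_version."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    hashes = [hashlib.sha256(c).hexdigest() for c in chunks]
    missing = set(have_chunks(hashes))   # "which chunks do you already have?"
    for h, c in zip(hashes, chunks):
        if h in missing:
            put_chunk(h, c)              # upload only missing chunks
    commit(hashes)                       # "upload complete" with chunk list
```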
Download/Sync Path
Desktop Client → long-poll or WebSocket to Notification Service
→ Receives "file X changed" event
→ Client calls GET /sync/changes?cursor=...
→ Gets list of changed files with chunk manifests
→ For each changed file:
→ Identify which chunks client already has locally
→ Download only missing chunks from S3 (via CDN)
→ Reassemble file locally
→ Update local cursor
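The per-file reassembly step can be sketched as below; `fetch_chunk` is a hypothetical stand-in for the CDN/S3 download, and `local_chunks` models the client's chunk cache:

```python
def rebuild_file(manifest, local_chunks, fetch_chunk):
    """Reassemble a file from its chunk manifest (hashes in file order),
    reusing locally cached chunks and fetching only the missing ones."""
    parts = []
    for chunk_hash in manifest:
        if chunk_hash not in local_chunks:
            local_chunks[chunk_hash] = fetch_chunk(chunk_hash)
        parts.append(local_chunks[chunk_hash])
    return b"".join(parts)
```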
Components
- Sync Service: Coordinates uploads/downloads, manages file metadata, computes diffs
- Block Storage Service: Handles chunk upload to S3, deduplication checks
- Notification Service: Real-time push to connected clients (WebSocket/long-poll)
- PostgreSQL (sharded): File metadata, versions, sharing
- S3: Chunk storage (content-addressable)
- Redis: Chunk existence cache, active session tracking
- Kafka: Change event stream for cross-device sync
- CDN: Chunk download acceleration
6. Deep Dives (15 min)
Deep Dive 1: Chunking and Delta Sync
The core insight: When a user edits a 100MB file, they typically change only a few KB. Uploading the entire 100MB again is wasteful. Delta sync uploads only the changed chunks.
How chunking works:
- File is split into variable-size chunks (~4MB on average) using content-defined chunking (CDC)
- CDC uses a rolling hash (Rabin fingerprint) to find chunk boundaries based on content — not fixed offsets
- Why CDC over fixed-size chunks? If you insert 1 byte at the beginning of a file, fixed-size chunking shifts every boundary → all chunks change. CDC boundaries are content-dependent, so only the affected chunk changes.
Deduplication:
- Chunks are stored by SHA-256 hash:
chunks/{hash} - Before uploading, client sends chunk hashes to the server
- Server responds with which chunks already exist (from any user’s files)
- Common files (OS files, popular libraries) are stored once across all users
- Deduplication ratio is typically 50-60% across the entire system
Bandwidth savings example:
- User edits a 100MB file, changing 2 pages in the middle
- File has 25 chunks (4MB each)
- Only 1 chunk actually changed (different hash)
- Upload: 4MB instead of 100MB → 96% bandwidth reduction
Deep Dive 2: Conflict Resolution
The problem: User edits file on laptop (offline), same file is edited on phone (also offline). Both come online — whose version wins?
Strategy: Last-writer-wins + conflict copies
1. Each file has a version number. Each device tracks the last-known version.
2. When a device syncs, it sends its base version + new content.
3. Server checks: is the base version == current version?
   - Yes: Clean update. Increment version, store new content.
   - No: Conflict. Someone else modified the file between your last sync and now.
4. On conflict:
   - Server stores the new upload as a "conflict copy": budget.xlsx (Chirag's laptop's conflicted copy 2026-02-22)
   - Both versions are preserved — no data loss
   - User resolves manually (same as Dropbox's actual behavior)
Why not auto-merge? For text files, we could attempt a three-way merge (like git). But for binary files (DOCX, XLSX, images), auto-merge is impossible. The safest approach is always preserving both versions.
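The server-side version check is essentially a compare-and-swap on the version number; a minimal sketch (`file_meta` and `make_conflict_copy` are illustrative, not a real API):

```python
def commit_version(file_meta, base_version, new_content_key, make_conflict_copy):
    """Accept the write only if the client's base version still matches
    the current version; otherwise preserve the upload as a conflict
    copy so neither side's edits are lost."""
    if base_version == file_meta["current_version"]:
        file_meta["current_version"] += 1
        file_meta["content_key"] = new_content_key
        return "clean_update"
    make_conflict_copy(new_content_key)  # both versions preserved
    return "conflict"
```

In a real system this check must run inside a transaction (or use an atomic conditional update) so two concurrent commits cannot both see the same base version.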
Vector clocks (advanced): For shared folders with many collaborators, we can use vector clocks instead of simple version numbers to track causality across devices more precisely. This helps distinguish “concurrent edits” (true conflict) from “sequential edits that arrived out of order” (not a conflict).
Deep Dive 3: Storage at Exabyte Scale
5 exabytes is the defining challenge. At S3 pricing ($0.023/GB/month), that’s $115M/month in storage alone.
Cost optimization layers:
1. Deduplication (50-60% savings): Content-addressable chunks mean identical files across users are stored once. Popular files (OS updates, npm packages, Docker images) get massive deduplication.
2. Compression (30-40% additional): Compress each chunk before storing. Text/code compresses 5:1, office docs 2:1, already-compressed files (JPEG, MP4) get no benefit. Apply compression selectively based on content type.
3. Tiered storage:
   - Hot tier (S3 Standard): Files accessed in the last 30 days
   - Warm tier (S3 IA): Files last accessed 30-90 days ago (50% cheaper)
   - Cold tier (S3 Glacier): Files not accessed in 90+ days (90% cheaper)
   - Auto-promotion on access: if a cold file is downloaded, move it back to the hot tier
4. Version cleanup: Store all versions for 30 days, then keep only every other version for 30-90 days, then keep only monthly snapshots beyond 90 days. This dramatically reduces version storage.
5. Deleted file cleanup: Soft-deleted files are recoverable for 30 days, then permanently deleted. An S3 lifecycle policy enforces this.
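The version-cleanup schedule can be expressed as a small retention policy; this is one possible interpretation of the schedule (input is version ages in days, newest first; the index-parity rule for "every other version" is illustrative):

```python
def versions_to_keep(version_ages_days):
    """Return indices of versions to retain: keep everything under
    30 days old, every other version between 30 and 90 days, and
    one snapshot per ~30-day bucket beyond 90 days."""
    keep, seen_buckets = [], set()
    for i, age in enumerate(version_ages_days):
        if age < 30:
            keep.append(i)                 # recent: keep all
        elif age < 90:
            if i % 2 == 0:                 # middle band: every other version
                keep.append(i)
        else:
            bucket = age // 30             # old: one per monthly bucket
            if bucket not in seen_buckets:
                seen_buckets.add(bucket)
                keep.append(i)
    return keep
```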
Result: Effective cost per GB stored drops from $0.023 to ~$0.005-0.008/month after all optimizations.
7. Extensions (2 min)
- Real-time collaboration: For docs/sheets, add OT/CRDT layer for simultaneous editing (Google Docs style). Completely separate infrastructure from file sync.
- Search: Full-text search across file names and content. Extract text from PDFs, DOCX, images (OCR) → index in Elasticsearch.
- Smart sync (selective sync): On devices with limited storage, show file listings but download content only on demand. Store metadata locally, fetch content from cloud.
- Activity feed: Track all file operations (create, edit, share, delete) in an event log. Show as an activity stream in the UI.
- Compliance: GDPR right to deletion, SOC2 audit logs, encryption at rest (AES-256) and in transit (TLS 1.3), region-specific data residency.