1. Requirements & Scope (5 min)
Functional Requirements
- Rider requests a ride (pickup location, destination) and the system finds the best available driver within seconds
- Drivers continuously report their GPS location; the system maintains a real-time view of all available drivers
- Matching algorithm considers distance, ETA, driver rating, vehicle type, and supply-demand balance
- Support ride types: standard, pool/shared, premium, XL — each with different matching criteria
- Handle surge pricing zones and dynamically adjust pricing based on supply-demand ratio
Non-Functional Requirements
- Availability: 99.99% — riders cannot request rides if matching is down. Revenue loss is immediate.
- Latency: Match a rider to a driver within 3-5 seconds. Location updates processed within 1 second.
- Consistency: A driver must never be matched to two riders simultaneously (strong consistency on driver state). Ride pricing can be eventually consistent.
- Scale: 5M concurrent drivers, 20M rides/day, ~1.25M location updates/sec globally (5M drivers reporting every 4 seconds — see Estimation below).
- Freshness: Driver location must be accurate within 5 seconds for meaningful ETA calculations.
2. Estimation (3 min)
Traffic
- Location updates: 5M drivers × 1 update/4 seconds = 1.25M updates/sec
- Ride requests: 20M rides/day = ~230 rides/sec (peak 5x = 1,150/sec)
- Match queries: Each ride request queries nearby drivers → geospatial index lookup (1,150 QPS at peak)
- ETA calculations: Per match attempt, compute ETA for ~10 candidate drivers = 11,500 ETA computations/sec at peak
Storage
- Driver locations: 5M drivers × 64 bytes (id + lat/lng + timestamp + status) = 320MB (fits in memory)
- Ride history: 20M rides/day × 1KB × 365 = 7.3TB/year
- Geospatial index: In-memory spatial index of 5M points = ~500MB
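The arithmetic above can be sanity-checked with a short script (constants taken straight from this section; 1KB per ride is approximated as 1,000 bytes):

```python
# Back-of-envelope check of the traffic and storage estimates above.
DRIVERS = 5_000_000
UPDATE_INTERVAL_S = 4
RIDES_PER_DAY = 20_000_000
PEAK_FACTOR = 5
SECONDS_PER_DAY = 86_400

location_updates_per_sec = DRIVERS // UPDATE_INTERVAL_S           # 1.25M/sec
ride_requests_per_sec = RIDES_PER_DAY // SECONDS_PER_DAY          # ~230/sec
peak_requests_per_sec = ride_requests_per_sec * PEAK_FACTOR       # ~1,150/sec

driver_state_bytes = DRIVERS * 64                                 # 320 MB
ride_history_bytes_per_year = RIDES_PER_DAY * 1_000 * 365         # ~7.3 TB

print(location_updates_per_sec, peak_requests_per_sec, driver_state_bytes)
```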
Key Insight
This is a real-time geospatial matching problem. The core challenge is maintaining an up-to-date spatial index of 5M moving points with 1.25M updates/sec, then efficiently querying it to find optimal matches. The matching itself is a constrained optimization problem (minimize total wait time across all concurrent requests, not just each individual one).
3. API Design (3 min)
Driver Location Updates
POST /v1/drivers/{driver_id}/location
Body: { "lat": 37.7749, "lng": -122.4194, "heading": 270,
"speed_mph": 25, "timestamp": 1708632060, "status": "available" }
Response 200: { "ack": true }
// Sent every 4 seconds when driver app is active
Ride Request
POST /v1/rides/request
Body: {
"rider_id": "rider_123",
"pickup": { "lat": 37.7749, "lng": -122.4194, "address": "123 Market St" },
"destination": { "lat": 37.7849, "lng": -122.4094, "address": "456 Mission St" },
"ride_type": "standard", // standard, pool, premium, xl
"payment_method_id": "pm_abc"
}
Response 202: { "ride_id": "ride_xyz", "status": "matching",
"estimated_pickup_eta": 180, "surge_multiplier": 1.2 }
Match Notification (push to driver)
// Server → Driver via WebSocket / push notification
{
"type": "ride_offer",
"ride_id": "ride_xyz",
"pickup": { "lat": 37.7749, "lng": -122.4194 },
"destination": { "lat": 37.7849, "lng": -122.4094 },
"estimated_fare": "$12-15",
"accept_timeout_seconds": 15
}
// Driver accepts/declines
POST /v1/rides/{ride_id}/respond
Body: { "driver_id": "driver_456", "action": "accept" }
Key Decisions
- Ride request returns 202 (Accepted) immediately — matching happens async, result pushed via WebSocket
- Driver has 15 seconds to accept before the offer goes to the next candidate
- Location updates are fire-and-forget (no response body needed) to minimize latency
4. Data Model (3 min)
Real-Time Driver State (Redis + Geospatial Index)
// Redis GeoSet for spatial queries
GEOADD drivers:available {lng} {lat} {driver_id}
GEORADIUS drivers:available {lng} {lat} 5 km COUNT 20 ASC
// Redis Hash for driver metadata
HSET driver:{driver_id}
status: "available" // available, on_trip, offline
lat: 37.7749
lng: -122.4194
heading: 270
speed: 25
vehicle_type: "standard"
rating: 4.85
last_update: 1708632060
current_ride_id: null
Rides (PostgreSQL — ACID for ride lifecycle)
Table: rides
ride_id (PK) | uuid
rider_id (FK) | uuid
driver_id (FK) | uuid -- NULL until matched
status | enum('matching', 'driver_assigned', 'en_route_pickup',
'arrived', 'in_progress', 'completed', 'cancelled')
pickup_lat | decimal(9,6)
pickup_lng | decimal(9,6)
dest_lat | decimal(9,6)
dest_lng | decimal(9,6)
ride_type | enum('standard', 'pool', 'premium', 'xl')
surge_multiplier | decimal(3,2)
fare_cents | int -- NULL until completed
requested_at | timestamp
matched_at | timestamp
completed_at | timestamp
Geospatial Index (H3 or S2 cells, in-memory)
// H3 resolution 9 (~174m hexagons) for fine-grained driver lookup
Map<H3CellIndex, Set<DriverId>>
// On location update:
//   1. Remove driver from old cell
//   2. Add driver to new cell
// On query:
//   Expand to neighboring cells for radius search
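A minimal sketch of the cell-index maintenance, using a flat lat/lng grid as a stand-in for real H3 cells (the `cell_of` bucketing is an illustrative assumption, not H3's actual indexing):

```python
from collections import defaultdict

CELL_DEG = 0.002  # ~200m buckets — a rough stand-in for H3 resolution 9

def cell_of(lat: float, lng: float) -> tuple:
    """Map a coordinate to a grid cell (toy replacement for h3.latlng_to_cell)."""
    return (int(lat / CELL_DEG), int(lng / CELL_DEG))

class DriverIndex:
    def __init__(self):
        self.cell_to_drivers = defaultdict(set)  # Map<cell, Set<driver_id>>
        self.driver_to_cell = {}

    def update_location(self, driver_id: str, lat: float, lng: float):
        new_cell = cell_of(lat, lng)
        old_cell = self.driver_to_cell.get(driver_id)
        if old_cell == new_cell:
            return  # still in the same cell — nothing to move
        if old_cell is not None:
            self.cell_to_drivers[old_cell].discard(driver_id)  # remove from old cell
        self.cell_to_drivers[new_cell].add(driver_id)          # add to new cell
        self.driver_to_cell[driver_id] = new_cell

idx = DriverIndex()
idx.update_location("d1", 37.7749, -122.4194)
idx.update_location("d1", 37.7790, -122.4194)  # driver moved into another cell
```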
Why Redis GeoSet + H3?
- Redis GeoSet provides built-in GEORADIUS for quick nearest-driver lookups
- H3 hexagonal grid used as a secondary index for supply-demand heat maps and surge pricing zones
- PostgreSQL for ride records (durability and relational queries for billing, analytics)
5. High-Level Design (12 min)
Architecture
Driver path:
  Driver App → (location updates every 4s) → Location Service
    → Kafka (location-updates topic)
    → Location Processor:
      1. Update Redis GeoSet
      2. Update H3 supply map
      3. Update driver metadata

Rider path:
  Rider App → (ride request) → Ride Service
    → Matching Service:
      → Find candidates (geospatial query)
      → Rank candidates (scoring)
      → Dispatch to best driver
      → Wait for accept/decline
        → If declined or timed out → next candidate
        → If accepted → ride confirmed

Supply-Demand Service
  → Reads H3 supply map
  → Reads ride request rate
  → Computes surge multiplier per zone

Notification Service
  → Push ride offer to driver
  → Push match result to rider

ETA Service
  → OSRM / Valhalla routing engine
  → Computes driving time from driver to pickup
  → Considers real-time traffic data
Matching Flow (Critical Path)
Rider requests ride
→ Ride Service creates ride record (status: matching)
→ Matching Service triggered:
1. Query Redis GeoSet: GEORADIUS from pickup, 5km radius, top 20 nearest
2. Filter: only drivers with matching vehicle_type, status=available
3. For each candidate (parallel):
→ ETA Service: compute drive time from driver location to pickup
4. Score each candidate:
score = w1/ETA + w2*driver_rating + w3*acceptance_rate - w4*surge_distance
5. Sort by score, pick top candidate
6. Send ride offer to driver #1 via WebSocket
7. Start 15-second timer
→ If accepted: lock driver (status=on_trip), notify rider
→ If declined/timeout: offer to driver #2
→ After 3 declines: expand radius to 10km, repeat
→ After 5 total attempts: notify rider "no drivers available"
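Steps 4-5 of the flow above can be sketched as follows; the weights, field names, and candidate values are illustrative assumptions (the surge-distance term is omitted for brevity):

```python
def score(candidate: dict, w1=100.0, w2=1.0, w3=2.0) -> float:
    """Higher is better: favor low ETA, high rating, high acceptance rate.
    Weights are made up for illustration; real weights would be tuned."""
    return (w1 / candidate["eta_s"]
            + w2 * candidate["rating"]
            + w3 * candidate["acceptance_rate"])

def rank_candidates(candidates: list) -> list:
    """Score every candidate, then sort best-first to get the offer order."""
    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"driver_id": "d1", "eta_s": 180, "rating": 4.9, "acceptance_rate": 0.95},
    {"driver_id": "d2", "eta_s": 120, "rating": 4.7, "acceptance_rate": 0.80},
    {"driver_id": "d3", "eta_s": 300, "rating": 5.0, "acceptance_rate": 0.99},
]
offer_order = [c["driver_id"] for c in rank_candidates(candidates)]
```

With these toy weights, a slightly slower but more reliable driver can outrank the lowest-ETA one, which is exactly why the score is a weighted blend rather than pure ETA.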
Components
- Location Service: Ingests 1.25M location updates/sec. Publishes to Kafka for async processing. Stateless, horizontally scaled.
- Location Processor: Consumes Kafka, updates Redis GeoSet and H3 supply map. Partitioned by city/region.
- Ride Service: Manages ride lifecycle state machine. Saga coordinator for the match-dispatch-payment flow.
- Matching Service: Core intelligence. Queries geospatial index, runs scoring algorithm, orchestrates the offer-accept cycle.
- ETA Service: Wraps OSRM (Open Source Routing Machine) with real-time traffic adjustments. Caches common routes.
- Supply-Demand Service: Computes surge multipliers per H3 cell by comparing available drivers to incoming ride requests over a rolling 5-minute window.
- Notification Service: WebSocket connections to all active drivers and riders. Delivers match offers and status updates.
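The ETA Service's route caching could look like the sketch below: a short-TTL cache keyed by coarse origin/destination buckets (the ~1km bucketing and 60s TTL are assumptions; the routing engine is treated as an opaque callable):

```python
import time

class EtaCache:
    """Cache ETAs by coarse ~1km origin/destination buckets for a short TTL,
    since traffic-adjusted ETAs go stale quickly."""

    def __init__(self, compute_eta, ttl_s: float = 60.0, clock=time.monotonic):
        self.compute_eta = compute_eta  # e.g. a call into OSRM/Valhalla
        self.ttl_s = ttl_s
        self.clock = clock
        self._cache = {}

    def _bucket(self, lat, lng):
        return (round(lat, 2), round(lng, 2))  # ~1km grid bucket

    def eta(self, origin, dest):
        key = (self._bucket(*origin), self._bucket(*dest))
        hit = self._cache.get(key)
        now = self.clock()
        if hit and now - hit[1] < self.ttl_s:
            return hit[0]  # fresh cached ETA — skip the routing engine
        value = self.compute_eta(origin, dest)
        self._cache[key] = (value, now)
        return value
```

Nearby pickups within the same bucket reuse one routing computation, which matters at ~11.5K ETA computations/sec at peak.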
6. Deep Dives (15 min)
Deep Dive 1: Geospatial Indexing with H3 and Efficient Nearest-Driver Search
The problem: Finding the nearest available drivers among 5M moving points, where each point moves every 4 seconds. Traditional R-tree spatial indexes degrade with high update rates.
Solution: H3 hexagonal grid + Redis GeoSet hybrid
H3 indexing:
- Uber’s H3 system divides the earth into hexagonal cells at multiple resolutions
- Resolution 7 (~5.16 km² per cell): used for surge pricing zones
- Resolution 9 (~0.105 km² per cell, ~174 m edge length): used for driver lookup
- Each driver is assigned to exactly one H3 cell at resolution 9
Driver lookup algorithm:
1. Convert pickup location to H3 cell at resolution 9
2. Get the cell's k-ring neighbors (k=1 → 7 cells, k=2 → 19 cells, k=3 → 37 cells)
3. For each cell, get all available drivers (O(1) lookup per cell)
4. Compute straight-line distance to filter obvious non-candidates
5. For top 10-15 candidates, compute actual driving ETA via routing engine
6. If fewer than 5 candidates found, expand k-ring (k++)
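The widening search in steps 2-3 and 6 can be sketched with a square grid standing in for H3's hexagons (Chebyshev rings are an assumption here; a real implementation would use H3's `grid_disk`):

```python
def k_ring(cell, k):
    """All grid cells within Chebyshev distance k — a square stand-in
    for H3's hexagonal k-ring."""
    cx, cy = cell
    return [(cx + dx, cy + dy)
            for dx in range(-k, k + 1) for dy in range(-k, k + 1)]

def find_candidates(cell_to_drivers, pickup_cell, min_candidates=5, max_k=4):
    """Gather drivers from the ring, widening k until enough are found."""
    found = set()
    k = 1
    while k <= max_k:
        found = set()
        for cell in k_ring(pickup_cell, k):
            found |= cell_to_drivers.get(cell, set())  # O(1) lookup per cell
        if len(found) >= min_candidates:
            return found, k
        k += 1  # too few candidates: expand the ring
    return found, max_k

cells = {(0, 1): {"a"}, (1, 0): {"b"}, (2, 0): {"c"},
         (0, 2): {"d"}, (-2, 0): {"e"}, (0, -2): {"f"}}
found, k = find_candidates(cells, (0, 0))
# k=1 yields only 2 drivers, so the search widens to k=2 and finds all 6
```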
Why H3 over S2 or geohash?
- Hexagonal cells have uniform distance from center to all edges (unlike square geohash cells where corners are sqrt(2) farther)
- All neighbors are equidistant (6 neighbors, each sharing an edge) — cleaner radius expansion
- H3 is what Uber actually uses in production, battle-tested for ride matching
Update throughput:
- Each location update: 1 Redis GEOADD (O(log N)) + 1 H3 cell reassignment (O(1) hash map update)
- 1.25M updates/sec distributed across 10 city-level shards = 125K updates/shard/sec
- Redis benchmarks show 500K+ GEOADD/sec on a single instance — well within budget
Deep Dive 2: Matching Algorithm — Closest vs. ETA-Based vs. Global Optimization
Closest driver (naive):
Sort drivers by straight-line distance to pickup → offer to closest
- Problem: Closest driver may be across a river, on a highway, or facing the wrong direction. A driver 1km away but heading toward the rider may arrive faster than one 500m away driving in the opposite direction.
ETA-based matching (Uber’s approach):
For each candidate driver:
ETA = routing_engine.compute_time(driver.location, pickup.location, traffic=real_time)
score = -ETA // minimize ETA
Pick driver with lowest ETA
- Better than distance but still greedy — optimizes each ride independently.
Batched global optimization (advanced):
Every 2 seconds, collect all unmatched ride requests in a city:
requests = [R1, R2, R3, ...]
available_drivers = [D1, D2, D3, ...]
Solve assignment problem:
Minimize total_wait_time = sum(ETA(Ri, Dj)) for all matched pairs (Ri, Dj)
Subject to: each driver assigned to at most 1 ride, each ride gets exactly 1 driver
Algorithm: Hungarian algorithm or auction algorithm
- Hungarian: O(n^3) — works for batches of ~100 requests
- Auction algorithm: faster in practice for sparse assignment matrices
Why batched matching is better:
Example with 2 riders (A, B) and 2 drivers (X, Y):
ETA(A→X) = 3 min, ETA(A→Y) = 10 min
ETA(B→X) = 4 min, ETA(B→Y) = 5 min
Greedy, A first: A takes X (3 min), B takes Y (5 min) → total = 8 min ✓
Greedy, B first: B takes X (4 min), A takes Y (10 min) → total = 14 min ✗
Batched optimal: A→X, B→Y → total = 8 min regardless of arrival order
Greedy results depend on which request arrived first; batching removes that sensitivity.
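A tiny brute-force check of the greedy-vs-batched comparison (the ETA matrix is a hand-picked illustration; real batches would use the Hungarian or auction algorithm rather than enumerating permutations):

```python
from itertools import permutations

etas = {  # ETA(rider, driver) in minutes — illustrative numbers
    ("A", "X"): 3, ("A", "Y"): 10,
    ("B", "X"): 4, ("B", "Y"): 5,
}
riders, drivers = ["A", "B"], ["X", "Y"]

def greedy_total(order):
    """Match riders one at a time in arrival order,
    each taking the best still-free driver."""
    free, total = set(drivers), 0
    for r in order:
        best = min(free, key=lambda d: etas[(r, d)])
        total += etas[(r, best)]
        free.remove(best)
    return total

def optimal_total():
    """Batched: enumerate every full assignment, take the minimum total wait."""
    return min(sum(etas[(r, d)] for r, d in zip(riders, perm))
               for perm in permutations(drivers))

print(greedy_total(["A", "B"]), greedy_total(["B", "A"]), optimal_total())
```

Greedy totals 8 or 14 minutes depending on arrival order; the batched optimum is always 8.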
Production approach: hybrid
- Immediate requests (rider waiting > 5 seconds) → greedy ETA-based matching
- Fresh requests (within 2-second batch window) → batched optimization
- This bounds worst-case wait time while improving average
Deep Dive 3: Supply-Demand Balancing and Surge Pricing
The problem: At 2 AM, there are few drivers. During a concert ending, there are 5,000 ride requests in 5 minutes from one location. Without surge pricing, all riders wait 30+ minutes.
Surge computation:
For each H3 cell at resolution 7 (city-block level):
supply = count(available_drivers in cell)
demand = count(ride_requests in last 5 minutes)
surge_multiplier = max(1.0, demand / (supply * capacity_factor))
capacity_factor = 3 (each driver can serve ~3 rides per hour)
Example: 10 drivers, 50 requests → 50 / (10 * 3) = 1.67x surge
Capped at 3.0x to prevent PR disasters
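The surge formula above as a small function (the 1.0x floor, 3.0x cap, and capacity factor of 3 come from this section; the zero-supply behavior is an assumption):

```python
def surge_multiplier(available_drivers: int, requests_last_5m: int,
                     capacity_factor: float = 3.0, cap: float = 3.0) -> float:
    """demand / (supply * capacity_factor), floored at 1.0 and capped at 3.0."""
    if available_drivers == 0:
        return cap  # no supply at all: assume the maximum allowed surge applies
    raw = requests_last_5m / (available_drivers * capacity_factor)
    return round(min(cap, max(1.0, raw)), 2)

surge_multiplier(10, 50)   # the worked example: 50 / (10 * 3) = 1.67x
surge_multiplier(100, 50)  # ample supply → floored at 1.0x
surge_multiplier(2, 100)   # extreme imbalance → capped at 3.0x
```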
Dynamic supply repositioning:
- When surge is detected in a zone, send “demand heat map” to nearby idle drivers
- Nudge message: “Higher earnings in Financial District — 1.8x surge”
- Predictive: use historical data (events, weather, time of day) to pre-position drivers
- Stadium event ending at 10 PM → start nudging drivers to the area at 9:45 PM
Surge pricing mechanics:
Rider requests ride in surge zone:
1. Show surge multiplier upfront: "1.5x surge pricing — estimated fare $18-22"
2. Rider must confirm ("I accept the higher fare")
3. Surge multiplier locked at time of request (doesn't change during matching)
4. Driver sees earnings boost: base_fare * surge_multiplier
Fairness safeguards:
- Surge multiplier smoothed over 5-minute windows (no sudden 1.0x → 3.0x spikes)
- Maximum surge cap: 3.0x (configurable per city/regulation)
- During natural disasters: surge automatically disabled (regulatory requirement)
- Wait time guarantee: if surge is active, estimated pickup time must be under 10 minutes
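One way to implement the smoothing safeguard is to bound how far the published multiplier can move per update window; this rate-limit approach and the 0.25x step size are assumptions, not a stated mechanism:

```python
def smooth_surge(previous: float, target: float, max_step: float = 0.25) -> float:
    """Move the published multiplier toward the computed target by at most
    max_step per 5-minute window, so 1.0x can't jump straight to 3.0x."""
    delta = max(-max_step, min(max_step, target - previous))
    return round(previous + delta, 2)

# A sudden demand spike: the raw target jumps from 1.0x to 3.0x,
# but the published surge ramps up gradually.
published = 1.0
ramp = []
for _ in range(4):
    published = smooth_surge(published, 3.0)
    ramp.append(published)
# ramp climbs in 0.25x steps instead of spiking
```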
7. Extensions (2 min)
- Ride pooling (UberPool): Match multiple riders heading in similar directions to the same driver. Solve a vehicle routing problem (VRP) — group requests with compatible routes, minimize total detour. Offer riders a discount (30%) for sharing.
- Driver ETA prediction with ML: Replace static routing with ML model trained on historical trip data. Features: time of day, weather, events, road conditions. Predict actual arrival time more accurately than shortest-path algorithms.
- Multi-modal matching: Extend to bikes, scooters, and public transit. Rider requests “cheapest option” → system returns options: UberX ($12, 8 min), scooter ($3, 15 min), bus+walk ($2.50, 25 min).
- Scheduled rides: Rider books a ride for 6 AM tomorrow. System pre-assigns a driver 15 minutes before, guaranteeing availability. Requires demand forecasting and driver incentive system.
- International expansion: Each city runs as an independent cell (its own Redis, matching service). No cross-city dependency. Deploy in new cities by spinning up a city-cell and configuring local parameters (base fare, surge caps, vehicle types).