1. Requirements & Scope (5 min)

Functional Requirements

  1. Customers can browse nearby restaurants, view menus with real-time item availability, and place orders
  2. Orders flow through a state machine: placed → confirmed by restaurant → preparing → ready for pickup → picked up → delivered
  3. Assign delivery drivers to orders based on proximity, current load, and estimated restaurant prep time
  4. Real-time tracking of driver location from restaurant to customer doorstep
  5. Estimate and display accurate ETAs at each stage (restaurant prep, driver pickup, delivery)

Non-Functional Requirements

  • Availability: 99.99% during meal-time peaks (11 AM - 1 PM, 5 PM - 9 PM). A 10-minute outage during dinner rush can cost millions.
  • Latency: Menu browsing < 100ms, order placement < 500ms, location updates < 1 second.
  • Consistency: Order state must be strongly consistent (no duplicate orders, no lost payments). Menu prices can be eventually consistent (seconds-stale acceptable).
  • Scale: 500K restaurants, 30M orders/month (~12 orders/sec average, peak 100/sec), 200K concurrent drivers.
  • Reliability: Payment capture must be exactly-once. Driver assignment must avoid double-booking.

2. Estimation (3 min)

Traffic

  • Menu browsing: 5M DAU × 5 searches × 3 restaurant views = 75M page loads/day = ~870 QPS (peak 3x = 2,600 QPS)
  • Orders: 1M orders/day = ~12/sec average, peak (dinner rush) = 100/sec
  • Driver location updates: 200K drivers × 1 update/4s = 50K updates/sec
  • Order status updates: 1M orders × 6 status events (creation plus 5 transitions) = 6M events/day = ~70/sec

Storage

  • Restaurants + menus: 500K restaurants × 50 items × 2KB = 50GB
  • Orders: 30M/month × 3KB = 90GB/month, ~1TB/year
  • Driver locations: Real-time only (Redis), ~200K × 64 bytes = 12.8MB in memory
  • Order tracking history: 30M/month × 100 location points × 32 bytes = 96GB/month
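
The figures above can be re-derived with a few lines of arithmetic. This is only a sanity-check sketch: it assumes a 30-day month and decimal units (1 KB = 1,000 bytes), and the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


# Traffic (inputs taken from the estimates above)
page_loads_per_day = 5_000_000 * 5 * 3       # DAU x searches x restaurant views
orders_per_day = 30_000_000 / 30             # assumes a 30-day month
driver_updates_per_sec = 200_000 / 4         # one update every 4 seconds
status_events_per_day = orders_per_day * 6

# Storage, in decimal units
menu_gb = 500_000 * 50 * 2_000 / 1e9
orders_gb_per_month = 30_000_000 * 3_000 / 1e9
driver_memory_mb = 200_000 * 64 / 1e6
tracking_gb_per_month = 30_000_000 * 100 * 32 / 1e9

print(f"menu QPS ~{per_second(page_loads_per_day):.0f}")
print(f"orders/sec ~{per_second(orders_per_day):.0f}")
print(f"driver updates/sec {driver_updates_per_sec:.0f}")
print(f"status events/sec ~{per_second(status_events_per_day):.0f}")
print(f"menus {menu_gb:.0f} GB, orders {orders_gb_per_month:.0f} GB/month")
print(f"driver locations {driver_memory_mb:.1f} MB, "
      f"tracking {tracking_gb_per_month:.0f} GB/month")
```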

Key Insight

This is a three-sided marketplace (customers, restaurants, drivers) with a complex orchestration problem. The hardest challenge is ETA accuracy. It depends on restaurant prep time (variable, 5-45 minutes), on driver travel time (traffic-dependent), and on timing the pickup so the driver arrives as the food is ready: not 15 minutes early and waiting, not 10 minutes late to cold food.


3. API Design (3 min)

Restaurant Discovery

GET /v1/restaurants?lat=37.77&lng=-122.41&cuisine=italian&sort=recommended&page=1
  Response: { "restaurants": [
    { "id": "rest_123", "name": "Luigi's", "cuisine": "Italian",
      "rating": 4.5, "delivery_fee_cents": 299, "estimated_delivery_min": 35,
      "is_open": true, "distance_km": 1.2 }
  ], "total": 42 }

GET /v1/restaurants/{id}/menu
  Response: { "categories": [
    { "name": "Pasta", "items": [
      { "id": "item_456", "name": "Spaghetti Carbonara", "price_cents": 1499,
        "description": "...", "image_url": "...", "available": true,
        "customizations": [
          { "name": "Extra cheese", "price_cents": 200 },
          { "name": "Gluten-free pasta", "price_cents": 300 }
        ] }
    ] }
  ] }

Order Placement

POST /v1/orders
  Body: {
    "restaurant_id": "rest_123",
    "items": [
      { "item_id": "item_456", "quantity": 2,
        "customizations": ["extra_cheese"] }
    ],
    "delivery_address": { "lat": 37.78, "lng": -122.42, "address": "..." },
    "payment_method_id": "pm_abc",
    "tip_cents": 500,
    "special_instructions": "Ring doorbell"
  }
  Response 201: {
    "order_id": "ord_789",
    "status": "placed",
    "estimated_delivery": "2024-02-22T18:45:00Z",
    "total_cents": 4598,
    "breakdown": { "subtotal": 3398, "delivery_fee": 299, "service_fee": 150,
                   "tax": 251, "tip": 500 }
  }

Real-Time Tracking

// WebSocket connection
WS /v1/orders/{order_id}/track
  Server pushes:
  { "status": "preparing", "driver": null, "eta_minutes": 28 }
  { "status": "ready_for_pickup", "driver": { "name": "Alex", "lat": 37.771, "lng": -122.413 }, "eta_minutes": 15 }
  { "status": "picked_up", "driver": { "lat": 37.775, "lng": -122.418 }, "eta_minutes": 8 }
  { "status": "delivered", "delivered_at": "2024-02-22T18:43:00Z" }

Key Decisions

  • Separate service fee from delivery fee (transparent pricing builds trust)
  • Customizations are additive pricing (not separate SKUs) to keep menu management simple
  • WebSocket for order tracking (not polling) — reduces server load during peak
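
As a rough illustration of additive customization pricing, the sketch below prices a line item as quantity × (base price + customization prices) and then adds the fee components. The names `LineItem`, `line_total_cents`, and `order_breakdown` are hypothetical, and the fees are passed in as given values rather than computed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Sequence


@dataclass(frozen=True)
class LineItem:
    base_price_cents: int
    quantity: int
    customization_prices_cents: Sequence[int] = ()


def line_total_cents(item: LineItem) -> int:
    """Additive pricing: each customization adds to the unit price."""
    unit_price = item.base_price_cents + sum(item.customization_prices_cents)
    return unit_price * item.quantity


def order_breakdown(items: Iterable[LineItem], delivery_fee: int,
                    service_fee: int, tax: int, tip: int) -> Dict[str, int]:
    """Return the fee breakdown plus a total that sums every component."""
    subtotal = sum(line_total_cents(item) for item in items)
    return {
        "subtotal": subtotal,
        "delivery_fee": delivery_fee,
        "service_fee": service_fee,
        "tax": tax,
        "tip": tip,
        "total": subtotal + delivery_fee + service_fee + tax + tip,
    }


# Two Spaghetti Carbonara (1499 each) with extra cheese (+200 each)
carbonara = LineItem(base_price_cents=1499, quantity=2,
                     customization_prices_cents=(200,))
print(order_breakdown([carbonara], delivery_fee=299, service_fee=150,
                      tax=251, tip=500))
```

Keeping all amounts as integer cents avoids floating-point rounding in money calculations.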

4. Data Model (3 min)

Restaurants & Menus (PostgreSQL + Redis cache)

Table: restaurants
  restaurant_id    (PK) | uuid
  name                  | varchar(200)
  cuisine_type          | varchar(50)[]       -- array of cuisines
  lat                   | decimal(9,6)
  lng                   | decimal(9,6)
  rating_avg            | decimal(2,1)
  prep_time_avg_min     | int                 -- historical average
  is_active             | boolean
  operating_hours       | jsonb               -- { "mon": {"open": "11:00", "close": "22:00"} }

Table: menu_items
  item_id          (PK) | uuid
  restaurant_id    (FK) | uuid
  category              | varchar(100)
  name                  | varchar(200)
  description           | text
  price_cents           | int
  image_url             | varchar(500)
  is_available          | boolean             -- toggled by restaurant in real-time
  prep_time_min         | int
  customizations        | jsonb

Orders (PostgreSQL — ACID required)

Table: orders
  order_id         (PK) | uuid
  customer_id      (FK) | uuid
  restaurant_id    (FK) | uuid
  driver_id        (FK) | uuid               -- NULL until assigned
  status                | enum('placed', 'confirmed', 'preparing', 'ready_for_pickup',
                                'picked_up', 'delivered', 'cancelled')
  delivery_lat          | decimal(9,6)
  delivery_lng          | decimal(9,6)
  delivery_address      | text
  subtotal_cents        | int
  delivery_fee_cents    | int
  service_fee_cents     | int
  tax_cents             | int
  tip_cents             | int
  total_cents           | int
  surge_multiplier      | decimal(3,2)
  estimated_delivery_at | timestamp
  placed_at             | timestamp
  delivered_at          | timestamp
  special_instructions  | text

Table: order_items
  order_item_id    (PK) | uuid
  order_id         (FK) | uuid
  item_id          (FK) | uuid
  quantity              | int
  unit_price_cents      | int                 -- snapshot at order time
  customizations        | jsonb

Driver State (Redis — real-time)

GEOADD drivers:available {lng} {lat} {driver_id}

HSET driver:{driver_id}
  status: "available"      // available, en_route_pickup, at_restaurant, delivering
  current_order_id: null
  lat: 37.775
  lng: -122.418
  last_update: 1708632060
  current_capacity: 2      // can carry 2 more orders (for batching)
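
To make the geo lookup concrete, here is an in-memory stand-in for the radius query that the Redis geo index would answer. It is a sketch only: the haversine helper and the `drivers_within` function are illustrative names, and a real deployment would issue the query to Redis rather than scan a dictionary.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drivers_within(drivers: Dict[str, Tuple[float, float]],
                   lat: float, lng: float,
                   radius_km: float) -> List[Tuple[float, str]]:
    """Return (distance_km, driver_id) pairs inside the radius, nearest first."""
    hits = []
    for driver_id, (d_lat, d_lng) in drivers.items():
        distance = haversine_km(lat, lng, d_lat, d_lng)
        if distance <= radius_km:
            hits.append((distance, driver_id))
    return sorted(hits)


available = {
    "driver_1": (37.775, -122.418),   # under 1 km from the restaurant
    "driver_2": (37.900, -122.300),   # well outside a 5 km radius
}
print(drivers_within(available, lat=37.77, lng=-122.41, radius_km=5.0))
```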

Why PostgreSQL + Redis?

  • PostgreSQL: ACID guarantees for orders and payments (financial data must never be lost or corrupted)
  • Redis: real-time driver locations and restaurant availability (ephemeral, frequently updated)
  • Menu data cached in Redis with 5-minute TTL (reduces database load for the most common read pattern)
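
The menu cache follows a cache-aside pattern with a TTL. The sketch below uses an in-process dictionary in place of Redis so that it runs standalone; `TTLCache`, `get_or_load`, and the injected `clock` are illustrative names. Invalidating an entry when a restaurant toggles availability is an assumption, not something stated above.

```python
import time
from typing import Any, Callable, Dict, Tuple

MENU_TTL_SECONDS = 300  # 5-minute TTL, as described above


class TTLCache:
    """Cache-aside helper: serve fresh entries, reload expired or missing ones."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit, still fresh
        value = loader(key)                      # miss or expired: hit the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry early (assumed to run on an availability toggle)."""
        self._entries.pop(key, None)


def load_menu_from_db(restaurant_id: str) -> dict:
    # Placeholder for the PostgreSQL query; returns a stub menu.
    return {"restaurant_id": restaurant_id, "categories": []}


menu_cache = TTLCache(MENU_TTL_SECONDS)
print(menu_cache.get_or_load("rest_123", load_menu_from_db))
```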

5. High-Level Design (12 min)

Architecture

Customer App                        Restaurant App                  Driver App
  │                                   │                               │
  ▼                                   ▼                               ▼
API Gateway (auth, rate limit)
  │
  ├→ Restaurant Service → PostgreSQL + Redis (menus, availability)
  ├→ Search Service → Elasticsearch (restaurant discovery, ranking)
  ├→ Order Service → PostgreSQL (order lifecycle)
  │    → Kafka (order-events topic)
  │      → Driver Assignment Service
  │      → Notification Service
  │      → Analytics Pipeline
  ├→ Payment Service → Stripe (authorization, capture, driver payout)
  ├→ Tracking Service → Redis (driver locations) + WebSocket Server
  ├→ ETA Service → ML model + routing engine
  └→ Pricing Service → surge multiplier + fee calculation

Driver Assignment Flow:
  Order confirmed by restaurant
    → Driver Assignment Service:
      1. Query nearby available drivers (Redis GEOSEARCH; GEORADIUS is deprecated since Redis 6.2)
      2. Score candidates (proximity, current orders, rating)
      3. Offer to best driver (push notification)
      4. Driver accepts/declines (15s timeout)
      5. If declined → next candidate
      6. If accepted → update driver status, notify customer

Order Lifecycle (State Machine)

PLACED → (restaurant tablet notification)
  → CONFIRMED → (kitchen starts preparing)
    → PREPARING → (restaurant marks "ready")
      → READY_FOR_PICKUP → (driver arrives, picks up)
        → PICKED_UP → (driver heading to customer)
          → DELIVERED

Cancellation allowed:
  PLACED → CANCELLED (full refund)
  CONFIRMED → CANCELLED (full refund)
  PREPARING → CANCELLED (partial refund, restaurant compensated)
  READY_FOR_PICKUP and later → cannot cancel (too late)
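
One way to enforce the lifecycle and cancellation rules is an explicit transition table. This is a minimal sketch; `OrderStatus`, `ALLOWED_TRANSITIONS`, and `transition` are illustrative names, and refund handling is left out.

```python
from enum import Enum


class OrderStatus(Enum):
    PLACED = "placed"
    CONFIRMED = "confirmed"
    PREPARING = "preparing"
    READY_FOR_PICKUP = "ready_for_pickup"
    PICKED_UP = "picked_up"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


S = OrderStatus
ALLOWED_TRANSITIONS = {
    S.PLACED: {S.CONFIRMED, S.CANCELLED},
    S.CONFIRMED: {S.PREPARING, S.CANCELLED},
    S.PREPARING: {S.READY_FOR_PICKUP, S.CANCELLED},
    S.READY_FOR_PICKUP: {S.PICKED_UP},      # too late to cancel
    S.PICKED_UP: {S.DELIVERED},
    S.DELIVERED: set(),                     # terminal
    S.CANCELLED: set(),                     # terminal
}


class InvalidTransition(Exception):
    pass


def transition(current: OrderStatus, target: OrderStatus) -> OrderStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value}")
    return target


status = transition(S.PLACED, S.CONFIRMED)
print(status.value)
```

In the Order Service, a check like this would typically run inside the same database transaction that updates the order row, so that concurrent updates cannot skip a state.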

Components

  1. Restaurant Service: Manages restaurant profiles, menus, operating hours. Restaurants update item availability in real-time via tablet app.
  2. Search Service: Elasticsearch-based discovery. Ranks by relevance, distance, rating, delivery time, promotions. Personalized (user’s past orders).
  3. Order Service: Saga orchestrator for the order flow. Coordinates between payment authorization, restaurant confirmation, driver assignment.
  4. Payment Service: Stripe integration. Authorizes payment at order placement, captures on delivery. Splits payout: restaurant gets food cost, platform gets fees, driver gets delivery fee + tip.
  5. Driver Assignment Service: Matches orders to drivers. Accounts for prep time: rather than dispatching at order placement, it times dispatch so the driver reaches the restaurant a few minutes before the food is ready.
  6. Tracking Service: Maintains driver locations in Redis. Pushes updates to customer via WebSocket every 3 seconds during active delivery.
  7. ETA Service: Predicts total delivery time = prep_time (ML model per restaurant) + travel_time (routing engine with live traffic).
  8. Pricing Service: Computes delivery fee (base + distance), service fee, tax, and surge multiplier based on demand in the area.
  9. Notification Service: Push notifications to all three parties at each state transition. SMS fallback for critical updates (order ready, driver arriving).

6. Deep Dives (15 min)

Deep Dive 1: ETA Estimation — The Hardest Problem

Why ETAs are hard: The delivery ETA = prep_time + wait_time_at_restaurant + travel_time. Each component is uncertain.

Restaurant prep time prediction:

Features for ML model (per restaurant):
  - Historical prep times for this restaurant (median, p75, p90)
  - Current order queue depth (how many active orders at this restaurant right now)
  - Time of day / day of week (lunch rush vs. Tuesday 3 PM)
  - Order complexity (number of items, specific slow items like "well-done steak")
  - Restaurant's recent performance (are they running behind today?)

Model: Gradient Boosted Trees (XGBoost)
  Trained on: millions of historical order prep times
  Output: predicted prep_time_minutes + confidence interval
  Updated: daily retraining

Example predictions:
  - McDonald's, 2 items, no rush → 8 min (high confidence)
  - Sushi restaurant, 5 items, Friday 7 PM → 35 min (low confidence, wide interval)

Travel time estimation:

OSRM routing engine + real-time traffic adjustments:
  base_travel_time = osrm.route(restaurant_loc, customer_loc).duration
  traffic_multiplier = traffic_service.get_factor(area, time_of_day)
  adjusted_travel_time = base_travel_time * traffic_multiplier

  Plus: pickup time at restaurant (avg 3 min for driver to park, walk in, get food)
  Plus: dropoff time (avg 2 min for driver to find parking, walk to door)

Coordinating driver dispatch with prep time:

Problem: If we assign a driver immediately when order is placed,
  the driver arrives at the restaurant in 5 min but food takes 25 min.
  → Driver waits 20 min (wasted time, unhappy driver).

Solution: Delayed dispatch
  predicted_prep_time = 25 min
  driver_travel_to_restaurant = 8 min
  optimal_dispatch_time = order_placed + (25 - 8 - 3) = 14 min after order

  Dispatch driver 14 minutes after order is placed.
  Driver arrives in 8 min (minute 22).
  Food ready at minute 25. Driver waits 3 min (acceptable buffer).

  Adjustment: if restaurant marks "ready" earlier than predicted,
  immediately dispatch driver (or reprioritize an already-dispatched one).
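
The delayed-dispatch arithmetic can be written as a small function. This is a sketch under the assumptions of the example above (a fixed 3-minute wait buffer, point estimates rather than confidence intervals); the function names are illustrative.

```python
def dispatch_delay_minutes(predicted_prep_min: float,
                           driver_travel_min: float,
                           wait_buffer_min: float = 3.0) -> float:
    """Minutes to wait after order placement before dispatching a driver.

    Clamped at zero: if the driver needs longer than the prep time,
    dispatch immediately.
    """
    return max(0.0, predicted_prep_min - driver_travel_min - wait_buffer_min)


def dispatch_timeline(predicted_prep_min: float,
                      driver_travel_min: float,
                      wait_buffer_min: float = 3.0) -> dict:
    """Timeline in minutes after order placement."""
    delay = dispatch_delay_minutes(predicted_prep_min, driver_travel_min,
                                   wait_buffer_min)
    arrival = delay + driver_travel_min
    return {
        "dispatch_at": delay,
        "driver_arrives_at": arrival,
        "food_ready_at": predicted_prep_min,
        "driver_wait": max(0.0, predicted_prep_min - arrival),
    }


print(dispatch_timeline(predicted_prep_min=25, driver_travel_min=8))
```

With the example numbers this gives dispatch at minute 14, arrival at minute 22, and a 3-minute wait.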

Continuous ETA updates:

  • Re-compute ETA every 30 seconds during active order
  • If prep is taking longer than predicted → push updated ETA to customer
  • If driver is stuck in traffic → adjust delivery ETA accordingly
  • Never let the displayed ETA slip later by more than 5 minutes without an explanation (“Your restaurant is experiencing high demand”)
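
The explanation rule can be sketched as a small helper that builds the message pushed to the customer. The function name and the message shape are assumptions for illustration; only the 5-minute threshold comes from the rule above.

```python
from typing import Any, Dict

MAX_UNEXPLAINED_SLIP_MIN = 5


def build_eta_update(previous_eta_min: int, new_eta_min: int,
                     reason: str) -> Dict[str, Any]:
    """Build an ETA push message, attaching a reason for large slips."""
    update: Dict[str, Any] = {"eta_minutes": new_eta_min}
    slip = new_eta_min - previous_eta_min
    if slip > MAX_UNEXPLAINED_SLIP_MIN:
        # A later ETA by more than the threshold must carry an explanation.
        update["explanation"] = reason
    return update


print(build_eta_update(28, 36, "Your restaurant is experiencing high demand"))
print(build_eta_update(28, 30, "Your restaurant is experiencing high demand"))
```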

Deep Dive 2: Driver Assignment and Delivery Batching

Single-order assignment:

When an order needs a driver:
  1. Find available drivers within 5km of restaurant (Redis GEOSEARCH, the replacement for the deprecated GEORADIUS)
  2. For each candidate, compute score:
     score = w1 * (1/eta_to_restaurant)          // prefer closer drivers
           + w2 * driver_rating                   // prefer higher-rated
           + w3 * acceptance_rate                  // prefer reliable drivers
           - w4 * active_order_count              // prefer less loaded drivers
  3. Offer to top-scored driver (push notification, 15s timeout)
  4. If declined/timeout → offer to #2, etc.
  5. After 3 declines → expand radius to 10km
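
The scoring and offer cascade above could look roughly like this. The weights are made-up placeholders, the `accepts` callback stands in for the push notification with its 15-second timeout, and all names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical weights; real values would be tuned offline.
W_ETA, W_RATING, W_ACCEPT, W_LOAD = 10.0, 1.0, 2.0, 1.5


@dataclass(frozen=True)
class Candidate:
    driver_id: str
    eta_to_restaurant_min: float
    rating: float              # 0 to 5
    acceptance_rate: float     # 0 to 1
    active_order_count: int


def score(c: Candidate) -> float:
    """Higher is better. ETA is floored to avoid division by zero."""
    return (W_ETA * (1.0 / max(c.eta_to_restaurant_min, 0.5))
            + W_RATING * c.rating
            + W_ACCEPT * c.acceptance_rate
            - W_LOAD * c.active_order_count)


def offer_in_order(candidates: Iterable[Candidate],
                   accepts: Callable[[Candidate], bool],
                   max_offers: int = 3) -> Optional[str]:
    """Offer to the best-scored drivers in turn; None means widen the radius."""
    ranked = sorted(candidates, key=score, reverse=True)
    for candidate in ranked[:max_offers]:
        if accepts(candidate):      # stand-in for push + timeout
            return candidate.driver_id
    return None


pool = [
    Candidate("a", 4.0, 4.8, 0.90, 0),
    Candidate("b", 2.0, 4.5, 0.80, 1),
    Candidate("c", 10.0, 4.9, 0.95, 0),
]
print(offer_in_order(pool, accepts=lambda c: c.driver_id != "b"))
```

A `None` result corresponds to step 5: the caller would repeat the search with the 10km radius.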

Delivery batching (key efficiency optimization):

Problem: Driver picks up from restaurant A, delivers to customer.
  Then picks up from restaurant B (1 block away from A), delivers.
  → Total: 2 trips to same area, inefficient.

Solution: Batch nearby orders to the same driver.
  When a new order comes in for restaurant A:
    Check if any driver is currently heading to (or near) restaurant A
    AND has capacity for another order
    AND the detour for the second delivery adds < 10 min to first customer's ETA.

  If yes: assign both orders to same driver.
  Customer 1's ETA increases by ~5 min, but delivery fee is discounted 20%.
  Driver earns more per hour (two deliveries, one trip).
  Platform saves on driver costs.

Batching algorithm:

For each new order O:
  1. Find drivers within 3km of O's restaurant who:
     - Have status en_route_pickup or at_restaurant for a nearby restaurant
     - Have capacity (current_orders < max_batch_size, typically 2)
  2. For each candidate batch:
     - Compute detour_time for existing customer(s)
     - If detour_time < 10 min AND existing customer's new ETA is acceptable:
       → Batch O with this driver
  3. If no batch found → assign as single order (standard flow)

Constraints:
  - Never batch more than 3 orders per driver (quality degrades)
  - Never add > 10 min to any customer's delivery time
  - Hot food deadline: never delay a "ready" order by > 5 min for batching
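
The batching constraints can be checked with a small predicate. In this sketch the detour and ready-order delay are assumed to be precomputed by the routing engine, and choosing the eligible driver with the smallest detour is an assumption rather than part of the algorithm above. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

HARD_MAX_ORDERS_PER_DRIVER = 3      # never batch more than 3 orders
MAX_ADDED_DELAY_MIN = 10.0          # never add 10+ min to an existing customer
MAX_READY_ORDER_DELAY_MIN = 5.0     # hot food deadline


@dataclass(frozen=True)
class BatchCandidate:
    driver_id: str
    current_orders: int
    detour_min: float                   # delay added to existing customer(s)
    ready_order_delay_min: float = 0.0  # extra wait for an order already ready


def can_batch(c: BatchCandidate, max_batch_size: int = 2) -> bool:
    """True if adding one more order keeps every constraint satisfied."""
    limit = min(max_batch_size, HARD_MAX_ORDERS_PER_DRIVER)
    return (c.current_orders < limit
            and c.detour_min < MAX_ADDED_DELAY_MIN
            and c.ready_order_delay_min <= MAX_READY_ORDER_DELAY_MIN)


def pick_batch(candidates: Iterable[BatchCandidate],
               max_batch_size: int = 2) -> Optional[str]:
    """Return the eligible driver with the smallest detour, or None."""
    eligible = [c for c in candidates if can_batch(c, max_batch_size)]
    if not eligible:
        return None                 # fall back to single-order assignment
    return min(eligible, key=lambda c: c.detour_min).driver_id


nearby = [
    BatchCandidate("d1", current_orders=1, detour_min=12.0),
    BatchCandidate("d2", current_orders=1, detour_min=6.0),
    BatchCandidate("d3", current_orders=1, detour_min=4.0,
                   ready_order_delay_min=7.0),
    BatchCandidate("d4", current_orders=2, detour_min=3.0),
]
print(pick_batch(nearby))
```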

Deep Dive 3: Surge Pricing and Supply-Demand Balancing

Surge pricing mechanics:

For each geographic zone (H3 cell, ~1km radius):
  demand = count(orders in last 10 minutes) * 6  // orders per hour
  supply = count(available drivers in zone)
  capacity = supply * deliveries_per_hour_per_driver (avg 3)  // deliveries per hour
  utilization = demand / capacity

  if utilization > 0.8:  // approaching capacity
    surge_multiplier = min(2.0, 1.0 + 0.5 * (utilization - 0.8) / 0.2)
    // Linearly increases from 1.0x at 80% utilization to 1.5x at 100%,
    // keeps rising past 100%, and is capped at 2.0x (reached at 120%)
  else:
    surge_multiplier = 1.0

  Applies to: delivery fee only (not food price — restaurant sets that)
  Display: "Delivery fee: $4.99 (usually $2.99) — high demand in your area"
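
A runnable version of the surge calculation might look like the following. It is a sketch that puts demand and capacity on the same hourly basis before comparing them. Treating a zone with orders but no available drivers as fully surged is an assumption, and the function name is illustrative.

```python
SURGE_CAP = 2.0
UTILIZATION_THRESHOLD = 0.8


def surge_multiplier(orders_last_10_min: int,
                     available_drivers: int,
                     deliveries_per_driver_hour: float = 3.0) -> float:
    """Delivery-fee multiplier for one zone, between 1.0 and SURGE_CAP."""
    demand_per_hour = orders_last_10_min * 6
    capacity_per_hour = available_drivers * deliveries_per_driver_hour
    if capacity_per_hour <= 0:
        # Assumption: no supply at all means maximum surge if anyone is ordering.
        return SURGE_CAP if orders_last_10_min > 0 else 1.0

    utilization = demand_per_hour / capacity_per_hour
    if utilization <= UTILIZATION_THRESHOLD:
        return 1.0
    # Linear ramp: 1.0x at 80% utilization, 1.5x at 100%, capped at 2.0x.
    ramp = 1.0 + 0.5 * (utilization - UTILIZATION_THRESHOLD) / 0.2
    return min(SURGE_CAP, ramp)


# 10 drivers handle about 30 deliveries/hour
for orders in (3, 4, 5, 10):
    print(orders, round(surge_multiplier(orders, available_drivers=10), 2))
```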

Why surge pricing works:

  1. Demand reduction: Price-sensitive customers wait for surge to end, reducing overload
  2. Supply increase: Higher delivery fee attracts more drivers to the surge zone
  3. Fairness signal: Customers willing to pay more during peak get faster service

Anti-gaming measures:

  • Surge calculated at order placement time and locked (doesn’t change mid-order)
  • Minimum surge duration: 5 minutes (prevents rapid oscillation)
  • Surge caps enforced per city (regulatory compliance in some areas)
  • Driver incentives independent of customer surge: drivers get bonus for operating in high-demand zones regardless of whether customer pays surge

Supply repositioning:

  • When demand is predicted to spike (events, weather, mealtimes), proactively notify drivers: “Earn 1.5x in Downtown area starting at 6 PM”
  • ML model predicts demand 30 minutes ahead using: time of day, day of week, weather, local events, historical patterns
  • Pre-position drivers by offering “quest” bonuses: “Complete 3 deliveries in Financial District between 6-8 PM, earn $15 bonus”

7. Extensions (2 min)

  • Restaurant partner dashboard: Real-time analytics for restaurant owners — order volume, popular items, average prep time, customer ratings. Help them optimize menu and staffing.
  • Group ordering: Multiple people add items to a shared cart from the same restaurant. Split payment at checkout. Requires cart locking and conflict resolution for simultaneous edits.
  • Subscription model (DashPass): Monthly subscription ($9.99) for free delivery on orders over $12. Requires tracking subscriber status, computing savings, and managing renewals/cancellations.
  • Kitchen display system (KDS): Replace paper tickets with a digital display in the kitchen. Shows incoming orders, prioritization, and prep time targets. Integrates with order status updates.
  • Contactless delivery with photo proof: Driver takes photo of food at the door, GPS-stamped. Reduces “food not delivered” disputes. Photo stored for 7 days, linked to order.