1. Requirements & Scope (5 min)

Functional Requirements

  1. Users can set price alerts for a specific route (origin, destination, date range, cabin class) with a target price or “notify on any drop”
  2. System continuously monitors flight prices by polling airline APIs / aggregators and detects meaningful price changes
  3. Send notifications (email, push) when a tracked route’s price drops below the user’s threshold or changes significantly
  4. Show price history and trends for a route (price graph over last 30-90 days)
  5. Support alert management: list, edit, pause, delete alerts; set expiration (auto-delete after travel date)

Non-Functional Requirements

  • Availability: 99.9% for alert creation/management; price monitoring can tolerate brief outages (users won’t notice a 5-minute gap in polling)
  • Latency: Alert creation < 200ms. Notification delivery within 15 minutes of a price change (not real-time — flights don’t change by the second).
  • Consistency: A user should never miss a significant price drop. False positives (alerting on a stale price) are worse than a 15-minute delay.
  • Scale: 100M active alerts across 50M users. 500K unique routes being monitored. ~25M price checks/day (≈1M/hour).
  • Cost Efficiency: Airline API calls are expensive (rate-limited, sometimes paid). Minimize redundant polling.

2. Estimation (3 min)

Alerts

  • 50M users, average 2 active alerts each = 100M active alerts
  • Alert creation/deletion: ~1M/day (relatively low write volume)

Price Monitoring

  • 500K unique routes being monitored
  • Each route checked every 30 minutes = 500K × 48 checks/day = 24M price checks/day ≈ 280 checks/sec
  • Popular routes (JFK→LAX) shared by 100K+ alerts; long-tail routes shared by < 10 alerts
  • Key optimization: check routes, not individual alerts. 500K route checks serve 100M alerts.

Notifications

  • Assume 5% of checks find a meaningful price change = 1.2M price changes/day
  • Each change triggers alerts for all subscribed users. Average 200 users per route change = 240M notifications/day ≈ 2,800/sec
  • Peak: 3-5x average during fare sales = ~10K notifications/sec

Storage

  • Alert records: 100M × 500 bytes = 50GB
  • Price history: 500K routes × 365 days × 48 data points × 50 bytes ≈ 440GB/year
  • Route metadata/cache: 500K × 2KB = 1GB

3. API Design (3 min)

Alert Management

POST /alerts
  Body: {
    "origin": "JFK",
    "destination": "LAX",
    "departure_date_start": "2026-06-01",
    "departure_date_end": "2026-06-07",
    "return_date_start": "2026-06-08",
    "return_date_end": "2026-06-14",
    "cabin_class": "economy",
    "max_price": 250,                    // null = notify on any significant drop
    "notification_channels": ["email", "push"],
    "passengers": 1
  }
  Response 201: { "alert_id": "alt_abc123", "current_price": 312, ... }

GET /alerts?user_id=u_123&status=active
  Response: [list of alert objects with current prices]

DELETE /alerts/{alert_id}

PATCH /alerts/{alert_id}
  Body: { "max_price": 200, "paused": false }

Price Data

GET /routes/{origin}/{destination}/prices
  Query: departure_start, departure_end, cabin_class, lookback_days=30
  Response: {
    "current_price": 312,
    "price_history": [
      {"date": "2026-02-20", "min_price": 318, "median_price": 345},
      {"date": "2026-02-21", "min_price": 312, "median_price": 340},
      ...
    ],
    "trend": "declining",
    "typical_range": {"low": 280, "high": 420}
  }

Key Decisions

  • Alerts are defined on route+date_range, not specific flights. The system finds the cheapest option matching the criteria.
  • Date ranges (not exact dates) because most travelers have flexibility — this dramatically increases alert match rates.
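
The route_key that groups alerts onto a shared monitored route can be derived deterministically from these fields. A minimal sketch — the month-bucket granularity here is an assumption; any scheme works as long as overlapping alerts map to the same key:

```python
# Hypothetical derivation of the shared route_key from an alert's fields.
# Bucketing by departure month is an assumed granularity, not a requirement.
def route_key(origin: str, destination: str, departure_start: str, cabin_class: str) -> str:
    month_bucket = departure_start[:7]          # "2026-06-01" -> "2026-06"
    return f"{origin}-{destination}-{month_bucket}-{cabin_class}"

key = route_key("JFK", "LAX", "2026-06-01", "economy")
# key == "JFK-LAX-2026-06-economy" (matches the route_key used in the data model)
```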

4. Data Model (3 min)

Alerts (PostgreSQL)

Table: alerts
  alert_id        (PK) | uuid
  user_id         (FK) | uuid (indexed)
  route_key             | varchar(50) (indexed) -- "JFK-LAX-2026-06-economy"
  origin                | char(3)
  destination           | char(3)
  departure_start       | date
  departure_end         | date
  return_start          | date (nullable)
  return_end            | date (nullable)
  cabin_class           | enum('economy', 'premium_economy', 'business', 'first')
  max_price             | decimal (nullable)
  passengers            | int
  status                | enum('active', 'paused', 'expired', 'triggered')
  notification_channels | jsonb
  created_at            | timestamp
  expires_at            | timestamp -- auto-set to departure_end (alert is useless after the last possible departure)

Monitored Routes (PostgreSQL + Redis cache)

Table: monitored_routes
  route_key       (PK) | varchar(50)
  origin                | char(3)
  destination           | char(3)
  date_range            | daterange
  cabin_class           | varchar(20)
  alert_count           | int -- denormalized, number of active alerts
  last_checked_at       | timestamp
  last_price            | decimal
  check_interval_min    | int -- 15 for hot routes, up to 360 for cold long-tail
  next_check_at         | timestamp (indexed)

Price History (TimescaleDB or ClickHouse)

Table: price_history
  route_key             | varchar(50)
  checked_at            | timestamp
  min_price             | decimal
  median_price          | decimal
  source                | varchar(50) -- airline, aggregator name
  flight_options        | jsonb -- top 3-5 cheapest options with details

Why These Choices

  • PostgreSQL for alerts and routes: strong consistency, complex queries (find all alerts for a route), transactional status updates
  • TimescaleDB for price history: time-series optimized, automatic partitioning by time, efficient range queries for price graphs
  • Redis for route check scheduling: sorted set with next_check_at as score; popping the next M due routes is O(log N + M)

5. High-Level Design (12 min)

Price Monitoring Pipeline

Scheduler (Redis Sorted Set: next_check_at)
  → every second: ZRANGEBYSCORE routes 0 {now} LIMIT 0 100
  → dispatch route checks to worker pool

Price Check Workers (horizontally scaled)
  → For each route:
    1. Call airline APIs / aggregator APIs (Google Flights, Skyscanner, etc.)
    2. Parse response → extract min price for the route criteria
    3. Compare with last_price in DB
    4. If significant change detected:
       → Write new price to price_history
       → Update last_price in monitored_routes
       → Publish PriceChangeEvent to Kafka
    5. Update next_check_at in scheduler
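
The scheduler's pop-due-routes step can be sketched in Python, with a heap of (next_check_at, route_key) pairs standing in for the Redis sorted set:

```python
import heapq

# Sketch of the scheduler's "pop due routes" step. A min-heap of
# (next_check_at, route_key) pairs emulates the Redis sorted set.
def pop_due_routes(schedule: list, now: float, limit: int = 100) -> list:
    """Remove and return up to `limit` route keys with next_check_at <= now."""
    due = []
    while schedule and len(due) < limit and schedule[0][0] <= now:
        _, route_key = heapq.heappop(schedule)
        due.append(route_key)
    return due

schedule = [(100.0, "JFK-LAX-2026-06-economy"), (900.0, "SFO-ORD-2026-07-economy")]
heapq.heapify(schedule)
due = pop_due_routes(schedule, now=500.0)   # only the first route is due yet
```

After a worker finishes a check, it re-inserts the route with its new next_check_at, which is the "Update next_check_at in scheduler" step above.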

Kafka: PriceChangeEvent
  → Alert Matcher Service
    → Query: SELECT * FROM alerts WHERE route_key = ? AND status = 'active'
    → For each matching alert:
       → If new_price <= max_price (or significant drop for "any drop" alerts):
         → Enqueue notification job

Notification Service
  → Read from notification queue
  → Deduplicate (don't send same user same alert twice in 1 hour)
  → Render email/push template with price details
  → Send via email (SES) / push (FCM/APNs)
  → Update alert status if appropriate (mark as triggered)
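
The 1-hour deduplication step can be sketched with a TTL-style store; here a plain dict of expiry timestamps stands in for a Redis SET ... NX EX 3600:

```python
# Sketch of notification dedup. A dict of expiry timestamps stands in for a
# Redis key with a TTL; in production this would be SET key NX EX 3600.
class Deduper:
    def __init__(self, window_sec: float = 3600):
        self.window = window_sec
        self.sent = {}   # dedup key -> expiry timestamp

    def should_send(self, user_id: str, alert_id: str, now: float) -> bool:
        key = f"{user_id}:{alert_id}"
        expiry = self.sent.get(key)
        if expiry is not None and expiry > now:
            return False                    # already notified inside the window
        self.sent[key] = now + self.window
        return True

d = Deduper()
assert d.should_send("u_123", "alt_abc123", now=0)       # first send goes out
assert not d.should_send("u_123", "alt_abc123", now=60)  # suppressed
assert d.should_send("u_123", "alt_abc123", now=4000)    # window expired
```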

Components

  1. Route Scheduler: Redis sorted set of routes ordered by next_check_at. A cron-like process pops due routes and dispatches them.
  2. Price Check Workers: Stateless workers that call external APIs. Auto-scale based on queue depth. Handle rate limits, retries, API key rotation.
  3. Route Consolidator: Background job that merges overlapping date ranges across alerts into a minimal set of monitored routes. Runs on alert creation/deletion.
  4. Alert Matcher: On price change, finds all alerts affected. Uses indexed route_key lookup.
  5. Notification Service: Manages delivery across channels. Handles batching (don’t send 10 alerts at once — batch into a digest), deduplication, and delivery tracking.
  6. Price History Service: Serves price graphs and trend analysis. Queries TimescaleDB with pre-computed daily aggregates.
  7. API Gateway: Handles alert CRUD, price queries, user authentication.

Route Consolidation Example

Alert 1: JFK→LAX, Jun 1-7, economy
Alert 2: JFK→LAX, Jun 3-10, economy
Alert 3: JFK→LAX, Jun 1-7, business

Monitored routes:
  Route A: JFK→LAX, Jun 1-10, economy (covers alerts 1 + 2)
  Route B: JFK→LAX, Jun 1-7, business (covers alert 3)

This reduces 100M alerts to ~500K monitored routes — a 200x reduction in API calls.
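
The consolidator's core operation is a standard interval merge over departure ranges for alerts sharing origin/destination/cabin. A minimal sketch:

```python
from datetime import date

# Sketch of the Route Consolidator's core step: merge overlapping departure
# ranges (for alerts sharing origin/destination/cabin) into monitored routes.
def merge_ranges(ranges):
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:          # overlaps previous range
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

alerts = [(date(2026, 6, 1), date(2026, 6, 7)),    # Alert 1
          (date(2026, 6, 3), date(2026, 6, 10))]   # Alert 2
merged = merge_ranges(alerts)   # -> single Jun 1-10 monitored route
```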


6. Deep Dives (15 min)

Deep Dive 1: Efficient Price Polling and Cost Optimization

The core challenge: Airline APIs are rate-limited (e.g., 100 requests/sec per API key) and sometimes paid ($0.01-$0.10 per query). Naively checking 500K routes every 30 minutes is ~24M queries/day; even at $0.01 per query that is ~$240K/day in API fees alone.

Adaptive polling frequency:

Priority tiers based on:
  1. Alert count: routes with more alerts are more valuable
  2. Travel date proximity: flights departing in 3 days change price more often than 6 months out
  3. Historical volatility: routes with frequent price swings get checked more often
  4. Time of day: airlines update prices more during business hours

Tier 1 (hot): >1000 alerts, travel <7 days → check every 15 min
Tier 2 (warm): >100 alerts, travel <30 days → check every 30 min
Tier 3 (cool): >10 alerts, travel <90 days → check every 2 hours
Tier 4 (cold): everything else (few alerts, far-out travel dates) → check every 6 hours
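
The tier table maps directly to a small function returning the check interval in minutes; anything outside the first three tiers defaults to cold:

```python
# Sketch of tier assignment from the table above. Thresholds come straight
# from the tiers; routes matching no hot/warm/cool rule default to cold.
def check_interval_min(alert_count: int, days_to_travel: int) -> int:
    if alert_count > 1000 and days_to_travel < 7:
        return 15      # Tier 1 (hot)
    if alert_count > 100 and days_to_travel < 30:
        return 30      # Tier 2 (warm)
    if alert_count > 10 and days_to_travel < 90:
        return 120     # Tier 3 (cool)
    return 360         # Tier 4 (cold)
```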

Staggering and jittering:

  • Don’t check all Tier 1 routes at exactly :00, :15, :30, :45. Add random jitter (0-5 min) to spread load evenly.
  • Round-robin across API providers to stay under each provider’s rate limit.
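
Jitter is a one-liner when computing a route's next check time; the 0-5 minute window matches the bullet above:

```python
import random

# Sketch of jittered scheduling: spread checks over a random 0-5 min window
# instead of firing every route at exactly :00/:15/:30/:45.
def jittered_next_check(now: float, interval_sec: float,
                        max_jitter_sec: float = 300) -> float:
    return now + interval_sec + random.uniform(0, max_jitter_sec)

t = jittered_next_check(0.0, 900.0)   # a 15-min route lands somewhere in [900, 1200]
```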

Caching and deduplication:

  • If two routes overlap (JFK→LAX Jun 1-7 and JFK→LAX Jun 3-10), a single API call for JFK→LAX Jun 1-10 covers both. The Route Consolidator merges these.
  • Cache API responses for 5 minutes. If a route check returns the same results, skip downstream processing.

Fallback data sources:

  • Primary: airline direct APIs (most accurate)
  • Secondary: aggregator APIs (Skyscanner, Kiwi — broader coverage, slightly delayed)
  • Tertiary: scraping (last resort, fragile, use headless browser)
  • Cross-validate prices across sources to catch stale data

Deep Dive 2: Change Detection and Alert Triggering

What counts as a “significant” price change?

Naive approach: alert if price < threshold. Problem: prices fluctuate by $5-10 constantly. Users get spammed.

Smart change detection:

Trigger notification if ANY of:
  1. Price drops below user's max_price (explicit threshold)
  2. Price drops > 10% from the 7-day moving average
  3. Price reaches the lowest point in the last 30 days
  4. Price drops > $50 from previous check (absolute threshold)

Suppress notification if:
  1. Same alert was notified in the last 4 hours (debounce)
  2. Price bounced back up within 30 minutes (transient glitch)
  3. Only 1 seat at the low price (likely a data artifact)
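
The four trigger rules can be expressed as a single predicate. A sketch, assuming a non-empty 30-day price history (newest last); the suppression/debounce rules would be applied separately before sending:

```python
# Sketch of the trigger rules above as one predicate. Thresholds (10% below
# the 7-day average, 30-day low, $50 absolute drop) come from the list;
# assumes history_30d is non-empty, newest price last.
def should_trigger(new_price: float, prev_price: float,
                   history_30d: list, max_price=None) -> bool:
    if max_price is not None and new_price <= max_price:
        return True                            # rule 1: explicit user threshold
    last_7 = history_30d[-7:]
    if new_price < 0.9 * (sum(last_7) / len(last_7)):
        return True                            # rule 2: >10% below 7-day average
    if new_price <= min(history_30d):
        return True                            # rule 3: new 30-day low
    if prev_price - new_price > 50:
        return True                            # rule 4: absolute $50 drop
    return False
```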

Confirmation check: When a price drop is detected, re-check the price after 5 minutes before sending notifications. This catches transient pricing errors and “ghost fares” that disappear immediately.

Batching notifications:

If a user has 5 alerts and 3 trigger in the same hour:
  → Don't send 3 separate emails
  → Batch into a single digest email: "3 of your watched flights dropped in price"
  → Include a CTA to view all alerts in the app

Deep Dive 3: Scaling the Alert Matching

The fan-out problem: When a popular route (JFK→LAX) drops in price, 100K+ alerts may match. We need to find and notify all of them quickly.

Approach 1: Database query (simple, works to ~10M alerts)

SELECT alert_id, user_id, max_price, notification_channels
FROM alerts
WHERE route_key = 'JFK-LAX-2026-06-economy'
  AND status = 'active'
  AND (max_price IS NULL OR max_price >= 199.00);

With an index on (route_key, status), this query returns 100K rows in ~50ms.

Approach 2: Pre-built alert index in Redis (for 100M+ alerts)

Key: alerts:route:JFK-LAX-2026-06-economy
Type: Sorted Set
Score: max_price (+inf for "any drop" alerts, so every price change matches)
Members: alert_id

On price change to $199:
  ZRANGEBYSCORE key 199 +inf → all alerts with max_price >= $199

This is O(log N + M) where M is matching alerts — much faster for large cardinalities.
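
The sorted-set lookup can be sketched in pure Python, with a list of (score, alert_id) pairs standing in for the Redis structure. Note that "any drop" alerts need score +inf (not 0), otherwise the range query [new_price, +inf] would exclude them:

```python
import math

# Sketch of the sorted-set match. A list of (score, alert_id) pairs emulates
# the Redis key; "any drop" alerts carry score +inf so they always match.
def match_alerts(index, new_price: float):
    # Equivalent of: ZRANGEBYSCORE key {new_price} +inf
    return [alert_id for score, alert_id in index if score >= new_price]

index = [(250.0, "alt_a"), (180.0, "alt_b"), (math.inf, "alt_anydrop")]
hits = match_alerts(index, 199.0)   # alt_a and alt_anydrop match; alt_b does not
```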

Processing 100K matched alerts:

  • Don’t process sequentially. Publish to a notification queue (SQS/Kafka) with high parallelism.
  • 100K notifications at 5ms each (template rendering + API call) = 500 seconds sequentially. With 100 workers = 5 seconds.
  • Use batch APIs where possible (SES supports 50 emails per batch call).
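
The fan-out arithmetic above reduces to simple batching. A sketch, assuming SES-style bulk sends of up to 50 recipients per call:

```python
# Sketch of fanning matched alerts into fixed-size batches for the worker
# pool; batch size 50 matches the SES bulk-send limit mentioned above.
def chunk(items: list, size: int = 50) -> list:
    return [items[i:i + size] for i in range(0, len(items), size)]

batches = chunk(list(range(100_000)))
# 100,000 matched alerts -> 2,000 batch calls, spread across the worker pool
```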

7. Extensions (2 min)

  • Price prediction: Use historical data to train a model that predicts whether prices will go up or down. Show “buy now” vs “wait” recommendations with confidence scores.
  • Flexible date search: Instead of a fixed date range, alert users about the cheapest day to fly in a given month. Requires checking many more date combinations — use sampling.
  • Multi-leg trips: Support complex itineraries (JFK→LAX→SFO→JFK). Each leg is independently monitored but the alert triggers on total trip price.
  • Fare class tracking: Track not just price but fare class (refundable, basic economy, etc.). Alert when a better fare class becomes available at the same price.
  • Social proof: “23,000 people are tracking this route” or “prices usually drop 3 weeks before departure for this route” — aggregated insights from the alert data.