1. Requirements & Scope (5 min)
Functional Requirements
- Users can set price alerts for a specific route (origin, destination, date range, cabin class) with a target price or “notify on any drop”
- System continuously monitors flight prices by polling airline APIs / aggregators and detects meaningful price changes
- Send notifications (email, push) when a tracked route’s price drops below the user’s threshold or changes significantly
- Show price history and trends for a route (price graph over last 30-90 days)
- Support alert management: list, edit, pause, delete alerts; set expiration (auto-delete after travel date)
Non-Functional Requirements
- Availability: 99.9% for alert creation/management; price monitoring can tolerate brief outages (users won’t notice a 5-minute gap in polling)
- Latency: Alert creation < 200ms. Notification delivery within 15 minutes of a price change (not real-time — flights don’t change by the second).
- Consistency: A user should never miss a significant price drop. False positives (alerting on a stale price) are worse than a 15-minute delay.
- Scale: 100M active alerts across 50M users. 500K unique routes being monitored. 10M price checks/hour.
- Cost Efficiency: Airline API calls are expensive (rate-limited, sometimes paid). Minimize redundant polling.
2. Estimation (3 min)
Alerts
- 50M users, average 2 active alerts each = 100M active alerts
- Alert creation/deletion: ~1M/day (relatively low write volume)
Price Monitoring
- 500K unique routes being monitored
- Each route checked every 30 minutes = 500K × 48 checks/day = 24M price checks/day ≈ 280 checks/sec
- Popular routes (JFK→LAX) shared by 100K+ alerts; long-tail routes shared by < 10 alerts
- Key optimization: check routes, not individual alerts. 500K route checks serve 100M alerts.
Notifications
- Assume 5% of checks find a meaningful price change = 1.2M price changes/day
- Each change triggers alerts for all subscribed users. Average 200 users per route change = 240M notifications/day ≈ 2,800/sec
- Peak: 3-5x average during fare sales = ~10K notifications/sec
Storage
- Alert records: 100M × 500 bytes = 50GB
- Price history: 500K routes × 365 days × 48 data points × 50 bytes = 4.4TB/year
- Route metadata/cache: 500K × 2KB = 1GB
3. API Design (3 min)
Alert Management
POST /alerts
Body: {
"origin": "JFK",
"destination": "LAX",
"departure_date_start": "2026-06-01",
"departure_date_end": "2026-06-07",
"return_date_start": "2026-06-08",
"return_date_end": "2026-06-14",
"cabin_class": "economy",
"max_price": 250, // null = notify on any significant drop
"notification_channels": ["email", "push"],
"passengers": 1
}
Response 201: { "alert_id": "alt_abc123", "current_price": 312, ... }
GET /alerts?user_id=u_123&status=active
Response: [list of alert objects with current prices]
DELETE /alerts/{alert_id}
PATCH /alerts/{alert_id}
Body: { "max_price": 200, "paused": false }
Price Data
GET /routes/{origin}/{destination}/prices
Query: departure_start, departure_end, cabin_class, lookback_days=30
Response: {
"current_price": 312,
"price_history": [
{"date": "2026-02-20", "min_price": 318, "median_price": 345},
{"date": "2026-02-21", "min_price": 312, "median_price": 340},
...
],
"trend": "declining",
"typical_range": {"low": 280, "high": 420}
}
Key Decisions
- Alerts are defined on route+date_range, not specific flights. The system finds the cheapest option matching the criteria.
- Date ranges (not exact dates) because most travelers have flexibility — this dramatically increases alert match rates.
4. Data Model (3 min)
Alerts (PostgreSQL)
Table: alerts
alert_id (PK) | uuid
user_id (FK) | uuid (indexed)
route_key | varchar(50) (indexed) -- "JFK-LAX-2026-06-economy"
origin | char(3)
destination | char(3)
departure_start | date
departure_end | date
return_start | date (nullable)
return_end | date (nullable)
cabin_class | enum('economy', 'premium_economy', 'business', 'first')
max_price | decimal (nullable)
passengers | int
status | enum('active', 'paused', 'expired', 'triggered')
notification_channels | jsonb
created_at | timestamp
expires_at | timestamp -- auto-set to departure_start
Monitored Routes (PostgreSQL + Redis cache)
Table: monitored_routes
route_key (PK) | varchar(50)
origin | char(3)
destination | char(3)
date_range | daterange
cabin_class | varchar(20)
alert_count | int -- denormalized, number of active alerts
last_checked_at | timestamp
last_price | decimal
check_interval_min | int -- 15 for popular, 60 for long-tail
next_check_at | timestamp (indexed)
Price History (TimescaleDB or ClickHouse)
Table: price_history
route_key | varchar(50)
checked_at | timestamp
min_price | decimal
median_price | decimal
source | varchar(50) -- airline, aggregator name
flight_options | jsonb -- top 3-5 cheapest options with details
Why These Choices
- PostgreSQL for alerts and routes: strong consistency, complex queries (find all alerts for a route), transactional status updates
- TimescaleDB for price history: time-series optimized, automatic partitioning by time, efficient range queries for price graphs
- Redis for route check scheduling: sorted set with next_check_at as score — O(log N) to get the next routes to check
5. High-Level Design (12 min)
Price Monitoring Pipeline
Scheduler (Redis Sorted Set: next_check_at)
→ every second: ZRANGEBYSCORE routes 0 {now} LIMIT 100
→ dispatch route checks to worker pool
Price Check Workers (horizontally scaled)
→ For each route:
1. Call airline APIs / aggregator APIs (Google Flights, Skyscanner, etc.)
2. Parse response → extract min price for the route criteria
3. Compare with last_price in DB
4. If significant change detected:
→ Write new price to price_history
→ Update last_price in monitored_routes
→ Publish PriceChangeEvent to Kafka
5. Update next_check_at in scheduler
Kafka: PriceChangeEvent
→ Alert Matcher Service
→ Query: SELECT * FROM alerts WHERE route_key = ? AND status = 'active'
→ For each matching alert:
→ If new_price <= max_price (or significant drop for "any drop" alerts):
→ Enqueue notification job
Notification Service
→ Read from notification queue
→ Deduplicate (don't send same user same alert twice in 1 hour)
→ Render email/push template with price details
→ Send via email (SES) / push (FCM/APNs)
→ Update alert status if appropriate (mark as triggered)
Components
- Route Scheduler: Redis sorted set of routes ordered by next_check_at. A cron-like process pops due routes and dispatches them.
- Price Check Workers: Stateless workers that call external APIs. Auto-scale based on queue depth. Handle rate limits, retries, API key rotation.
- Route Consolidator: Background job that merges overlapping date ranges across alerts into a minimal set of monitored routes. Runs on alert creation/deletion.
- Alert Matcher: On price change, finds all alerts affected. Uses indexed route_key lookup.
- Notification Service: Manages delivery across channels. Handles batching (don’t send 10 alerts at once — batch into a digest), deduplication, and delivery tracking.
- Price History Service: Serves price graphs and trend analysis. Queries TimescaleDB with pre-computed daily aggregates.
- API Gateway: Handles alert CRUD, price queries, user authentication.
Route Consolidation Example
Alert 1: JFK→LAX, Jun 1-7, economy
Alert 2: JFK→LAX, Jun 3-10, economy
Alert 3: JFK→LAX, Jun 1-7, business
Monitored routes:
Route A: JFK→LAX, Jun 1-10, economy (covers alerts 1 + 2)
Route B: JFK→LAX, Jun 1-7, business (covers alert 3)
This reduces 100M alerts to ~500K monitored routes — a 200x reduction in API calls.
6. Deep Dives (15 min)
Deep Dive 1: Efficient Price Polling and Cost Optimization
The core challenge: Airline APIs are rate-limited (e.g., 100 requests/sec per API key) and sometimes paid ($0.01-$0.10 per query). Naively checking 500K routes every 30 minutes costs ~$500K/year in API fees alone.
Adaptive polling frequency:
Priority tiers based on:
1. Alert count: routes with more alerts are more valuable
2. Travel date proximity: flights departing in 3 days change price more often than 6 months out
3. Historical volatility: routes with frequent price swings get checked more often
4. Time of day: airlines update prices more during business hours
Tier 1 (hot): >1000 alerts, travel <7 days → check every 15 min
Tier 2 (warm): >100 alerts, travel <30 days → check every 30 min
Tier 3 (cool): >10 alerts, travel <90 days → check every 2 hours
Tier 4 (cold): <10 alerts, travel >90 days → check every 6 hours
Staggering and jittering:
- Don’t check all Tier 1 routes at exactly :00, :15, :30, :45. Add random jitter (0-5 min) to spread load evenly.
- Round-robin across API providers to stay under each provider’s rate limit.
Caching and deduplication:
- If two routes overlap (JFK→LAX Jun 1-7 and JFK→LAX Jun 3-10), a single API call for JFK→LAX Jun 1-10 covers both. The Route Consolidator merges these.
- Cache API responses for 5 minutes. If a route check returns the same results, skip downstream processing.
Fallback data sources:
- Primary: airline direct APIs (most accurate)
- Secondary: aggregator APIs (Skyscanner, Kiwi — broader coverage, slightly delayed)
- Tertiary: scraping (last resort, fragile, use headless browser)
- Cross-validate prices across sources to catch stale data
Deep Dive 2: Change Detection and Alert Triggering
What counts as a “significant” price change?
Naive approach: alert if price < threshold. Problem: prices fluctuate by $5-10 constantly. Users get spammed.
Smart change detection:
Trigger notification if ANY of:
1. Price drops below user's max_price (explicit threshold)
2. Price drops > 10% from the 7-day moving average
3. Price reaches the lowest point in the last 30 days
4. Price drops > $50 from previous check (absolute threshold)
Suppress notification if:
1. Same alert was notified in the last 4 hours (debounce)
2. Price bounced back up within 30 minutes (transient glitch)
3. Only 1 seat at the low price (likely a data artifact)
Confirmation check: When a price drop is detected, re-check the price after 5 minutes before sending notifications. This catches transient pricing errors and “ghost fares” that disappear immediately.
Batching notifications:
If a user has 5 alerts and 3 trigger in the same hour:
→ Don't send 3 separate emails
→ Batch into a single digest email: "3 of your watched flights dropped in price"
→ Include a CTA to view all alerts in the app
Deep Dive 3: Scaling the Alert Matching
The fan-out problem: When a popular route (JFK→LAX) drops in price, 100K+ alerts may match. We need to find and notify all of them quickly.
Approach 1: Database query (simple, works to ~10M alerts)
SELECT alert_id, user_id, max_price, notification_channels
FROM alerts
WHERE route_key = 'JFK-LAX-2026-06-economy'
AND status = 'active'
AND (max_price IS NULL OR max_price >= 199.00);
With an index on (route_key, status), this query returns 100K rows in ~50ms.
Approach 2: Pre-built alert index in Redis (for 100M+ alerts)
Key: alerts:route:JFK-LAX-2026-06-economy
Type: Sorted Set
Score: max_price (or 0 for "any drop" alerts)
Members: alert_id
On price change to $199:
ZRANGEBYSCORE key 199 +inf → all alerts with max_price >= $199
This is O(log N + M) where M is matching alerts — much faster for large cardinalities.
Processing 100K matched alerts:
- Don’t process sequentially. Publish to a notification queue (SQS/Kafka) with high parallelism.
- 100K notifications at 5ms each (template rendering + API call) = 500 seconds sequentially. With 100 workers = 5 seconds.
- Use batch APIs where possible (SES supports 50 emails per batch call).
7. Extensions (2 min)
- Price prediction: Use historical data to train a model that predicts whether prices will go up or down. Show “buy now” vs “wait” recommendations with confidence scores.
- Flexible date search: Instead of a fixed date range, alert users about the cheapest day to fly in a given month. Requires checking many more date combinations — use sampling.
- Multi-leg trips: Support complex itineraries (JFK→LAX→SFO→JFK). Each leg is independently monitored but the alert triggers on total trip price.
- Fare class tracking: Track not just price but fare class (refundable, basic economy, etc.). Alert when a better fare class becomes available at the same price.
- Social proof: “23,000 people are tracking this route” or “prices usually drop 3 weeks before departure for this route” — aggregated insights from the alert data.