You’re building a notification bell. The product team wants it to feel “real-time.” You reach for WebSockets because that’s what every blog post tells you. Six months later, you’re debugging a connection manager that handles reconnections, heartbeats, load balancer stickiness, and auth token refresh - all for a feature where a 5-second delay would have been perfectly fine.
The problem isn’t picking the wrong tool. It’s picking a tool before understanding the tradeoffs.
Short Polling
The simplest approach. Your client asks the server for updates on a fixed interval.
```javascript
setInterval(async () => {
  const res = await fetch('/api/notifications');
  const data = await res.json();
  updateUI(data);
}, 5000); // every 5 seconds
```
The server processes each request independently. No state, no open connections, no special infrastructure.
What happens under the hood:
- Client sends HTTP request
- Server checks for new data
- Server responds immediately (even if nothing changed)
- Client waits, then repeats
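On the server, each request stands alone. A minimal sketch of the handler logic - the in-memory `notifications` array and the `since` cutoff are assumptions for illustration; in production this would be a database query:

```javascript
// Hypothetical in-memory store standing in for a database table.
const notifications = [
  { id: 1, userId: 'u1', text: 'Build passed', ts: 100 },
  { id: 2, userId: 'u1', text: 'New comment', ts: 200 },
];

// Stateless handler logic: return everything newer than the client's
// last poll. Most calls return an empty array - that emptiness is the
// waste short polling pays for.
function getNotificationsSince(userId, since) {
  return notifications.filter(n => n.userId === userId && n.ts > since);
}
```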
When it works well:
- Dashboards that refresh every 30-60 seconds
- Checking the status of a background job
- Weather or stock tickers with acceptable staleness
- Low user counts where wasted requests don’t matter
The cost you pay:
Most responses return nothing new. If you have 10,000 users polling every 5 seconds, that’s 2,000 requests/second hitting your server - and 90%+ of them get back an empty response. You’re paying for compute, bandwidth, and database reads that produce no value.
| Users | Interval | Requests/sec | % Useful (estimate) |
|---|---|---|---|
| 1,000 | 5s | 200 | ~5% |
| 10,000 | 5s | 2,000 | ~5% |
| 100,000 | 5s | 20,000 | ~5% |
| 100,000 | 30s | 3,333 | ~15% |
Increasing the interval reduces load but increases latency. That’s the fundamental tension with polling.
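The tradeoff is easy to quantify. A quick sketch of the arithmetic behind the table above:

```javascript
// Requests per second generated by a polling fleet, and how many of
// those requests actually carry new data.
function pollingLoad(users, intervalSeconds, usefulFraction) {
  const requestsPerSec = users / intervalSeconds;
  return {
    requestsPerSec,
    usefulPerSec: requestsPerSec * usefulFraction,
    wastedPerSec: requestsPerSec * (1 - usefulFraction),
  };
}

// 10,000 users polling every 5s at a ~5% hit rate:
// 2,000 req/s total, roughly 1,900 of them returning nothing new.
const load = pollingLoad(10000, 5, 0.05);
```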
Long Polling
Long polling flips the model. Instead of the server responding immediately with “nothing new,” it holds the connection open until there’s actually something to send.
```javascript
async function subscribe() {
  try {
    const res = await fetch('/api/notifications/subscribe');
    if (res.status !== 204) { // 204 = server timed out with nothing new
      const data = await res.json();
      updateUI(data);
    }
  } catch (err) {
    // network error - wait briefly, then retry
    await new Promise(r => setTimeout(r, 1000));
  }
  // immediately reconnect for the next update
  subscribe();
}

subscribe();
```
On the server side (Node.js example):
```javascript
app.get('/api/notifications/subscribe', async (req, res) => {
  const userId = req.user.id;
  // wait up to 30 seconds for new data
  const data = await waitForUpdate(userId, { timeout: 30000 });
  if (data) {
    res.json(data);
  } else {
    // timeout - respond empty so client reconnects
    res.status(204).end();
  }
});
```
What happens under the hood:
- Client sends HTTP request
- Server holds the connection open
- When new data exists (or timeout hits), server responds
- Client immediately sends a new request
Why this is better than short polling:
The server only responds when there’s something worth sending. No wasted empty responses. The client gets updates almost instantly - latency is limited only by how fast the server detects the change and flushes the response.
The catch:
Each held connection consumes a server thread or socket. With 10,000 users, you have 10,000 open HTTP connections sitting idle. Traditional thread-per-request servers (older Java, PHP) struggle here. Event-loop servers (Node.js, Go, Nginx) handle it better, but it’s still a resource you’re holding.
Load balancers also need attention. If a long-poll request is sitting open for 30 seconds, your LB timeout must be longer than that. And if a server restarts, every connected client reconnects simultaneously - a thundering herd problem.
When it works well:
- Chat applications at moderate scale
- Notification systems where near-instant delivery matters
- Collaborative editing (Google Docs used this before switching to WebSockets)
- When you can’t use WebSockets (corporate proxies, restrictive firewalls)
WebSockets
WebSockets establish a persistent, full-duplex connection between client and server. Both sides can send messages at any time without the overhead of HTTP headers on every exchange.
```javascript
let ws;

function connectWebSocket() {
  ws = new WebSocket('wss://api.example.com/ws');

  ws.onopen = () => {
    ws.send(JSON.stringify({ type: 'subscribe', channel: 'notifications' }));
  };

  ws.onmessage = (event) => {
    const data = JSON.parse(event.data);
    updateUI(data);
  };

  ws.onclose = () => {
    // reconnect after delay
    setTimeout(connectWebSocket, 2000);
  };
}

connectWebSocket();
```
Server side (Node.js with ws):
```javascript
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws, req) => {
  const userId = authenticate(req);

  ws.on('message', (msg) => {
    const parsed = JSON.parse(msg);
    if (parsed.type === 'subscribe') {
      addToChannel(userId, parsed.channel, ws);
    }
  });

  ws.on('close', () => {
    removeFromAllChannels(userId);
  });
});

// broadcast to all subscribers of a channel
function broadcast(channel, data) {
  const subscribers = getChannelSubscribers(channel);
  const payload = JSON.stringify(data);
  subscribers.forEach(ws => {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(payload);
    }
  });
}
```
What happens under the hood:
- Client sends HTTP upgrade request
- Server accepts, connection upgrades from HTTP to WebSocket
- Both sides can send messages freely over TCP
- Connection stays open until either side closes it
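The upgrade step is ordinary HTTP. A representative handshake (these are the example values from RFC 6455; `Sec-WebSocket-Key` is a random nonce that the server hashes into `Sec-WebSocket-Accept` to prove it understood the upgrade):

```http
GET /ws HTTP/1.1
Host: api.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```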
Where WebSockets shine:
- Multiplayer games (every millisecond matters)
- Live trading platforms
- Chat at scale (Slack, Discord)
- Collaborative real-time editing
- Live sports scores or auction bidding
The complexity you take on:
WebSockets are stateful. That changes everything.
Connection management. You need heartbeats (ping/pong) to detect dead connections. You need reconnection logic with exponential backoff. You need to handle auth token expiry on long-lived connections.
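Reconnection logic is worth getting right: a fixed retry delay turns every server restart into a synchronized stampede. A sketch of exponential backoff with full jitter - the base and cap values are tunable assumptions:

```javascript
// Delay before the nth reconnect attempt: exponential growth, capped,
// with random jitter so a fleet of clients doesn't reconnect in lockstep.
function reconnectDelay(attempt, { baseMs = 1000, maxMs = 30000 } = {}) {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.random() * exp; // "full jitter": anywhere in [0, exp)
}
```

On the client, `ws.onclose` would schedule `setTimeout(connect, reconnectDelay(attempt++))` and reset `attempt` to 0 once a connection opens successfully.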
Scaling. If User A is connected to Server 1 and User B is connected to Server 2, how does A’s message reach B? You need a pub/sub layer - Redis, Kafka, or NATS - sitting behind your WebSocket servers to fan out messages.
Client A ↔ WS Server 1 ↔ Redis Pub/Sub ↔ WS Server 2 ↔ Client B
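The shape of that fan-out layer can be sketched in-process. This is a minimal stand-in for the broker - the `Broker` class and its API are invented for illustration; a real deployment would use Redis `SUBSCRIBE`/`PUBLISH` or NATS across processes:

```javascript
// Minimal broker: each WS server subscribes on behalf of its local
// connections, and any server can publish. With Redis, `subscribe`
// maps to SUBSCRIBE and `publish` to PUBLISH across processes.
class Broker {
  constructor() {
    this.handlers = new Map(); // channel -> Set of callbacks
  }
  subscribe(channel, handler) {
    if (!this.handlers.has(channel)) this.handlers.set(channel, new Set());
    this.handlers.get(channel).add(handler);
  }
  publish(channel, message) {
    for (const handler of this.handlers.get(channel) ?? []) handler(message);
  }
}

// "Server 1" publishes; "Server 2"'s handler forwards to its local sockets.
const broker = new Broker();
const delivered = [];
broker.subscribe('chat:42', (msg) => delivered.push(msg)); // Server 2
broker.publish('chat:42', { from: 'A', text: 'hi' });      // Server 1
```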
Load balancing. Sticky sessions or a separate connection routing layer. Standard round-robin HTTP load balancing doesn’t work because connections are persistent.
Monitoring. HTTP gives you request/response metrics for free. With WebSockets, you need custom instrumentation - connection counts, message rates, error rates, latency per message type.
The Comparison
| Factor | Short Polling | Long Polling | WebSockets |
|---|---|---|---|
| Latency | High (interval-dependent) | Low (~instant) | Lowest (true real-time) |
| Server load | High (wasted requests) | Medium (held connections) | Low (per-message) |
| Complexity | Trivial | Moderate | High |
| Scaling difficulty | Easy (stateless) | Medium | Hard (stateful) |
| Bidirectional | No (request/response only) | No (server push via held request/response) | Yes |
| Firewall/proxy friendly | Yes | Mostly | Sometimes no |
| Connection overhead | Full HTTP request per poll (TCP/TLS often reused via keep-alive) | Full HTTP request per update or timeout (keep-alive reusable) | One TCP + TLS handshake, then lightweight frames |
| Browser support | Universal | Universal | Universal (modern) |
A Decision Framework
Start with three questions:
1. How fresh does the data need to be?
If 30-second staleness is acceptable, short polling is your answer. Ship it and move on. Don’t over-engineer.
2. Is communication one-way or two-way?
If only the server needs to push updates to the client (notifications, live scores, dashboards), long polling or Server-Sent Events (SSE) are simpler than WebSockets. You don’t need a bidirectional channel for a one-way data flow.
3. How many concurrent users and how frequent are updates?
| Scenario | Recommendation |
|---|---|
| < 1K users, updates every 30s+ | Short polling |
| 1K-50K users, near-instant updates needed | Long polling or SSE |
| 50K+ users, frequent bidirectional messages | WebSockets |
| Multiplayer game, live collaboration | WebSockets (no alternative) |
| Background job status check | Short polling |
| Chat system at scale | WebSockets |
| Notification bell on a dashboard | Long polling |
Real-World Examples
Uber driver location updates: WebSockets. Riders see driver movement in real-time, and both rider and driver apps send frequent location pings. Bidirectional, high frequency, latency-sensitive.
GitHub Actions build status: Short polling. The UI polls the API every few seconds to check if your build passed. Acceptable latency, simple implementation, stateless.
Slack messages: WebSockets. Real-time messaging across channels, typing indicators, presence status - all require persistent bidirectional connections with pub/sub fan-out.
Email inbox count (like Gmail’s unread badge): Long polling. You want near-instant updates when a new email arrives, but communication is one-way. Gmail has historically used Comet-style long-polling techniques for exactly this kind of server push.
Stock price on a portfolio dashboard: SSE or long polling. One-way server push, moderate frequency. WebSockets would work but add unnecessary complexity.
The One Mistake Everyone Makes
Choosing WebSockets because it sounds impressive in a system design interview. The interviewer isn’t testing whether you know WebSockets exist. They’re testing whether you can evaluate tradeoffs.
Starting with short polling and upgrading when you hit its limits is almost always the right call in production. Twitter ran on polling for years. Facebook’s chat started with long polling. They moved to WebSockets when - and only when - scale demanded it.
The best architecture is the one that solves today’s problem without creating tomorrow’s operational nightmare.
Bottom Line
Short polling is dumb but reliable. Long polling is clever and efficient for one-way push. WebSockets are powerful but operationally expensive. Match the tool to the actual latency and scale requirements - not to what feels most “real-time.” Most features that feel like they need WebSockets work perfectly fine with long polling or even a 5-second poll.