You spin up a single Redis instance, throw your session data in it, and everything works great. Then your app grows. One day Redis goes down for 30 seconds during a deploy, and every user gets logged out. Your manager asks: “Why don’t we have high availability?” You Google “Redis HA” and find Sentinel, Cluster, sharding, replication, pipelining - and suddenly a key-value store feels as complex as a distributed database.
It doesn’t have to be confusing. Let’s break it down.
Sequential Commands - The Baseline
Every Redis command follows a simple request-response cycle over TCP:
Client: SET user:1 "alice" --> Server processes --> Client reads +OK
Client: SET user:2 "bob" --> Server processes --> Client reads +OK
Client: SET user:3 "charlie" --> Server processes --> Client reads +OK
Each command waits for the previous one to complete. The problem isn’t Redis processing speed - a single Redis instance handles a command in ~0.1ms. The problem is network round-trip time (RTT).
On a local network with 0.5ms RTT, sending 1,000 sequential commands takes:
1,000 commands x 0.5ms RTT = 500ms minimum
That’s half a second just waiting for the network. Cross-AZ (1-2ms RTT), that’s 1-2 seconds. Cross-region (50ms RTT), that’s 50 seconds for 1,000 simple SETs.
Typical sequential throughput: 5,000-10,000 ops/sec over a local network. Not because Redis is slow - because TCP round trips are expensive. For how Redis evolved past this single-threaded bottleneck, see how Redis went from single-threaded to 3.5M ops/sec.
Pipelining - Batching Commands
Pipelining sends multiple commands without waiting for individual responses. The server buffers all replies and sends them back together. The Redis pipelining documentation covers the protocol details.
Client: SET user:1 "alice" \r\n SET user:2 "bob" \r\n SET user:3 "charlie" \r\n -->
Server: +OK \r\n +OK \r\n +OK \r\n <--
One round trip instead of three. The speedup scales with the number of commands:
| Batch Size | Throughput (ops/sec) | Speedup vs Sequential |
|---|---|---|
| 1 (sequential) | ~5,000 | 1x |
| 10 | ~50,000 | 10x |
| 50 | ~80,000-100,000 | 16-20x |
| 100 | ~100,000+ | 20x |
| 500 | ~120,000-150,000 | 24-30x |
| 1,000 | ~150,000+ | 30x |
Real benchmark numbers (redis-benchmark):
| Mode | SET ops/sec | GET ops/sec |
|---|---|---|
| No pipelining | ~95,000 | ~97,000 |
| Pipeline (16 commands) | ~877,000 | ~1,350,000 |
That’s a 9-14x improvement just by batching commands.
Why Pipelining Helps Even on Localhost
You might think: “If RTT is near zero on localhost, pipelining shouldn’t matter.” But it still gives a 2-5x boost. The reason is system call overhead. Without pipelining, each command requires separate read() and write() syscalls, each causing a user-space to kernel-space context switch. With pipelining, many commands are read in a single read() and all replies go out in a single write().
The Latency Multiplier
The higher your network latency, the bigger the pipelining win:
| Network | 1,000 Sequential Commands | 1,000 Pipelined Commands | Speedup |
|---|---|---|---|
| Loopback (<0.1ms) | ~150ms | ~30ms | 5x |
| Local network (0.5ms) | ~500ms | ~15ms | 33x |
| Cross-AZ (1-2ms) | ~1,500ms | ~15ms | 100x |
| Cross-region (50ms) | ~50,000ms | ~100ms | 500x |
A real-world case: a data migration script projected to take 50+ hours running sequentially from London to a US Redis instance completed in under 4 minutes with pipelining.
Pipelining in Code
Python (redis-py):
import redis
r = redis.Redis()
# Pure pipeline - no atomicity, maximum speed
pipe = r.pipeline(transaction=False)
for i in range(1000):
    pipe.set(f"key:{i}", f"value:{i}")
results = pipe.execute() # returns [True, True, ...]
Node.js (ioredis):
const Redis = require("ioredis");
const redis = new Redis();
const pipeline = redis.pipeline();
pipeline.set("key1", "val1");
pipeline.set("key2", "val2");
pipeline.get("key1");
const results = await pipeline.exec();
// [[null, "OK"], [null, "OK"], [null, "val1"]]
Go (go-redis):
pipe := rdb.Pipeline()
pipe.Set(ctx, "key1", "val1", 0)
pipe.Set(ctx, "key2", "val2", 0)
cmds, err := pipe.Exec(ctx)
Pipelining is NOT Atomic
This is the most common mistake. Two clients sending pipelines simultaneously will have their commands interleaved on the server. If you need atomicity, use MULTI/EXEC transactions. You can combine both - pipeline a MULTI/EXEC block for atomicity with network efficiency.
| Feature | Pipeline | MULTI/EXEC | Pipeline + MULTI/EXEC |
|---|---|---|---|
| Batches network calls | Yes | No | Yes |
| Atomic execution | No | Yes | Yes |
| Isolated from other clients | No | Yes | Yes |
| Performance | Fastest | Slower (extra round trips) | Fast + atomic |
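The pipeline + MULTI/EXEC combination can be sketched with a redis-py style client, where `pipeline(transaction=True)` wraps the queued commands in a single MULTI/EXEC block sent in one round trip (a minimal sketch; the function and key names are illustrative assumptions):

```python
def transfer_points(client, src: str, dst: str, amount: int):
    """Atomically move points between two keys in one network round trip.

    With transaction=True (redis-py's default), the queued commands are
    wrapped in MULTI ... EXEC and shipped as one batch: other clients
    never observe the decrement without the matching increment.
    """
    pipe = client.pipeline(transaction=True)
    pipe.decrby(src, amount)
    pipe.incrby(dst, amount)
    return pipe.execute()  # one round trip; returns the two new values
```

With `transaction=False` the same code would still batch the network calls, but another client could observe the state between the two writes.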
When Pipelining Hurts
- Memory pressure: The server buffers all responses in memory until the entire pipeline is flushed. Pipelining 1 million commands returning 1KB each = ~1GB buffered on the server. Batch in chunks of 1,000-10,000.
- Blocking commands: BLPOP, BRPOP, SUBSCRIBE inside a pipeline will block the connection and stall everything after it.
- Dependent commands: If command B needs the result of command A, they can't be pipelined together. Use Lua scripts for read-compute-write patterns.
- Large responses: Pipelining 1,000 HGETALL commands on fat hashes can spike memory on both client and server.
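The memory-pressure fix — batching in bounded chunks — is a small helper. A sketch, assuming a redis-py style client like the earlier examples (function names are made up for illustration):

```python
def chunked(items, size):
    """Yield fixed-size slices so each pipeline's reply buffer stays bounded."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def pipelined_set(client, pairs, chunk_size=5000):
    """Write (key, value) pairs in bounded pipelines instead of one huge batch."""
    results = []
    for chunk in chunked(pairs, chunk_size):
        pipe = client.pipeline(transaction=False)
        for key, value in chunk:
            pipe.set(key, value)
        results.extend(pipe.execute())  # flush replies before the next chunk
    return results
```

Each `execute()` drains the server-side reply buffer, so memory usage is proportional to the chunk size rather than the total number of commands.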
Replication - The Foundation of HA
Before understanding Sentinel or Cluster, you need to understand replication. Redis uses asynchronous replication - the master acknowledges writes immediately, then propagates to replicas in the background.
Client --> Master (write acknowledged) --> Replica 1 (async)
--> Replica 2 (async)
What this means: If the master crashes right after acknowledging a write but before replicating it, that write is lost. This is a deliberate tradeoff - synchronous replication would kill Redis’s low-latency advantage.
You can use WAIT for critical writes:
SET critical-key "important-value"
WAIT 1 5000 # wait for at least 1 replica to acknowledge, timeout 5s
But this doesn’t make writes durable - it just reduces the window for data loss.
Sentinel - High Availability Without Sharding
Sentinel solves one problem: automatic failover. If your master dies, Sentinel promotes a replica to master so your application keeps running.
How It Works
Sentinel is a separate process (default port 26379) that monitors your Redis instances. You run at least 3 Sentinel instances on separate machines.
+----------+ +----------+ +----------+
| Sentinel | | Sentinel | | Sentinel |
| S1 | | S2 | | S3 |
+----------+ +----------+ +----------+
| | |
+-------+-------+-------+-------+
| |
+---------+ +---------+
| Master |---->| Replica |
| M1 | | R1 |
+---------+ +---------+
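A minimal sentinel.conf for this topology might look like the following (the master address and the name `mymaster` are assumptions; a quorum of 2 matches the 3-Sentinel setup):

```
# Monitor the master as "mymaster"; 2 Sentinels must agree before ODOWN
sentinel monitor mymaster 192.168.1.10 6379 2
# Tuned detection window: 5s instead of the 30s default
sentinel down-after-milliseconds mymaster 5000
# Abort a failover attempt that hasn't completed after 60s
sentinel failover-timeout mymaster 60000
# Resync replicas to the new master one at a time
sentinel parallel-syncs mymaster 1
```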
The failover process:
- SDOWN (Subjective Down): one Sentinel can't reach the master for down-after-milliseconds (default: 30 seconds, commonly tuned to 3-5 seconds)
- ODOWN (Objective Down): quorum Sentinels agree the master is unreachable
- Leader election: Sentinels elect one leader to execute the failover
- Promotion: the leader sends REPLICAOF NO ONE to the best replica
- Reconfiguration: other replicas are pointed to the new master
How replica selection works: Skip disconnected replicas, then pick by replica-priority (lower wins), then by replication offset (most data wins), then by run ID as tiebreaker.
Failover Timing
| Phase | Duration |
|---|---|
| Detection (SDOWN) | down-after-milliseconds (tuned: 3-5s) |
| Agreement (ODOWN) | 1-2 seconds |
| Leader election + promotion | < 1 second |
| Total (default config) | 30-35 seconds |
| Total (tuned production) | 5-15 seconds |
The Split-Brain Problem
Network partition isolates the old master. Sentinels on the other side promote a replica. Now you have two masters accepting writes. When the partition heals, the old master is demoted and its writes during the partition are lost.
Mitigation:
min-replicas-to-write 1
min-replicas-max-lag 10
The master stops accepting writes if no replica has acknowledged replication within 10 seconds. This limits the data loss window but doesn’t eliminate it - async replication means some acknowledged writes may never reach replicas.
When to Use Sentinel
- Your dataset fits on a single machine (< 25-50GB)
- You need automatic failover but not horizontal write scaling
- You want read scaling via replicas
- Use cases: session stores, caching, rate limiting, queues at moderate scale
Sharding - Splitting Data Across Nodes
When your data outgrows a single machine’s RAM, you shard - distribute data across multiple Redis instances. There are three approaches:
1. Client-Side Sharding
Your application calculates which Redis node holds each key:
import hashlib
nodes = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]
def get_node(key):
    hash_val = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[hash_val % len(nodes)]
Pros: No extra infrastructure, no proxy overhead. Cons: Adding or removing nodes requires rehashing and migrating data manually. No automatic failover. Every client must know the full topology.
Use consistent hashing (hash ring) instead of simple modulo to minimize key redistribution when nodes change.
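A minimal consistent-hash ring can be sketched as follows (virtual nodes smooth out the key distribution; the class name and vnode count are illustrative assumptions):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` points on the ring so the
        # keyspace splits roughly evenly across nodes.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # First ring point clockwise from the key's hash (wrapping around).
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["redis-1:6379", "redis-2:6379", "redis-3:6379"])
node = ring.get_node("user:1000")  # stable for a given topology
```

When a node is removed, only the keys whose nearest ring point belonged to that node move; with simple modulo, roughly (N-1)/N of all keys would relocate.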
2. Proxy-Based Sharding (Twemproxy, Codis)
A proxy sits between your app and Redis nodes, routing commands to the correct shard.
App --> Twemproxy --> Redis 1
--> Redis 2
--> Redis 3
Pros: Existing clients work without changes. Cons: Extra network hop (latency), proxy is another component to manage and scale, largely considered legacy - Redis Cluster has replaced this for most use cases.
3. Redis Cluster (Recommended)
Redis’s built-in sharding. The cluster itself handles data distribution and routing.
Redis Cluster - Sharding + HA Built In
Redis Cluster combines sharding and high availability in one package. It’s the standard for production Redis at scale.
Hash Slots
The keyspace is divided into 16,384 hash slots. Each key maps to a slot:
HASH_SLOT = CRC16(key) mod 16384
Each master node owns a range of slots. A 3-master cluster:
Master 1: slots 0-5460
Master 2: slots 5461-10922
Master 3: slots 10923-16383
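The slot formula can be reproduced in Python. Redis Cluster uses the CRC16-CCITT (XMODEM) variant, and if the key contains a non-empty `{...}` hash tag, only the tag's contents are hashed (a sketch for illustration):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16,384 cluster hash slots."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # non-empty tag: hash only its contents
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

Running `hash_slot` on `{user:1000}.profile` and `{user:1000}.session` yields the same slot, which is exactly the hash-tag behavior described below.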
Each master has one or more replicas for failover. Minimum production setup: 6 nodes (3 masters + 3 replicas).
Master 1 (slots 0-5460) <--> Replica 1
Master 2 (slots 5461-10922) <--> Replica 2
Master 3 (slots 10923-16383) <--> Replica 3
How Client Routing Works
When a client sends a command to the wrong node, the node returns a redirect:
MOVED (permanent - slot belongs to another node):
> GET user:alice
-MOVED 12182 192.168.1.3:6379
The client updates its local slot map and retries. Smart clients (ioredis, Jedis, Lettuce) cache the slot-to-node mapping so this only happens occasionally.
ASK (temporary - slot is being migrated):
> GET user:bob
-ASK 8901 192.168.1.2:6379
Client sends the command to the specified node with ASKING prefix, but does NOT update its slot map - the migration is still in progress.
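What a cluster-aware client does with these redirects can be sketched as a small parser plus a slot-map update (illustrative only; real clients like ioredis, Jedis, and Lettuce wrap this in their retry loops):

```python
def parse_redirect(err: str):
    """Split a '-MOVED 12182 192.168.1.3:6379' style error into parts."""
    kind, slot, addr = err.lstrip("-").split()
    host, port = addr.rsplit(":", 1)
    return kind, int(slot), (host, int(port))

def handle_redirect(slot_map: dict, err: str):
    """Update the cached slot map on MOVED; leave it untouched on ASK."""
    kind, slot, node = parse_redirect(err)
    if kind == "MOVED":
        slot_map[slot] = node  # permanent: remember the slot's new owner
    return node  # retry target either way (ASK requires an ASKING prefix)
```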
Hash Tags - Controlling Slot Assignment
Multi-key operations (MGET, MSET, MULTI/EXEC) only work when all keys hash to the same slot. Use hash tags to force this:
{user:1000}.profile --> CRC16("user:1000") mod 16384
{user:1000}.session --> CRC16("user:1000") mod 16384
{user:1000}.preferences --> CRC16("user:1000") mod 16384
Only the part inside {...} is hashed. All three keys land on the same slot, so multi-key operations work.
The trap: Over-using hash tags creates hot slots. If {user:1000} has 10 million keys, one node holds all of them while others sit empty. Only use hash tags when you genuinely need multi-key atomicity on those specific keys.
Cross-Slot Limitations
These fail in Cluster mode if keys are on different slots:
MGET key1 key2 key3 # CROSSSLOT error
MULTI ... key1 ... key2 # CROSSSLOT error
SUNION set1 set2 # CROSSSLOT error
Lua script accessing keys # CROSSSLOT error
on different slots
This is the biggest operational difference from standalone Redis. Design your key schema with this in mind from day one.
Resharding (Zero Downtime)
Adding a node to the cluster means migrating some hash slots to it:
# Add a new empty node
redis-cli --cluster add-node new-node:6379 existing-node:6379
# Reshard 1000 slots to the new node
redis-cli --cluster reshard existing-node:6379 \
--cluster-from <source-id> \
--cluster-to <new-node-id> \
--cluster-slots 1000 \
--cluster-yes
During migration, existing keys on the source are served normally. Missing keys are redirected to the destination with ASK. No downtime.
Watch out for large keys: The MIGRATE command that moves keys between nodes blocks for the duration of the transfer. A single 100MB key can block the slot for seconds.
Cluster Failover Timing
| Scenario | Duration |
|---|---|
| Manual failover (CLUSTER FAILOVER) | < 1 second |
| Automatic (node failure, default config) | 15-20 seconds |
| Automatic (tuned production) | 5-10 seconds |
Automatic failover formula:
delay = 500ms + random(0-500ms) + REPLICA_RANK * 1000ms
The most up-to-date replica (rank 0) goes first.
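The formula transcribes directly to code (values in milliseconds; the random jitter keeps replicas from starting competing elections at the same instant):

```python
import random

def failover_start_delay_ms(replica_rank: int) -> float:
    """Delay before a replica starts its failover election.

    Rank 0 is the replica with the most complete replication offset,
    so it gets the shortest delay and usually wins the election.
    """
    return 500 + random.uniform(0, 500) + replica_rank * 1000
```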
Sentinel vs Cluster - When to Use What
| Factor | Sentinel | Cluster |
|---|---|---|
| Data fits in single node | Yes | Overkill |
| Data exceeds single node | Can’t help | Yes |
| Write scaling needed | No | Yes |
| Multi-key operations | Unrestricted | Same-slot only |
| Operational complexity | Low-medium | Medium-high |
| Client library requirements | Any client | Cluster-aware client |
| Multiple databases (SELECT) | Yes | No (db 0 only) |
| Lua scripts | Unrestricted | Same-slot keys only |
| Min production nodes | 5 (3 Sentinel + 1 master + 1 replica) | 6 (3 masters + 3 replicas) |
| Failover time (tuned) | 5-15 seconds | 5-10 seconds |
The natural progression:
Standalone --> Standalone + Replicas --> Sentinel --> Cluster
Most applications live comfortably in the first two stages forever. Don’t jump to Cluster because it sounds impressive.
Common Mistakes
1. Running only 2 Sentinels. With 2 Sentinels, a network partition means neither side has a majority. No failover happens, or worse, both sides try to promote - split brain.
2. Forgetting cluster-require-full-coverage. Default is yes - if ANY hash slot becomes uncovered (master down, no replica available), the ENTIRE cluster stops accepting writes. Set to no if partial availability is acceptable.
3. Pipeline without error handling. If one command in a pipeline fails, others still execute. Treating the whole pipeline as failed (which is what if err != nil does in Go) caused a 95% failure rate at Grab when a single replica went down.
4. Using blocking commands in pipelines. BLPOP, SUBSCRIBE inside a pipeline blocks the connection and stalls everything after it.
5. Hot hash tags. Putting everything under {shared} concentrates all data on one node. Use high-cardinality tags like {user:12345}.
6. Not opening the cluster bus port. Cluster needs both the client port (6379) AND the cluster bus port (16379). Forgetting the bus port in firewall rules makes nodes invisible to each other.
7. Ignoring min-replicas-to-write in Sentinel. Without this, a partitioned master keeps accepting writes that will be lost when it’s demoted. Always set it.
Practical Architecture Recommendations
< 10GB, simple caching/sessions: Standalone Redis + 1 replica. No Sentinel needed if brief downtime during manual failover is acceptable.
< 25GB, needs automatic failover: Standalone + 3 Sentinels + 1-2 replicas. Simple, well-understood, handles most use cases.
25-100GB or high write throughput: Redis Cluster with 3 masters + 3 replicas. Design key schema with hash tags from the start.
100GB+, multi-region: Redis Cluster with more shards. Consider managed services (ElastiCache, Azure Cache) to avoid operational headaches. Keep data per shard under 25GB for fast failover and sync. See the database sharding guide for thresholds across all databases.
Bottom Line
Sequential commands are fine for low-volume operations. Pipeline when you’re doing more than a handful of commands at once - the 10-100x speedup is free performance. Use Sentinel when you need automatic failover on a single-node dataset. Use Cluster when your data outgrows one machine or you need write scaling. And whatever you do, design your key schema before picking your architecture - cross-slot errors in production are not fun to debug.