1. Requirements & Scope (5 min)

Functional Requirements

  1. Transfer money between accounts within the same bank (internal) and across banks (external via SWIFT/ACH)
  2. Every transaction must follow double-entry bookkeeping — debit one account, credit another, with a full audit trail
  3. Support idempotent transfers — retrying the same request must not duplicate the transfer
  4. Provide real-time transaction status tracking (pending, processing, completed, failed, reversed)
  5. Enforce compliance checks (AML/KYC screening, sanctions list) before processing any transfer

Non-Functional Requirements

  • Availability: 99.99% — financial systems cannot afford extended downtime. Scheduled maintenance windows are acceptable (e.g., 2am-3am) but unplanned outages are catastrophic.
  • Latency: Internal transfers < 500ms end-to-end. External transfers (cross-bank) may take seconds to days depending on the rail (ACH = batch, typically next business day; SWIFT = near-real-time messaging, though settlement through correspondent banks can take longer).
  • Consistency: Strong consistency is mandatory. Money cannot be created or destroyed, and every debit must have a matching credit. During a network partition we give up availability rather than consistency (CP system).
  • Scale: 10,000 transfers/sec at peak. $50B daily volume. 500M accounts.
  • Durability: Zero data loss. Every transaction must be persisted to durable storage with replication before acknowledgment.
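
To make the availability target concrete, the following is a small worked calculation of the downtime budget that 99.99% implies. The function name and the 30-day month are illustrative assumptions, not part of the requirements.

```python
# Downtime budget implied by an availability target (a worked calculation, not a measurement).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime, in minutes, that still meets the target."""
    return (1.0 - availability) * period_minutes


yearly = downtime_budget_minutes(0.9999)
monthly = downtime_budget_minutes(0.9999, 30 * 24 * 60)
print(f"99.99% allows about {yearly:.1f} min/year and {monthly:.1f} min per 30-day month")
```

A daily one-hour maintenance window would exceed this budget many times over, so the target only makes sense if scheduled maintenance is excluded from the measurement.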

2. Estimation (3 min)

Traffic

  • 10,000 transfers/sec peak, ~3,000 avg
  • Each transfer involves: 1 write to create the transfer, 2 writes to update account balances (debit + credit), 1 write to the ledger
  • ~40,000 DB writes/sec peak
  • Read traffic (balance checks, transaction history): ~50,000 reads/sec

Storage

  • 500M accounts × 500 bytes (account metadata + balance) = 250 GB for accounts
  • 300M transfers/day × 365 days × 1 KB per transfer = ~110 TB/year for transaction history
  • Ledger entries: 600M/day (2 per transfer) × 500 bytes = ~110 TB/year
  • Total: ~220 TB/year growing (110 TB history + 110 TB ledger). Need partitioning and archival strategy.
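
The estimates above can be reproduced with a few lines of arithmetic. This is a sketch that uses decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and the per-transfer write count exactly as listed under Traffic; the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte

peak_tps, avg_tps = 10_000, 3_000
writes_per_transfer = 4  # 1 transfer row + 2 balance updates + 1 ledger write, as counted above
peak_writes_per_sec = peak_tps * writes_per_transfer

transfers_per_day_exact = avg_tps * SECONDS_PER_DAY  # 259.2M; the estimate rounds this up
transfers_per_day = 300_000_000                      # rounded figure used in the estimate

accounts_gb = 500_000_000 * 500 / 10**9
history_tb_per_year = transfers_per_day * 365 * 1_000 / TB    # 1 KB per transfer
ledger_tb_per_year = 2 * transfers_per_day * 365 * 500 / TB   # 2 entries of 500 bytes each
total_tb_per_year = history_tb_per_year + ledger_tb_per_year

print(f"peak writes/sec:  {peak_writes_per_sec:,}")
print(f"accounts:         {accounts_gb:.0f} GB")
print(f"history + ledger: {history_tb_per_year:.1f} + {ledger_tb_per_year:.1f} "
      f"= {total_tb_per_year:.0f} TB/year")
```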

Money Math

  • All monetary amounts stored as integers in the smallest currency unit (cents for USD, pence for GBP)
  • Never use floating point. $100.50 is stored as 10050 cents.
  • Maximum amount: a signed 64-bit integer holds up to 9,223,372,036,854,775,807 cents, about $92 quadrillion. More than enough.
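
A minimal sketch of the integer-money rule, assuming amounts arrive as decimal strings. The helper names and the small currency table are hypothetical; a real system would load minor-unit exponents from ISO 4217 data.

```python
from decimal import Decimal

INT64_MAX = 2**63 - 1
# Illustrative subset: number of decimal places in each currency's minor unit.
MINOR_UNIT_EXPONENT = {"USD": 2, "GBP": 2, "JPY": 0}


def to_minor_units(amount: str, currency: str) -> int:
    """Convert a decimal string such as "100.50" to an integer count of minor units."""
    exponent = MINOR_UNIT_EXPONENT[currency]
    scaled = Decimal(amount).scaleb(exponent)  # shift the decimal point, exactly
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more precision than {currency} allows")
    value = int(scaled)
    if not 0 <= value <= INT64_MAX:
        raise ValueError("amount is outside the range of a signed 64-bit integer")
    return value


def format_minor_units(value: int, currency: str) -> str:
    """Render an integer count of minor units for display."""
    exponent = MINOR_UNIT_EXPONENT[currency]
    return f"{Decimal(value).scaleb(-exponent):.{exponent}f} {currency}"


print(to_minor_units("100.50", "USD"))    # 10050
print(format_minor_units(10050, "USD"))   # 100.50 USD
print(0.1 + 0.2 == 0.3)                   # False: the reason floats are banned
```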

3. API Design (3 min)

// Initiate a transfer
POST /v1/transfers
  Headers: Idempotency-Key: "uuid-abc-123"
  Body: {
    "from_account_id": "acc_sender_001",
    "to_account_id": "acc_receiver_002",
    "amount": 10050,                    // $100.50 in cents
    "currency": "USD",
    "reference": "Invoice #4521",
    "transfer_type": "internal"         // or "ach", "swift"
  }
  Response 201: {
    "transfer_id": "txn_xyz_789",
    "status": "pending",
    "created_at": "2026-02-22T10:00:00Z"
  }

// Get transfer status
GET /v1/transfers/{transfer_id}
  Response 200: {
    "transfer_id": "txn_xyz_789",
    "status": "completed",              // pending | processing | completed | failed | reversed
    "from_account_id": "acc_sender_001",
    "to_account_id": "acc_receiver_002",
    "amount": 10050,
    "currency": "USD",
    "compliance_status": "cleared",
    "created_at": "2026-02-22T10:00:00Z",
    "completed_at": "2026-02-22T10:00:00Z"
  }

// Get account balance
GET /v1/accounts/{account_id}/balance
  Response 200: {
    "account_id": "acc_sender_001",
    "available_balance": 5000000,       // $50,000.00
    "pending_balance": 4989950,         // after pending debit
    "currency": "USD"
  }

// Get transaction history
GET /v1/accounts/{account_id}/transactions?limit=50&cursor=xxx

Key Decisions

  • Idempotency-Key header is mandatory on POST. The server stores the key and returns the same response on retry.
  • Amounts are always integers in the smallest currency unit. The API never accepts floats.
  • Transfer creation is asynchronous — returns pending immediately. Client polls or receives webhook for completion.
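
The first two decisions can be enforced at the edge of the Transfer Service. The sketch below is framework-agnostic and takes plain dictionaries; the `TransferRequest` class and `parse_transfer_request` function are hypothetical names, and the extra checks (positive amount, distinct accounts) are assumptions beyond what is stated above.

```python
from dataclasses import dataclass

ALLOWED_TYPES = {"internal", "ach", "swift"}


@dataclass(frozen=True)
class TransferRequest:
    idempotency_key: str
    from_account_id: str
    to_account_id: str
    amount: int
    currency: str
    transfer_type: str


def parse_transfer_request(headers: dict, body: dict) -> TransferRequest:
    """Validate a POST /v1/transfers request; raise ValueError on any violation."""
    key = headers.get("Idempotency-Key")
    if not key:
        raise ValueError("Idempotency-Key header is required")
    amount = body.get("amount")
    # bool is a subclass of int in Python, so it has to be rejected explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("amount must be an integer in the smallest currency unit")
    if amount <= 0:
        raise ValueError("amount must be positive")
    if body.get("transfer_type") not in ALLOWED_TYPES:
        raise ValueError("unknown transfer_type")
    if body.get("from_account_id") == body.get("to_account_id"):
        raise ValueError("sender and receiver must differ")
    return TransferRequest(key, body["from_account_id"], body["to_account_id"],
                           amount, body["currency"], body["transfer_type"])
```

A body carrying `"amount": 100.5` is rejected before any database work happens, which keeps floats out of every downstream component.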

4. Data Model (3 min)

Accounts Table (PostgreSQL — sharded by account_id)

Table: accounts
  account_id        (PK)  | varchar(20)
  user_id           (FK)  | varchar(20)
  balance           | bigint          -- available balance in cents
  pending_balance   | bigint          -- balance after pending holds
  currency          | char(3)
  status            | enum('active', 'frozen', 'closed')
  created_at        | timestamp
  updated_at        | timestamp

Transfers Table (PostgreSQL — sharded by transfer_id)

Table: transfers
  transfer_id       (PK)  | varchar(20)
  idempotency_key   (UQ)  | varchar(64)
  from_account_id   (FK)  | varchar(20)
  to_account_id     (FK)  | varchar(20)
  amount            | bigint
  currency          | char(3)
  transfer_type     | enum('internal', 'ach', 'swift')
  status            | enum('pending', 'processing', 'completed', 'failed', 'reversed')
  compliance_status | enum('pending', 'cleared', 'flagged', 'blocked')
  reference         | varchar(200)
  created_at        | timestamp
  completed_at      | timestamp

Ledger Table (Append-Only — the source of truth)

Table: ledger_entries
  entry_id          (PK)  | bigint (auto-increment)
  transfer_id       (FK)  | varchar(20)
  account_id        (FK)  | varchar(20)
  entry_type        | enum('debit', 'credit')
  amount            | bigint
  balance_after     | bigint          -- running balance snapshot
  created_at        | timestamp

Idempotency Store (Redis + PostgreSQL)

Table: idempotency_keys
  idempotency_key   (PK)  | varchar(64)
  transfer_id       | varchar(20)
  response_body     | jsonb
  created_at        | timestamp
  expires_at        | timestamp       -- TTL: 24 hours

Why PostgreSQL?

  • ACID transactions are non-negotiable for financial systems
  • Row-level locking for concurrent balance updates
  • Serializable isolation level for critical transfer logic
  • Rich constraint system (CHECK balance >= 0, foreign keys)
  • Proven reliability in banking — this is not the place for eventual consistency

5. High-Level Design (12 min)

Transfer Flow (Internal)

Client
  → API Gateway (auth, rate limiting)
    → Transfer Service
      → 1. Validate idempotency key (Redis lookup)
         If exists → return cached response
      → 2. Create transfer record (status = pending)
      → 3. Compliance check (AML/KYC/sanctions screening)
         If flagged → compliance_status = blocked (the transfer is not executed), notify compliance team
      → 4. Execute transfer (single DB transaction):
         BEGIN TRANSACTION (SERIALIZABLE)
           SELECT balance FROM accounts WHERE account_id = sender FOR UPDATE
           IF balance < amount → ROLLBACK, return insufficient funds
           UPDATE accounts SET balance = balance - amount WHERE account_id = sender
           UPDATE accounts SET balance = balance + amount WHERE account_id = receiver
           INSERT INTO ledger_entries (debit for sender)
           INSERT INTO ledger_entries (credit for receiver)
           UPDATE transfers SET status = 'completed'
         COMMIT
      → 5. Store idempotency key → response mapping
      → 6. Send notifications (async via message queue)

Transfer Flow (Cross-Bank via ACH/SWIFT)

Client
  → API Gateway → Transfer Service
    → 1-3. Same as above (validate, create, compliance)
    → 4. Debit sender's account + create hold
    → 5. Submit to Payment Rail:
         ACH: Batch file submitted to ACH operator (Nacha format)
              → Processed in batch windows (next business day)
         SWIFT: MT103 message to correspondent bank
              → Near-real-time via SWIFT network
    → 6. Await confirmation from external bank
         → Payment Rail Adapter (listens for responses)
           → On success: credit receiver's account, update transfer status
           → On failure: release hold, reverse debit, update status
    → 7. Reconciliation job validates all external transfers daily
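
Both flows move a transfer through the statuses listed in the requirements. One way to keep those moves honest is an explicit transition table. The table below is an assumption (the notes name the statuses but not the allowed transitions), as is the choice to map a failed rail response to "failed".

```python
# Assumed transition table; the statuses come from the transfers table,
# the allowed moves between them are illustrative.
ALLOWED_TRANSITIONS = {
    "pending": {"processing", "failed"},
    "processing": {"completed", "failed"},
    "completed": {"reversed"},
    "failed": set(),
    "reversed": set(),
}


class InvalidTransition(Exception):
    pass


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not in the table."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {new} is not allowed")
    return new


def on_rail_response(current: str, succeeded: bool) -> str:
    """Map a payment-rail confirmation (step 6 above) onto the transfer status."""
    return transition(current, "completed" if succeeded else "failed")


status = transition("pending", "processing")        # debit + hold done, submitted to the rail
status = on_rail_response(status, succeeded=True)   # confirmation arrives
print(status)
```

Rejecting unknown moves means a late or duplicated rail response cannot flip a transfer that is already in a terminal state.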

Components

  1. API Gateway: Authentication, rate limiting, TLS termination. All traffic over mTLS internally.
  2. Transfer Service: Core business logic. Stateless, horizontally scaled. Orchestrates the transfer lifecycle.
  3. Compliance Service: Screens transfers against sanctions lists (OFAC, EU), runs AML rules (large amounts, velocity checks, geographic risk scoring). Calls external providers (e.g., Refinitiv, Dow Jones) for PEP/sanctions screening.
  4. Ledger Database (PostgreSQL): Sharded by account_id. Primary + synchronous replica for zero data loss. Append-only ledger table.
  5. Payment Rail Adapters: Separate services for ACH, SWIFT, Fedwire. Handle protocol-specific formatting and communication.
  6. Notification Service: Sends emails, push notifications, webhooks on transfer completion/failure.
  7. Reconciliation Engine: Batch job that runs daily. Compares internal ledger against bank statements and external rail confirmations.
  8. Idempotency Store (Redis): Fast lookup for duplicate detection. Backed by PostgreSQL for durability.

6. Deep Dives (15 min)

Deep Dive 1: Double-Entry Bookkeeping & ACID Guarantees

Every financial movement must create exactly two ledger entries: a debit from one account and a credit to another. The sum of all debits must always equal the sum of all credits (the core invariant of double-entry bookkeeping).

Why this matters: If we debit Account A but crash before crediting Account B, money “disappears.” Double-entry bookkeeping with ACID transactions prevents this.

Implementation:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Lock both rows in a fixed order (by account_id) so that two transfers
-- running in opposite directions cannot deadlock each other
SELECT account_id, balance FROM accounts
WHERE account_id IN ('sender', 'receiver')
ORDER BY account_id
FOR UPDATE;

-- Check sufficient funds
-- (Application code: if the sender's balance < amount, ROLLBACK)

-- Debit sender
UPDATE accounts SET balance = balance - 10050 WHERE account_id = 'sender';

-- Credit receiver
UPDATE accounts SET balance = balance + 10050 WHERE account_id = 'receiver';

-- Create ledger entries (append-only, never modified).
-- balance_after is read from the row that was just updated.
INSERT INTO ledger_entries (transfer_id, account_id, entry_type, amount, balance_after)
SELECT 'txn_789', account_id, 'debit', 10050, balance FROM accounts WHERE account_id = 'sender';

INSERT INTO ledger_entries (transfer_id, account_id, entry_type, amount, balance_after)
SELECT 'txn_789', account_id, 'credit', 10050, balance FROM accounts WHERE account_id = 'receiver';

UPDATE transfers SET status = 'completed', completed_at = NOW() WHERE transfer_id = 'txn_789';

COMMIT;

Key guarantee: The entire block succeeds or fails atomically. There is no state where money is debited but not credited.

The balance_after field stores a running balance snapshot at each ledger entry. This allows us to reconstruct any account’s balance at any point in time by reading a single row (the account's most recent entry at or before that time), rather than summing all entries.
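
The atomic transfer can be tried end to end with Python's built-in sqlite3 module. This is a sketch only: SQLite stands in for PostgreSQL, `BEGIN IMMEDIATE` stands in for row locks, the schema is trimmed to the columns the example needs, and the function names are hypothetical.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode so that
    # BEGIN / COMMIT / ROLLBACK are issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE accounts (
            account_id TEXT PRIMARY KEY,
            balance    INTEGER NOT NULL CHECK (balance >= 0));
        CREATE TABLE ledger_entries (
            entry_id      INTEGER PRIMARY KEY AUTOINCREMENT,
            transfer_id   TEXT NOT NULL,
            account_id    TEXT NOT NULL,
            entry_type    TEXT NOT NULL CHECK (entry_type IN ('debit', 'credit')),
            amount        INTEGER NOT NULL,
            balance_after INTEGER NOT NULL);
    """)
    return conn


def transfer(conn, transfer_id: str, sender: str, receiver: str, amount: int) -> bool:
    """Move `amount` atomically; return False if the transfer is rejected."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        rows = dict(conn.execute(
            "SELECT account_id, balance FROM accounts WHERE account_id IN (?, ?)",
            (sender, receiver)))
        if (amount <= 0 or sender == receiver or sender not in rows
                or receiver not in rows or rows[sender] < amount):
            conn.execute("ROLLBACK")
            return False
        for account, entry_type, delta in ((sender, "debit", -amount),
                                           (receiver, "credit", amount)):
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                         (delta, account))
            # balance_after is read back from the row that was just updated.
            conn.execute(
                "INSERT INTO ledger_entries "
                "(transfer_id, account_id, entry_type, amount, balance_after) "
                "SELECT ?, account_id, ?, ?, balance FROM accounts WHERE account_id = ?",
                (transfer_id, entry_type, amount, account))
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")  # any error undoes both the debit and the credit
        raise


conn = open_db()
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("sender", 20000), ("receiver", 0)])
print(transfer(conn, "txn_789", "sender", "receiver", 10050))   # True
print(transfer(conn, "txn_790", "sender", "receiver", 10050))   # False: insufficient funds
```

After the two calls the ledger holds exactly one debit and one credit of 10050, and the rejected transfer leaves no trace in either table.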

Cross-shard problem: If sender and receiver are on different database shards, we cannot use a single DB transaction. Solution: Saga pattern (see Deep Dive 2).

Deep Dive 2: Distributed Transactions — The Saga Pattern

When sender and receiver live on different shards (or different banks), a single ACID transaction is impossible. We use the saga pattern — a sequence of local transactions with compensating actions on failure.

Transfer Saga:

Step 1: DEBIT — Debit sender's account (local transaction on Shard A)
  → On success: proceed to Step 2
  → Compensating action: credit sender's account back

Step 2: CREDIT — Credit receiver's account (local transaction on Shard B)
  → On success: mark transfer as completed
  → On failure: execute Step 1's compensating action (refund sender)

Orchestration vs. Choreography:

  • Orchestration (preferred): A central Transfer Saga Coordinator drives each step. It persists the saga state to a durable store. On crash recovery, it resumes from the last persisted step.
  • Choreography: Each service emits events and the next service listens. Harder to debug, harder to reason about failure states.

Implementation with saga state machine:

Table: saga_state
  saga_id         (PK) | varchar(20)
  transfer_id     (FK) | varchar(20)
  current_step    | enum('debit_pending', 'debit_done', 'credit_pending', 'credit_done', 'compensating', 'completed', 'failed')
  retry_count     | int
  last_updated    | timestamp

The coordinator polls for stale sagas (stuck in an intermediate state for > 30 seconds) and either retries or compensates. This handles crash recovery.

Key insight: Between Step 1 (debit) and Step 2 (credit), the money exists in an “in-transit” state. The sender’s balance is reduced, but the receiver hasn’t received it yet. The transfer record shows status = processing. This is analogous to real-world wire transfers where money is “in flight.”
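
The orchestrated saga can be sketched with two in-memory shards. The step names come from the saga_state table; the `Shard` class, the exception types, and the decision to compensate immediately instead of retrying first are simplifying assumptions.

```python
from dataclasses import dataclass, field


class InsufficientFunds(Exception):
    pass


class ShardUnavailable(Exception):
    pass


@dataclass
class Shard:
    balances: dict[str, int]
    available: bool = True

    def apply(self, account_id: str, delta: int) -> None:
        """One local transaction: adjust a single balance on this shard."""
        if not self.available:
            raise ShardUnavailable()
        new_balance = self.balances[account_id] + delta
        if new_balance < 0:
            raise InsufficientFunds()
        self.balances[account_id] = new_balance


@dataclass
class Saga:
    transfer_id: str
    current_step: str = "debit_pending"
    history: list[str] = field(default_factory=list)


def run_transfer_saga(saga: Saga, shard_a: Shard, sender: str,
                      shard_b: Shard, receiver: str, amount: int) -> str:
    def advance(step: str) -> None:
        # In a real coordinator this is a durable write to saga_state.
        saga.current_step = step
        saga.history.append(step)

    try:
        shard_a.apply(sender, -amount)       # Step 1: debit on Shard A
    except (InsufficientFunds, ShardUnavailable):
        advance("failed")                    # nothing to compensate yet
        return saga.current_step
    advance("debit_done")
    advance("credit_pending")
    try:
        shard_b.apply(receiver, amount)      # Step 2: credit on Shard B
    except ShardUnavailable:
        advance("compensating")
        shard_a.apply(sender, amount)        # compensating action: refund the sender
        advance("failed")
        return saga.current_step
    advance("credit_done")
    advance("completed")
    return saga.current_step


a, b = Shard({"alice": 20000}), Shard({"bob": 0}, available=False)
saga = Saga("txn_789")
print(run_transfer_saga(saga, a, "alice", b, "bob", 10050), saga.history)
```

In production the refund itself can fail, so the coordinator has to retry the compensating action until it succeeds; the stale-saga poll described above is what drives those retries.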

Deep Dive 3: Idempotency & Retry Handling

Network failures are inevitable. A client may send a transfer request, the server processes it successfully, but the response is lost. The client retries — and without idempotency, the transfer executes twice.

Idempotency key flow:

1. Client generates UUID: "idem_abc_123"
2. POST /v1/transfers with Idempotency-Key: "idem_abc_123"
3. Server:
   a. Check Redis: EXISTS idem_abc_123
      → If found: return cached response (same status code, same body)
      → If not found: continue processing
   b. Process transfer
   c. Store in Redis: SET idem_abc_123 → {transfer_id, response_body} EX 86400
   d. Also persist to idempotency_keys table (durability)
   e. Return response to client

Race condition: Two identical requests arrive simultaneously.

Solution: Redis SET with NX (set-if-not-exists)
  SET idem_abc_123 "processing" NX EX 60
  → If it returns OK (set successfully): this is the first request, proceed
  → If it returns nil (already exists): this is a duplicate
    → If value = "processing": the first request is still in-flight, return 409 Conflict (retry later)
    → If value = {response}: return cached response

Idempotency key lifecycle:

  • Created when the first request arrives
  • Set to “processing” during execution
  • Updated with the actual response on completion
  • Expires after 24 hours (configurable)

Important: Idempotency keys must be scoped per user. User A and User B can coincidentally generate the same UUID — the actual key in storage is {user_id}:{idempotency_key}.
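
The lifecycle and the per-user scoping can be sketched without a Redis server. The `IdempotencyStore` class below is an in-memory stand-in whose `set_if_absent` method mimics the set-if-not-exists-with-expiry behaviour; the class, the `handle_transfer` function, and the response shapes are all hypothetical.

```python
import threading
import time

PROCESSING = "processing"


class IdempotencyStore:
    """In-memory stand-in for an atomic set-if-not-exists with a TTL."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()
        self._entries = {}  # key -> (value, expires_at)
        self._clock = clock

    def set_if_absent(self, key, value, ttl_seconds) -> bool:
        with self._lock:  # the check and the write happen as one atomic step
            entry = self._entries.get(key)
            if entry is not None and entry[1] > self._clock():
                return False
            self._entries[key] = (value, self._clock() + ttl_seconds)
            return True

    def set(self, key, value, ttl_seconds) -> None:
        with self._lock:
            self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            return entry[0] if entry is not None and entry[1] > self._clock() else None


def scoped_key(user_id: str, idempotency_key: str) -> str:
    return f"{user_id}:{idempotency_key}"


def handle_transfer(store: IdempotencyStore, user_id: str, idempotency_key: str, execute):
    """Run `execute` at most once per (user, key); replay the stored response otherwise."""
    key = scoped_key(user_id, idempotency_key)
    if store.set_if_absent(key, PROCESSING, ttl_seconds=60):
        response = execute()                           # first request: do the work
        store.set(key, response, ttl_seconds=86_400)   # keep the response for 24 hours
        return 201, response
    cached = store.get(key)
    if cached is None or cached == PROCESSING:
        return 409, {"error": "request in progress, retry later"}
    return 201, cached                                 # duplicate: same status, same body


calls = []
def execute():
    calls.append(1)
    return {"transfer_id": f"txn_{len(calls)}", "status": "pending"}

store = IdempotencyStore()
first = handle_transfer(store, "user_a", "idem_abc_123", execute)
retry = handle_transfer(store, "user_a", "idem_abc_123", execute)
other = handle_transfer(store, "user_b", "idem_abc_123", execute)
print(first == retry, len(calls))
```

If `execute` raises, the key stays at "processing" until the 60-second TTL lapses, after which a retry is allowed to run again.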

Deep Dive 4: Reconciliation & Fraud Detection

Reconciliation runs as a daily batch job:

1. Internal reconciliation:
   Sum of all debit ledger entries = Sum of all credit ledger entries (must be exact)
   For each account: balance = sum of credit entries - sum of debit entries for that account

2. External reconciliation (for cross-bank transfers):
   Compare internal transfer records against:
   - ACH return files (received next business day)
   - SWIFT confirmation messages (MT199/MT299)
   - Bank statement feeds (MT940/MT950)
   Flag any mismatches for manual review.

3. Generate reconciliation report:
   - Total volume, count, and net flow per currency
   - Unmatched transactions (internal vs. external)
   - Transactions stuck in intermediate states for > 24 hours
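
The internal reconciliation step reduces to two checks over the ledger. The sketch below assumes every account starts at zero and that all funding appears in the ledger as credit entries; the function name and the report shape are illustrative.

```python
from collections import defaultdict


def reconcile(ledger_entries: list[dict], account_balances: dict[str, int]) -> dict:
    """Check the global debit/credit invariant and each account's stored balance."""
    total_debits = sum(e["amount"] for e in ledger_entries if e["entry_type"] == "debit")
    total_credits = sum(e["amount"] for e in ledger_entries if e["entry_type"] == "credit")

    # Net position per account: credits add, debits subtract.
    net = defaultdict(int)
    for entry in ledger_entries:
        sign = 1 if entry["entry_type"] == "credit" else -1
        net[entry["account_id"]] += sign * entry["amount"]

    mismatched = sorted(
        account for account in set(net) | set(account_balances)
        if net.get(account, 0) != account_balances.get(account, 0))
    return {"balanced": total_debits == total_credits,
            "total_debits": total_debits,
            "total_credits": total_credits,
            "mismatched_accounts": mismatched}


ledger = [
    {"account_id": "treasury", "entry_type": "debit", "amount": 20000},
    {"account_id": "alice", "entry_type": "credit", "amount": 20000},
    {"account_id": "alice", "entry_type": "debit", "amount": 10050},
    {"account_id": "bob", "entry_type": "credit", "amount": 10050},
]
balances = {"treasury": -20000, "alice": 9950, "bob": 10000}  # bob's row is off by 50
print(reconcile(ledger, balances))
```

Here the ledger itself balances, but the stored balance for bob disagrees with his entries, so he is reported for manual review.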

Fraud detection pipeline:

Every transfer passes through:
1. Rules engine (real-time):
   - Amount > $10,000 → flag for enhanced review (in the US, a CTR filing is required when the transaction is in cash)
   - Sender in high-risk geography → additional screening
   - Velocity check: > 5 transfers in 1 hour from same account → flag
   - Structuring detection: repeated transfers just under the $10,000 threshold (e.g., exactly $9,999)

2. ML model (near-real-time):
   - Features: amount, time of day, sender/receiver history, geographic distance,
     device fingerprint, typical transaction pattern
   - Model: gradient-boosted trees (fast inference < 10ms)
   - Output: fraud probability score 0-1
   - If score > 0.8: block and alert
   - If score 0.5-0.8: hold for manual review
   - If score < 0.5: allow

3. Network analysis (batch):
   - Build transaction graph
   - Detect money laundering rings (circular transfers)
   - Identify mule accounts (receive from many, send to few)
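
The real-time rules and the score thresholds can be sketched as two small functions. The $10,000 and 5-per-hour limits and the 0.5 / 0.8 cut-offs come from the pipeline above; the $1,000 "just under" band, the minimum of two hits, the flag names, and the handling of scores exactly at 0.5 and 0.8 are assumptions.

```python
CTR_THRESHOLD_CENTS = 10_000 * 100    # $10,000, from the rules above
VELOCITY_LIMIT = 5                    # transfers per hour, from the rules above
STRUCTURING_BAND_CENTS = 1_000 * 100  # assumption: "just under" = within $1,000 of the threshold
STRUCTURING_MIN_HITS = 2              # assumption: two or more such transfers count as a pattern


def rule_flags(amount_cents: int, transfers_last_hour: int,
               high_risk_geography: bool, recent_amounts_cents: list[int]) -> list[str]:
    """Return the names of the real-time rules that this transfer trips."""
    flags = []
    if amount_cents > CTR_THRESHOLD_CENTS:
        flags.append("large_amount")
    if high_risk_geography:
        flags.append("high_risk_geography")
    if transfers_last_hour > VELOCITY_LIMIT:
        flags.append("velocity")
    lower = CTR_THRESHOLD_CENTS - STRUCTURING_BAND_CENTS
    near_threshold = [a for a in [*recent_amounts_cents, amount_cents]
                      if lower <= a <= CTR_THRESHOLD_CENTS]
    if len(near_threshold) >= STRUCTURING_MIN_HITS:
        flags.append("possible_structuring")
    return flags


def route_by_score(score: float) -> str:
    """Map the model's fraud probability to an action."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score > 0.8:
        return "block"
    if score >= 0.5:
        return "review"
    return "allow"


print(rule_flags(999_900, 1, False, [999_900]))
print(route_by_score(0.65))
```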

7. Extensions (2 min)

  • Multi-currency support: Store balances per currency per account. FX conversion at transfer time using a rate service with locked-in quotes (valid for 30 seconds). Track FX gain/loss in separate ledger entries.
  • Scheduled and recurring transfers: A scheduler service creates transfer requests at specified times. Uses cron-like scheduling with idempotent execution (same scheduled transfer on retry produces no duplicate).
  • Real-time notifications via webhooks: Partners register webhook URLs. On transfer status change, enqueue a webhook delivery job. Implement retry with exponential backoff (up to 72 hours). Include HMAC signature for verification.
  • Regulatory reporting automation: Auto-generate Suspicious Activity Reports (SARs), Currency Transaction Reports (CTRs), and SWIFT compliance reports. Feed data to compliance dashboard with audit trail.
  • Rate-based transfer limits: Per-account daily/weekly/monthly transfer limits. Configurable by account tier. Separate limits for internal vs. external transfers. Soft limits (warning) vs. hard limits (block).
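
For the webhook extension, signing and the retry schedule can be sketched with the standard library. The choice of HMAC-SHA256, the canonical JSON encoding, and the 60-second base delay are assumptions; only the HMAC signature, the exponential backoff, and the 72-hour cap come from the bullet above.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, payload: dict) -> tuple[bytes, str]:
    """Serialize the payload deterministically and return (body, hex signature)."""
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode()
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return body, signature


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


def retry_delays(base_seconds: int = 60, max_total_seconds: int = 72 * 3600) -> list[int]:
    """Doubling delays whose sum stays within the overall retry budget."""
    delays, total, delay = [], 0, base_seconds
    while total + delay <= max_total_seconds:
        delays.append(delay)
        total += delay
        delay *= 2
    return delays


secret = b"partner-shared-secret"  # hypothetical per-partner secret
body, signature = sign_payload(secret, {"transfer_id": "txn_xyz_789", "status": "completed"})
print(verify_signature(secret, body, signature))
print(len(retry_delays()), "attempts over", sum(retry_delays()) / 3600, "hours")
```

The receiver must verify the signature against the raw request body as received, since re-serializing the JSON on its side could change the bytes.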