Flash Sale / Ticketmaster

High-concurrency inventory systems for managing traffic spikes.

Last modified on March 12, 2026

Problem Statement & Constraints

Design a ticketing or flash sale system capable of handling millions of users simultaneously trying to purchase a limited set of items (e.g., concert tickets). The system must prevent over-selling, ensure fair access (e.g., virtual waiting rooms), and maintain stable performance during extreme traffic bursts.

Functional Requirements

Browse inventory and product details.
Reserve items with time-limited holds.
Complete purchases with payment processing.
Manage a virtual waiting room for fair queue management.

Non-Functional Requirements

Scale: Handle 1M concurrent users; limited inventory (e.g., 100k tickets).
Availability: 99.99% uptime for sale event duration.
Consistency: Linearizable consistency for inventory counts; no double-selling.
Latency: Inventory check and reservation < 500ms under peak load.
Workload Profile:
- Read:Write ratio: ~80:20
- Peak throughput: 1M requests/sec
- Retention: 30 days post-event
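As a rough back-of-envelope check, the workload profile can be split into read and write rates. The sketch below only restates the figures listed above; the one-ticket-per-user assumption behind the success ratio is an illustration, not a stated requirement.

```python
# Back-of-envelope split of the stated peak load (figures from the workload profile).
PEAK_RPS = 1_000_000               # peak requests per second
READ_SHARE, WRITE_SHARE = 80, 20   # ~80:20 read:write ratio
CONCURRENT_USERS = 1_000_000
INVENTORY = 100_000                # example ticket count

reads_per_sec = PEAK_RPS * READ_SHARE // (READ_SHARE + WRITE_SHARE)
writes_per_sec = PEAK_RPS * WRITE_SHARE // (READ_SHARE + WRITE_SHARE)

# Assumption: each user buys at most one ticket, so this is an upper bound
# on the share of concurrent users who can succeed.
max_success_ratio = INVENTORY / CONCURRENT_USERS

print(reads_per_sec, writes_per_sec, max_success_ratio)
```

Under these numbers most requests are bound to end in a "sold out" response, which is why cheap rejection paths matter as much as the purchase path.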

High-Level Architecture

graph TD
    Users --> Edge
    Edge --> WaitRoom[Wait Room]
    WaitRoom --> GW[Gateway]
    GW --> Reserve
    Reserve --> Redis[(Redis)]
    Reserve --> Orders[(Orders)]
    GW --> Payment
    Payment --> Orders

The CDN/Edge layer absorbs traffic bursts by placing users in a Virtual Waiting Room. Admitted users pass through an API Gateway to a Reservation service, which atomically claims inventory in Redis and writes a temporary hold to the sharded Orders DB. At checkout, the Payment service converts holds into confirmed orders.

Data Design

Redis provides a high-speed volatile cache for atomic inventory counters and short-lived idempotency keys. The SQL Orders database provides durable transactional truth, utilizing optimistic concurrency control to handle write bursts safely.

Inventory Key-Space (Redis)

Key Pattern	Value Type	Description	TTL
`inv:<sku_id>`	Integer	Atomic counter for available items.	Event duration
`hold:<sku_id>:<user_id>`	String	Reservation lock / owner ID.	10 minutes
`idemp:<key>`	String	Idempotency key for deduplication.	15 minutes
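The `idemp:<key>` entry can be illustrated with a minimal sketch. It uses an in-memory dictionary as a stand-in for a Redis write of the form `SET idemp:<key> <value> NX EX <ttl>`; the class `IdempotencyStore`, the helper `charge_once`, and the injected clock are hypothetical names chosen for this example, not part of the design above.

```python
import time


class IdempotencyStore:
    """In-memory stand-in for Redis `SET idemp:<key> <value> NX EX <ttl>`."""

    def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, stored_result)

    def claim(self, key):
        """Return True only for the first live request seen for `key`."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return False  # duplicate inside the TTL window
        self._entries[key] = (now + self.ttl_seconds, None)
        return True

    def save_result(self, key, result):
        expires_at, _ = self._entries[key]
        self._entries[key] = (expires_at, result)

    def get_result(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self.clock():
            return None
        return entry[1]


def charge_once(store, idempotency_key, charge_fn):
    """Run `charge_fn` at most once per key; replay the stored result on retries.

    A retry that arrives while the first call is still in flight gets None
    here; a real service would need to decide how to answer that case.
    """
    if store.claim(idempotency_key):
        result = charge_fn()
        store.save_result(idempotency_key, result)
        return result
    return store.get_result(idempotency_key)
```

A client would generate the key once per purchase attempt and resend the same key on every network retry, so only the first attempt reaches the payment provider while the key is live.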

Order Schema (SQL)

Table	Column	Type	Description
orders	`id`	UUID (PK)	Unique order identifier.
	`user_id`	UUID (FK)	Buyer identifier.
	`sku_id`	String	Purchased item ID.
	`status`	Enum	`pending`, `confirmed`, `expired`.
	`version`	Integer	For optimistic concurrency control.
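The `version` column supports a compare-and-set style update. The following is a minimal sketch using SQLite in place of the sharded SQL store; the function name `confirm_order` and the sample row are assumptions made for illustration.

```python
import sqlite3


def confirm_order(conn, order_id, expected_version):
    """Move a pending order to confirmed only if its version is unchanged."""
    cur = conn.execute(
        "UPDATE orders SET status = 'confirmed', version = version + 1 "
        "WHERE id = ? AND status = 'pending' AND version = ?",
        (order_id, expected_version),
    )
    conn.commit()
    # rowcount is 1 when this writer won, 0 when another writer got there first.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id TEXT PRIMARY KEY, user_id TEXT, sku_id TEXT, status TEXT, version INTEGER)"
)
conn.execute("INSERT INTO orders VALUES ('o-1', 'u-1', 'sku-1', 'pending', 1)")
conn.commit()

print(confirm_order(conn, "o-1", expected_version=1))  # first writer succeeds
print(confirm_order(conn, "o-1", expected_version=1))  # stale version is rejected
```

No row lock is held between reading the order and updating it; a writer that loses the race sees zero affected rows and can re-read the order before deciding whether to retry.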

Deep Dive & Trade-offs

The reservation path relies on a Redis Lua script that checks and decrements the inventory counter in a single atomic step:

import redis

# Connect to a single Redis node (a clustered deployment would use redis.cluster.RedisCluster instead)
r = redis.Redis(host='localhost', port=6379, db=0)

# Lua script to check inventory and deduct atomically
# ARGV[1] = requested amount
LUA_RESERVE_SCRIPT = """
local inventory_key = KEYS[1]
local req_amount = tonumber(ARGV[1])

local current = tonumber(redis.call('GET', inventory_key) or '0')

if current >= req_amount then
    redis.call('DECRBY', inventory_key, req_amount)
    return 1 -- Success
else
    return 0 -- Failed: Not enough inventory
end
"""

# Register script to avoid parsing it on every call
reserve_inventory = r.register_script(LUA_RESERVE_SCRIPT)

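def try_reserve(sku_id, amount_needed):
    # The script runs atomically in Redis, so no other client can interleave between GET and DECRBY
    success = reserve_inventory(keys=[f"inv:{sku_id}"], args=[amount_needed])

    if success != 1:
        raise Exception("Sold out or not enough inventory")

    try:
        # Proceed with reservation hold logic (Phase 1).
        # create_hold is not defined in this snippet; it is expected to record the time-limited hold.
        return create_hold(sku_id, amount_needed)
    except Exception:
        # Return the claimed units if the hold could not be recorded
        r.incrby(f"inv:{sku_id}", amount_needed)
        raise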

Deep Dive

Virtual waiting room: Token-bucket batch admission smooths the thundering herd, redirecting excess traffic to static polling pages.
Atomic inventory: Redis Lua scripts decrement a counter only when the remaining stock covers the requested amount, so the counter never drops below zero; this achieves high concurrency without SQL row-level locks.
Two-phase purchase: Phase 1 reserves inventory with a TTL; Phase 2 confirms on payment. A background reaper recycles expired holds.
Edge load shedding: Edge IP rate limits and API gateway backend-concurrency shedding (503 Retry-After) protect the core.
Sharded Order DB: Horizontal sharding by order_id combined with optimistic concurrency control distributes the massive write-burst.
Idempotent requests: Short-lived Redis deduplication of client-generated keys prevents double-charges from network retries.
Bot mitigation: Waiting-room CAPTCHAs, device fingerprinting, and behavioral analysis block automated scalpers.
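The token-bucket batch admission described for the virtual waiting room can be sketched as follows. This is a single-process illustration under stated assumptions: the class `WaitingRoom`, its parameters, and the caller-supplied `now` timestamp are hypothetical, and a production waiting room would keep the queue in shared storage.

```python
from collections import deque


class WaitingRoom:
    """FIFO queue drained by a token bucket; one token admits one user."""

    def __init__(self, refill_per_sec, burst):
        self.refill_per_sec = refill_per_sec  # sustained admissions per second
        self.burst = burst                    # largest batch admitted at once
        self.tokens = float(burst)
        self.queue = deque()
        self.last_refill = 0.0

    def join(self, user_id):
        """Enqueue a user and return their 1-based position."""
        self.queue.append(user_id)
        return len(self.queue)

    def admit_batch(self, now):
        """Refill tokens for the elapsed time, then admit users in arrival order."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.burst, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now

        admitted = []
        while self.queue and self.tokens >= 1:
            admitted.append(self.queue.popleft())
            self.tokens -= 1
        return admitted
```

Users who are not admitted stay queued and keep polling, so the backend sees at most `burst` new sessions per call and roughly `refill_per_sec` per second over time.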

Trade-offs

Redis vs. DB Locks: Redis is faster for hot counters but riskier on state loss; DB locks are durable but create massive contention bottlenecks under peak load.
Waiting Room vs. Rate Limiting: Waiting rooms provide a fair experience but add latency; pure rate limiting is simpler but results in random, frustrating rejections.
Reservation TTL Length: Short TTLs recycle inventory faster but risk expiring before a user finishes checkout; long TTLs are user-friendly but can keep inventory “hostage” if users abandon.
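The effect of the reservation TTL is easier to see in a small model of the two-phase purchase. The sketch below keeps everything in memory instead of Redis and the Orders DB; `HoldManager`, its method names, and the explicit `now` argument are assumptions made for this example.

```python
class HoldManager:
    """In-memory model of Phase 1 holds, Phase 2 confirmation, and the reaper."""

    def __init__(self, stock, hold_ttl_seconds=600):
        self.available = dict(stock)  # sku_id -> units (stand-in for inv:<sku_id>)
        self.holds = {}               # (sku_id, user_id) -> (quantity, expires_at)
        self.hold_ttl_seconds = hold_ttl_seconds

    def reserve(self, sku_id, user_id, quantity, now):
        """Phase 1: claim units and start the TTL clock."""
        key = (sku_id, user_id)
        if key in self.holds or self.available.get(sku_id, 0) < quantity:
            return False
        self.available[sku_id] -= quantity
        self.holds[key] = (quantity, now + self.hold_ttl_seconds)
        return True

    def confirm(self, sku_id, user_id, now):
        """Phase 2: payment succeeded, so the hold becomes a sale."""
        hold = self.holds.get((sku_id, user_id))
        if hold is None or hold[1] <= now:
            return False  # no hold, or it expired before payment finished
        del self.holds[(sku_id, user_id)]
        return True

    def reap_expired(self, now):
        """Background reaper: return units from expired holds to inventory."""
        released = 0
        for key, (quantity, expires_at) in list(self.holds.items()):
            if expires_at <= now:
                del self.holds[key]
                self.available[key[0]] += quantity
                released += quantity
        return released
```

With the 10-minute TTL from the key-space table, an abandoned hold blocks its units for up to ten minutes plus the reaper interval; shortening `hold_ttl_seconds` shrinks that window at the cost of more failed confirmations.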

Operational Excellence

SLIs / SLOs

SLO: 99.9% of admitted users can complete a reservation within 500 ms.
SLO: 0% over-sell rate (linearizable inventory accuracy).
SLIs: reservation_latency_p99, inventory_accuracy (Redis vs. Order DB reconciliation), waiting_room_admission_rate, payment_success_rate, abandoned_reservation_rate.
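One possible way to compute the `inventory_accuracy` reconciliation is to compare the units Redis believes are claimed with the units the Orders DB can account for. The formula and the function name `inventory_drift` below are assumptions for illustration; the source only states that Redis and the Orders DB are reconciled.

```python
def inventory_drift(initial_stock, redis_available, active_holds, confirmed_orders):
    """Units claimed according to Redis minus units the Orders DB accounts for.

    0        -> both stores agree
    positive -> units were deducted in Redis with no matching hold or order (leaked stock)
    negative -> the Orders DB records more units than Redis deducted (over-sell risk)
    """
    claimed_in_redis = initial_stock - redis_available
    accounted_in_db = active_holds + confirmed_orders
    return claimed_in_redis - accounted_in_db


print(inventory_drift(100_000, 40_000, 5_000, 55_000))
print(inventory_drift(100_000, 41_000, 5_000, 55_000))
```

A periodic job could emit this value as a gauge and alert on any non-zero reading, with negative values treated as the more urgent case because they threaten the 0% over-sell SLO.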

Reliability & Resiliency

Load: Test at 2x peak (2M users) in staging before each event.
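Chaos: Kill the Redis primary and verify that Sentinel failover completes; because Redis replication is asynchronous, also verify that reconciliation against the Orders DB repairs any counter writes lost during failover.
Reaper: Run end-to-end tests of the TTL reaper to confirm that expired holds return to inventory.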
