15 May 2024 · 14 min read
E-Commerce Scalability: Handling 10x Traffic Spikes
E-Commerce · Scalability · Retail · Architecture
Lessons from building retail platforms that handle holiday traffic surges. Caching strategies, database optimization, and capacity planning.
Retail systems face unique scalability challenges—traffic can spike 10x or more during sales events, holidays, and flash promotions. At Interflora, Valentine's Day and Mother's Day meant preparing for traffic surges that dwarfed our baseline. Here's how we achieved 99.95% uptime during peak shopping periods.
Understanding Retail Traffic Patterns
The Reality of Spikes
| Event | Traffic Multiplier | Duration |
|---|---|---|
| Flash sale announcement | 5-10x | 30-60 minutes |
| Holiday (Valentine's, Mother's Day) | 8-15x | 2-3 days |
| Black Friday/Cyber Monday | 10-20x | 4-5 days |
| TV advertisement | 3-5x | 15-30 minutes |
The Cascade Effect
When one component slows, everything suffers:
Normal: User → CDN → App → DB → Response (200ms)
Under load:
User → CDN → App (waiting) → DB (saturated) → Timeout
↓
Connection pool exhausted
↓
New requests queued
↓
Cascade failure
Capacity Planning
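One practical defense against this cascade is failing fast rather than queueing: bound how long any request may wait on a saturated downstream resource. A minimal, driver-agnostic sketch (the `withTimeout` helper and the 250 ms deadline are illustrative, not from our stack):

```typescript
// Reject a slow acquire instead of queueing indefinitely, so a
// saturated database produces a quick error rather than a pile-up.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// Usage sketch: wrap a pool acquire so callers fail fast under load.
// const client = await withTimeout(pool.connect(), 250);
```

Wrapping the acquire this way converts a pile-up into a quick, retryable error that upstream layers can handle instead of an exhausted connection pool.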
Baseline Measurement
Before you can plan for 10x, you need to know your 1x:
Key baseline metrics:
- Average requests per second (RPS)
- Peak RPS (daily, weekly patterns)
- Database queries per request
- Cache hit ratio
- Average response time by endpoint
- Error rate baseline
Capacity Model
Peak Planning Formula:
Required capacity = Baseline peak × Expected multiplier × Safety margin
Example:
- Normal peak: 500 RPS
- Black Friday multiplier: 15x
- Safety margin: 1.5x
- Required capacity: 500 × 15 × 1.5 = 11,250 RPS
Load Testing Strategy
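The stage targets used in load tests come straight from this formula; a tiny helper (a sketch, with illustrative names) keeps the arithmetic consistent across capacity reviews:

```typescript
// Capacity planning helper (sketch): baseline peak RPS times the
// expected event multiplier, padded by a safety margin.
function requiredCapacity(
  baselinePeakRps: number,
  eventMultiplier: number,
  safetyMargin = 1.5
): number {
  return Math.ceil(baselinePeakRps * eventMultiplier * safetyMargin);
}

// Black Friday example from above: 500 × 15 × 1.5
const target = requiredCapacity(500, 15); // 11250 RPS
```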
// k6 load test script example (run with: k6 run loadtest.js)
import http from 'k6/http';

export const options = {
  stages: [
    { duration: '2m', target: 100 },   // Warm up
    { duration: '5m', target: 500 },   // Normal load
    { duration: '2m', target: 2500 },  // Ramp to 5x
    { duration: '5m', target: 2500 },  // Hold at 5x
    { duration: '2m', target: 5000 },  // Ramp to 10x
    { duration: '10m', target: 5000 }, // Hold at 10x
    { duration: '2m', target: 7500 },  // Push to 15x
    { duration: '5m', target: 7500 },  // Breaking point test
  ],
};

export default function () {
  http.get('https://shop.example.com/');
}
Caching Architecture
Multi-Layer Caching
Layer 1: CDN (Cloudflare/CloudFront)
├── Static assets (images, CSS, JS)
├── Product images
└── API responses (with proper cache headers)
Layer 2: Application Cache (Redis)
├── Session data
├── User cart state
├── Product catalog
└── Inventory counts (with short TTL)
Layer 3: Database Query Cache
├── Prepared statement cache
└── Query result cache
Cache-First Architecture
async function getProduct(productId: string): Promise<Product | null> {
  // Layer 1: Memory cache (hot items)
  const memCached = memoryCache.get(productId);
  if (memCached) return memCached;

  // Layer 2: Redis
  const redisCached = await redis.get(`product:${productId}`);
  if (redisCached) {
    const product = JSON.parse(redisCached);
    memoryCache.set(productId, product, 60); // 60 second local cache
    return product;
  }

  // Layer 3: Database (with cache population)
  const product = await db.products.findById(productId);
  if (product) {
    await redis.setex(`product:${productId}`, 300, JSON.stringify(product));
    memoryCache.set(productId, product, 60);
  }
  return product;
}
Cache Invalidation for E-Commerce
// Inventory updates need careful invalidation
async function updateInventory(productId: string, delta: number): Promise<void> {
  // Look up the product first; we need its category to purge listing caches
  const product = await db.products.findById(productId);

  // Update database
  await db.inventory.decrement(productId, delta);

  // Invalidate product cache
  await redis.del(`product:${productId}`);

  // Publish event for CDN purge
  await events.publish('inventory-change', {
    productId,
    requiresCdnPurge: true
  });

  // For flash sales: invalidate listing caches
  await redis.del('featured-products');
  await redis.del(`category:${product.categoryId}:products`);
}
Database Optimization
Read Replica Strategy
// Route reads to replicas, writes to primary
const readPool = new Pool({
  host: 'replica.db.example.com',
  max: 100,
  idleTimeoutMillis: 30000
});

const writePool = new Pool({
  host: 'primary.db.example.com',
  max: 20,
  idleTimeoutMillis: 30000
});

async function getProducts(categoryId: string): Promise<Product[]> {
  // Read from replica; pg returns a result object, so unwrap .rows
  const { rows } = await readPool.query(
    'SELECT * FROM products WHERE category_id = $1',
    [categoryId]
  );
  return rows;
}

async function createOrder(order: Order): Promise<Order> {
  // Write to primary; RETURNING * gives us the inserted row
  const { rows } = await writePool.query(
    'INSERT INTO orders (user_id, items, total) VALUES ($1, $2, $3) RETURNING *',
    [order.userId, order.items, order.total]
  );
  return rows[0];
}
Connection Pool Tuning
Connection pool sizing:
- Too small: Requests wait for connections
- Too large: Database overwhelmed
Formula:
connections = (core_count * 2) + effective_spindle_count
For cloud databases:
- Start with 20 connections per application instance
- Monitor wait time and adjust
- Consider PgBouncer for connection pooling at scale
Query Optimization for Spikes
-- BEFORE: Full table scan during traffic spike
SELECT * FROM products
WHERE category_id = $1
ORDER BY created_at DESC
LIMIT 20;
-- AFTER: Indexed query with covering index
CREATE INDEX idx_products_category_created
ON products (category_id, created_at DESC)
INCLUDE (name, price, image_url);
-- Result: Query time 200ms → 2ms
Auto-Scaling Configuration
Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ecommerce-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-api
  minReplicas: 10
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Pods
      pods:
        metric:
          name: requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
Pre-Scaling for Known Events
# Scale up before Valentine's Day traffic
kubectl scale deployment ecommerce-api --replicas=50
# Or use scheduled scaling (Kubernetes has no built-in cron HPA; this
# needs a third-party controller such as kubernetes-cronhpa-controller)
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
spec:
  scaleTargetRef:
    name: ecommerce-api
  jobs:
    - name: valentines-prescale
      schedule: "0 0 6 14 2 *" # 6 AM on Feb 14 (seconds-first cron)
      targetSize: 100
Graceful Degradation
Feature Flags for Load Shedding
const loadSheddingConfig = {
  level0: { // Normal
    recommendations: true,
    reviews: true,
    relatedProducts: true,
    searchSuggestions: true
  },
  level1: { // High load
    recommendations: true,
    reviews: true,
    relatedProducts: false, // Disable
    searchSuggestions: true
  },
  level2: { // Very high load
    recommendations: false, // Disable
    reviews: false, // Disable
    relatedProducts: false,
    searchSuggestions: false
  },
  level3: { // Critical
    // Essential checkout flow only
    recommendations: false,
    reviews: false,
    relatedProducts: false,
    searchSuggestions: false,
    guestCheckout: true, // Force guest checkout
    paymentMethods: ['card'] // Reduce payment options
  }
};
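Whatever logic maps metrics to a level, clamp the result before indexing into the config so an unexpected value can never select an undefined level. A small generic helper (a sketch; `featuresForLevel` and `configs` are illustrative names, not part of the original stack):

```typescript
// Clamp a computed load level into the valid range before selecting
// a shedding config, so out-of-range values degrade safely.
function featuresForLevel<T>(configs: T[], level: number): T {
  const clamped = Math.min(Math.max(Math.trunc(level), 0), configs.length - 1);
  return configs[clamped];
}

// Toy two-level config: level 0 keeps reviews, level 1 sheds them.
const toyLevels = [{ reviews: true }, { reviews: false }];
const active = featuresForLevel(toyLevels, 5); // clamps to the last level
```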
async function getLoadLevel(): Promise<number> {
  const metrics = await getSystemMetrics();
  if (metrics.errorRate > 5 || metrics.p99Latency > 5000) return 3;
  if (metrics.errorRate > 2 || metrics.p99Latency > 2000) return 2;
  if (metrics.cpuUsage > 80 || metrics.p99Latency > 1000) return 1;
  return 0;
}
Queue-Based Checkout
During extreme load, queue checkout requests:
async function initiateCheckout(cart: Cart): Promise<CheckoutResponse> {
  if (await isSystemOverloaded()) {
    // Queue the checkout
    const ticketId = await checkoutQueue.add(cart);
    return {
      status: 'queued',
      ticketId,
      estimatedWaitSeconds: await checkoutQueue.getEstimatedWait(),
      message: 'High demand! Your order is queued and will be processed shortly.'
    };
  }
  // Normal checkout flow
  return processCheckout(cart);
}
Static Fallback Pages
// Serve cached product pages when database is overwhelmed
app.get('/products/:id', async (req, res, next) => {
  try {
    const product = await getProduct(req.params.id);
    res.json(product);
  } catch (error) {
    if (error.code === 'ECONNREFUSED' || error.code === 'ETIMEDOUT') {
      // Serve static fallback
      const fallback = await cdn.get(`/static/products/${req.params.id}.json`);
      if (fallback) {
        res.set('X-Served-From', 'fallback');
        return res.json(fallback);
      }
    }
    next(error);
  }
});
Monitoring During Spikes
Real-Time Dashboard Metrics
Critical metrics during peak:
├── Requests per second (by endpoint)
├── Error rate (4xx, 5xx)
├── P50, P95, P99 latency
├── Database connections (active, waiting)
├── Redis memory and hit rate
├── Pod count and CPU utilization
└── Cart and checkout conversion rate
Automated Alerting
# Prometheus alerting rules
groups:
  - name: ecommerce-peak
    rules:
      - alert: HighErrorRate
        # Ratio of 5xx to total requests, so the threshold really is 1%
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[1m]))
            / sum(rate(http_requests_total[1m])) > 0.01
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Error rate above 1%
      - alert: CheckoutLatency
        expr: histogram_quantile(0.99, rate(checkout_duration_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Checkout P99 latency above 10 seconds
Key Takeaways
- Know your baseline: You can't plan for 10x if you don't know 1x
- Cache aggressively: Multi-layer caching dramatically reduces database load
- Read replicas scale reads: Most e-commerce traffic is read-heavy
- Pre-scale for known events: Auto-scaling alone isn't fast enough for flash sales
- Plan graceful degradation: Know which features to disable and in what order
- Queue, don't reject: A queued checkout is better than a failed one
- Monitor in real-time: Have dashboards ready and teams on standby during peaks
Retail scalability isn't about handling average load—it's about surviving the moments that make or break your year. Prepare for the spike, not the baseline.