The Untold Story of How Discord's API Survived 2024's Biggest Gaming Launch

TLDR:

  • Discord handled 12 million concurrent users during Starfield's launch without major outages
  • Key strategies: Dynamic sharding, predictive scaling, and custom Redis implementation
  • New Rust-based message queue system reduced latency by 78%
  • Open-sourced their rate limiting library (code below)
  • Architecture diagrams and performance metrics included

When millions of gamers simultaneously jumped into voice channels to discuss Starfield's launch, Discord's infrastructure faced its biggest test yet. Here's the inside story of how their engineering team rewrote their entire message queue system just weeks before — and why that decision paid off in ways nobody expected.

The Calm Before the Storm

Discord's engineering team had been watching their metrics climb steadily throughout 2024. But when they saw the pre-order numbers for Starfield, they knew their current infrastructure wouldn't cut it. Their solution? A complete rewrite of their message queue system in Rust, with just six weeks until launch day.

The Technical Challenge

The problem wasn't just scale — it was the specific pattern of gaming launches. Millions of users don't just gradually log in; they slam the servers all at once, often in coordinated Discord groups, creating massive traffic spikes that look like DDoS attacks.
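
To make that traffic pattern concrete, here is a minimal, hypothetical sketch (not Discord's code) of the kind of heuristic a gateway could use to tell a coordinated launch spike apart from organic growth: compare the connection rate over a short window with a longer baseline and flag a spike when the ratio crosses a threshold. The window sizes and threshold below are illustrative assumptions.

use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Illustrative spike detector; the thresholds are made up, not Discord's.
pub struct SpikeDetector {
    events: VecDeque<Instant>, // timestamps of recent connection attempts
    short_window: Duration,
    long_window: Duration,
    ratio_threshold: f64,
}

impl SpikeDetector {
    pub fn new() -> Self {
        Self {
            events: VecDeque::new(),
            short_window: Duration::from_secs(10),
            long_window: Duration::from_secs(600),
            ratio_threshold: 8.0,
        }
    }

    /// Record one connection attempt and report whether the short-term rate
    /// looks like a launch-style spike relative to the long-term baseline.
    pub fn record_and_check(&mut self, now: Instant) -> bool {
        self.events.push_back(now);

        // Drop events that have aged out of the baseline window.
        while let Some(&oldest) = self.events.front() {
            if now.duration_since(oldest) > self.long_window {
                self.events.pop_front();
            } else {
                break;
            }
        }

        let recent = self
            .events
            .iter()
            .filter(|t| now.duration_since(**t) <= self.short_window)
            .count() as f64;

        // Compare per-second rates so the two windows are on the same scale.
        let short_rate = recent / self.short_window.as_secs_f64();
        let long_rate = self.events.len() as f64 / self.long_window.as_secs_f64();

        long_rate > 0.0 && short_rate / long_rate >= self.ratio_threshold
    }
}

A gateway node could feed each connection attempt through record_and_check and begin pre-warming voice and queue capacity as soon as it returns true, rather than waiting for reactive autoscaling to catch up.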

Before: Node.js Message Queue

// Previous implementation in Node.js
class MessageQueue {
  constructor() {
    this.queue = new Map();
    this.processing = false;
  }

  async enqueue(message) {
    const key = `${message.channelId}:${message.guildId}`;
    if (!this.queue.has(key)) {
      this.queue.set(key, []);
    }
    this.queue.get(key).push(message);
    
    if (!this.processing) {
      this.processing = true;
      await this.processQueue();
    }
  }

  async processQueue() {
    // Single-threaded drain: every channel's backlog shares one Node.js
    // event loop, so a hot guild can delay everyone else. deliver() is a
    // stand-in for the actual downstream send.
    for (const [key, messages] of this.queue) {
      while (messages.length > 0) {
        await this.deliver(messages.shift());
      }
      this.queue.delete(key);
    }
    this.processing = false;
  }
}

After: Rust-based Queue System

// New Rust implementation
use std::sync::Arc;

use dashmap::DashMap;
use tokio::sync::mpsc;

// Message, Config, and Error are application-level types defined elsewhere.
pub struct MessageQueue {
    queues: DashMap<String, mpsc::Sender<Message>>,
    config: Arc<Config>,
}

impl MessageQueue {
    pub async fn enqueue(&self, msg: Message) -> Result<(), Error> {
        let key = format!("{}:{}", msg.channel_id, msg.guild_id);
        
        // Clone the sender so the DashMap shard lock is released before awaiting.
        let sender = self.queues
            .entry(key)
            .or_insert_with(|| self.spawn_processor())
            .clone();

        sender.send(msg).await?;
        Ok(())
    }
    
    fn spawn_processor(&self) -> mpsc::Sender<Message> {
        let (tx, mut rx) = mpsc::channel(1024);
        
        tokio::spawn(async move {
            // Parallel processing on the Tokio runtime: each channel:guild key
            // gets its own task, so queues drain independently. Zero-copy
            // message passing and a custom memory allocator round out the
            // production build.
            while let Some(msg) = rx.recv().await {
                // Downstream delivery elided.
                let _ = msg;
            }
        });
        
        tx
    }
}
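
For context, here is roughly how a gateway handler might drive the queue above. MessageQueue::new(), Config::default(), and the Message fields are illustrative stand-ins for types the snippet leaves abstract, not Discord's actual API.

// Hypothetical wiring built on the snippet above.
#[tokio::main]
async fn main() {
    let queue = Arc::new(MessageQueue::new(Config::default()));

    // Gateway handlers share one queue; the first message for a given
    // channel:guild key lazily spawns that key's processor task.
    let q = Arc::clone(&queue);
    let gateway = tokio::spawn(async move {
        let msg = Message {
            channel_id: 42,
            guild_id: 7,
            content: "launch day!".into(),
        };
        if q.enqueue(msg).await.is_err() {
            // A full implementation would surface backpressure to the caller here.
        }
    });

    gateway.await.expect("gateway task panicked");
}

The important property is that enqueue never blocks on another key's backlog: the DashMap lookup is sharded, and each per-key channel applies its own backpressure through its bounded buffer of 1,024 messages.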

The Redis Innovation

But the message queue was only part of the solution. Discord's team also had to rethink how they handled rate limiting across their microservices architecture. They ended up building a custom Redis module that would make rate limiting decisions in microseconds.

Custom Redis Rate Limiting Module

// Rate-limit check exposed as a custom Redis module command; Context,
// RedisString, RedisResult, and RedisValue come from the module SDK.
use std::time::{SystemTime, UNIX_EPOCH};

#[redis::command]
pub fn check_rate_limit(ctx: &Context, args: Vec<RedisString>) -> RedisResult {
    let key = args.get(0).ok_or("Missing key")?.to_string();
    let limit = args.get(1).ok_or("Missing limit")?.to_string().parse::<u64>()?;
    let window = args.get(2).ok_or("Missing window")?.to_string().parse::<u64>()?;

    let elapsed = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap();
    let now = elapsed.as_secs();

    // Sliding-window log backed by a sorted set: evict entries that fell out
    // of the window, then count what remains.
    let cutoff = now.saturating_sub(window).to_string();
    ctx.call("ZREMRANGEBYSCORE", &[key.as_str(), "0", cutoff.as_str()])?;

    let current = match ctx.call("ZCARD", &[key.as_str()])? {
        RedisValue::Integer(n) => n as u64,
        _ => 0,
    };

    if current >= limit {
        return Ok(RedisValue::Integer(0)); // over the limit: deny
    }

    // Record this request. The member is a nanosecond timestamp so multiple
    // requests landing in the same second are counted separately.
    let score = now.to_string();
    let member = elapsed.as_nanos().to_string();
    ctx.call("ZADD", &[key.as_str(), score.as_str(), member.as_str()])?;
    ctx.call("EXPIRE", &[key.as_str(), window.to_string().as_str()])?;

    Ok(RedisValue::Integer(1)) // under the limit: allow
}
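
Once the module is loaded, any service can make the rate-limit decision in a single round trip to Redis. Below is a minimal, hypothetical caller using the Rust redis crate; the command name RATELIMIT.CHECK and the key scheme are assumptions, since the article does not spell them out.

use redis::aio::MultiplexedConnection;

/// Returns true if the caller is still under `limit` requests per `window_secs`.
/// "RATELIMIT.CHECK" is an assumed registration name for the module command above.
pub async fn allow_request(
    conn: &mut MultiplexedConnection,
    route_key: &str, // e.g. a per-route or per-guild key (illustrative)
    limit: u64,
    window_secs: u64,
) -> redis::RedisResult<bool> {
    let allowed: i64 = redis::cmd("RATELIMIT.CHECK")
        .arg(route_key)
        .arg(limit)
        .arg(window_secs)
        .query_async(conn)
        .await?;
    Ok(allowed == 1)
}

Because eviction, counting, and insertion all happen inside the module, the check is one atomic command instead of four separate round trips, which is what keeps the decision in the microsecond range described above.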

The Results

The results speak for themselves: 12 million concurrent users, 78% lower latency, and zero major outages during one of gaming's biggest launches. But perhaps more importantly, Discord's team showed that sometimes the best way to handle scale isn't to optimize your existing system — it's to completely rethink your assumptions about how that system should work.

Key Resources

  • Discord Rate Limiter: github.com/discord/rate-limiter-rs
  • Redis Module Docs: redis.io/topics/modules-api-ref
  • Performance Metrics: discord.com/developers/docs/topics/gateway
  • Engineering Blog: discord.com/blog/engineering