Distributed systems, message brokering, and horizontal scaling

Distributed Task Queue

A high-throughput distributed task queue built for processing millions of jobs daily with guaranteed delivery and fault tolerance.

Django · Redis · PostgreSQL · gRPC · Docker

Problem

Existing job processing systems could not handle the required throughput of 2M+ jobs per day while maintaining ordering guarantees and exactly-once semantics.

Solution

Designed a distributed task queue using Redis Streams for message brokering with PostgreSQL as a durable backing store. Implemented consumer groups with automatic rebalancing and dead-letter handling.
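A minimal sketch of the consumer-group flow with dead-letter handling described above. A plain in-memory queue stands in for a Redis Stream (in production the analogous calls are XADD, XREADGROUP, XACK, and the per-message delivery count tracked by XPENDING); all names and the retry limit are illustrative.

```python
from collections import deque

MAX_DELIVERIES = 3  # illustrative retry limit before dead-lettering


class InMemoryStream:
    """Stand-in for a Redis Stream consumed via a consumer group."""

    def __init__(self):
        self.pending = deque()  # (message_id, payload, delivery_count)
        self.dead_letter = []   # messages that exhausted their retries
        self.acked = []         # successfully processed message ids
        self._next_id = 0

    def add(self, payload):     # analogous to XADD
        self._next_id += 1
        self.pending.append((self._next_id, payload, 0))
        return self._next_id

    def read(self):             # analogous to XREADGROUP
        if not self.pending:
            return None
        msg_id, payload, deliveries = self.pending.popleft()
        return msg_id, payload, deliveries + 1

    def ack(self, msg_id):      # analogous to XACK
        self.acked.append(msg_id)

    def retry_or_dead_letter(self, msg_id, payload, deliveries):
        if deliveries >= MAX_DELIVERIES:
            self.dead_letter.append((msg_id, payload))   # hand off to DLQ
        else:
            self.pending.append((msg_id, payload, deliveries))  # re-queue


def run_consumer(stream, handler):
    """Drain the stream, acking successes and re-queueing failures."""
    while True:
        msg = stream.read()
        if msg is None:
            break
        msg_id, payload, deliveries = msg
        try:
            handler(payload)
        except Exception:
            stream.retry_or_dead_letter(msg_id, payload, deliveries)
        else:
            stream.ack(msg_id)
```

A message that keeps failing is re-delivered up to the retry limit and then parked in the dead-letter list for the reconciliation service to inspect, rather than blocking the partition.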

Architecture

The system follows a producer-consumer pattern with a coordinator service managing partition assignment. Each worker node processes messages from assigned partitions, with checkpointing to PostgreSQL for recovery. A separate reconciliation service handles failed deliveries.
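The coordinator's two core duties can be sketched as follows; this is a simplified illustration, with round-robin standing in for the real assignment strategy and a dict standing in for the PostgreSQL checkpoint table. All names are hypothetical.

```python
def assign_partitions(partitions, workers):
    """Round-robin partition assignment, as a coordinator might compute it.

    Returns {worker: [partition, ...]}; re-running this with a changed
    worker list is the rebalance step.
    """
    assignment = {w: [] for w in workers}
    for i, p in enumerate(sorted(partitions)):
        assignment[workers[i % len(workers)]].append(p)
    return assignment


class CheckpointStore:
    """Stand-in for the durable checkpoint table (partition -> last offset)."""

    def __init__(self):
        self._offsets = {}

    def save(self, partition, offset):
        # In production this is a transactional write to PostgreSQL.
        self._offsets[partition] = offset

    def resume_from(self, partition):
        # Recovery: a restarted worker resumes one past the last
        # committed offset, never from the beginning of the stream.
        return self._offsets.get(partition, -1) + 1
```

When a worker dies, the coordinator recomputes the assignment over the surviving workers, and whichever worker inherits a partition resumes from its last checkpoint.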

Key Decisions

  • Chose Redis Streams over Kafka for lower operational complexity at the required scale
  • Implemented idempotency keys at the consumer level so that at-least-once delivery yields exactly-once processing
  • Used gRPC for inter-service communication to minimize serialization overhead
  • Designed a custom backpressure mechanism to prevent worker overload
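The idempotency-key decision above can be sketched like this; a Python set stands in for an atomic Redis `SET key 1 NX EX ttl`, and the key derivation and class names are illustrative, not the project's actual code.

```python
import hashlib


class IdempotentConsumer:
    """Skip a job whose idempotency key has already been processed.

    A set stands in for Redis SET-NX with a TTL; in production the key
    store must outlive the broker's redelivery window.
    """

    def __init__(self, handler):
        self._seen = set()
        self._handler = handler

    @staticmethod
    def key(job):
        # Derive a stable key from the job's identifying fields.
        raw = f"{job['queue']}:{job['id']}".encode()
        return hashlib.sha256(raw).hexdigest()

    def process(self, job):
        k = self.key(job)
        if k in self._seen:       # duplicate delivery: drop silently
            return False
        self._handler(job)
        self._seen.add(k)         # mark done only after the handler succeeds
        return True
```

Because the key is recorded only after the handler succeeds, a crash mid-job leads to a retry rather than a lost message; a duplicate delivery after success is a no-op.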